The robot control system follows the three-layer control architecture paradigm. Its lowest layer provides the necessary hardware abstraction and integrates low-level motion controllers, sensor systems and algorithms implemented as external software. The middle layer is responsible for the functions of the robot and the implementation of its competencies; it defines the set of tasks the robot is able to perform. The highest layer may incorporate a dedicated decision system, a finite-state machine or a comprehensive system simulating human mind functionalities.
Fig. 1 Three-layer control architecture
The Urbi development platform is used as the main tool for handling the various software modules. It integrates and provides communication between the two lowest levels of the architecture. This allows dynamic loading of modules called UObjects, which are used to bind hardware or software components: actuators and sensors on the one hand, and voice synthesis or face recognition algorithms on the other. Urbi also delivers urbiscript – a script programming language for use in robotics, oriented towards parallel and event-based programming.
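As a minimal sketch of this binding mechanism (the module and class names are hypothetical, chosen only for illustration), a plugin UObject can be loaded and used from urbiscript as follows:

```
// Hypothetical example: load a shared library exposing a Camera UObject.
loadModule("camera");

// Instantiate the bound component; its UVars and functions
// become accessible as ordinary urbiscript slots.
var Global.cam = Camera.new();
cam.image;   // read a sensor value exposed by the UObject
```

The same pattern applies whether the UObject wraps hardware (a sensor driver) or pure software (e.g. a face recognition algorithm).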
Due to the flexibility of the control architecture, modules can span more than one layer. This happens most often in the case of components that use external libraries that provide both low-level drivers and competencies that belong in the middle layer. For example, OpenNI software can be used to retrieve data from an RGB-D sensor (lowest layer) and provide data on silhouettes detected by the sensor (middle layer).
The highest layer of the control architecture hosts the robot decision system. During short-term HRI, the role of this system is often played by finite-state machines. They are sufficient for creating interesting interaction scenarios, but after a couple of minutes people notice that the robot's behavior is repetitive and that previous events do not affect it. In short-term studies, when the robot's behavior cannot be obtained in autonomous operation, the decision system can be assisted by a human operator; this approach is called the Wizard of Oz (WoZ). In order for FLASH to fulfill the requirements set for social robots, it should be equipped with some sort of affective mind. This mind should consist of a rational component, enabling the robot to plan its actions and achieve its goals, and an emotional component, which would simulate its emotions and produce reactive responses. The role of the emotional component in HRI is crucial: it influences the control system, changing perceptions and goals based on simulated emotions. Emotions also provide reliable yet non-repetitive reactions and increase the credibility of a social robot's behaviors.
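A reactive decision rule of the kind a finite-state machine would encode can be sketched in urbiscript using its event mechanism (the event name and the reaction below are hypothetical placeholders):

```
// Hypothetical sketch: an event-driven decision rule.
var Global.humanDetected = Event.new();

// React whenever the event fires; a real system would start
// a greeting behavior instead of printing a message.
at (humanDetected?)
  echo("greeting the detected human");

humanDetected!;   // emitting the event triggers the rule
```

In a full decision system, such rules would be driven by perception modules from the lower layers rather than by manual event emission.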
Two emotional systems - Wasabi by Becker Asano and a dynamic PAD-based model of emotion have been adapted to the control system. Both systems are rooted in dimensional theories of emotion.
The Urbi engine, operating in the main thread, runs the low-level modules synchronously, using the data and functions that they provide. Modules which consume a significant amount of CPU time can be run asynchronously in separate threads. The thread in which the Urbi engine runs is then not blocked, and the engine can efficiently perform other tasks in the background. Urbi is thread-safe: it provides synchronization and extensive access control for tasks run in separate threads. Access to resources can be controlled at the level of modules, of instances created within the system, or of single functions. Object detection in an image is a good example: Urbi can run the time-consuming image processing in a separate thread, leaving other operations (e.g. trajectory generation) unaffected, while components that consume the image-processing results simply wait until the module completes. This mechanism meets the criteria of a soft real-time system.
Fig. 3 Video workflow
Hardware robot drivers whose operation is time-critical (e.g. the balancing controller) are implemented on microprocessors running a lightweight hard real-time operating system. The competencies of the robot and its specific behaviors are programmed in urbiscript by loading instructions into the Urbi engine from a client application.
Urbiscript is a script programming language which serves as a tool for managing and synchronizing the various components of the control system. Its syntax is based on well-known programming languages, and urbiscript itself integrates with C++ and many other languages, such as Java, MATLAB or Python. Of particular interest is the orchestration mechanism built into Urbi, which handles, among other things, the scheduling and parallelization of tasks. Thanks to this feature, all the activities of the robot can be synchronized with each other, e.g. movements of joints during head and arm gesticulation, mouth movement with speech, tracking of objects detected in the camera image, etc. The programmer decides how the various tasks should be scheduled through the use of appropriate urbiscript instruction separators. Components exposing a UObject interface are directly accessible from urbiscript.
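The scheduling behavior is selected by the separator placed between instructions: `;` and `|` compose statements serially, while `,` and `&` start them in parallel. A small sketch using the standard `echo` and `sleep` primitives:

```
// Serial composition: the second statement waits for the first.
sleep(1s); echo("one second elapsed");

// Parallel composition: both statements start at the same time,
// so the message is printed immediately.
sleep(1s) & echo("printed without waiting");
```

This is how, for example, arm gestures and speech can be launched together with a single `&`, while consecutive phases of one gesture are chained with `;`.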
Urbiscript possesses an important feature that allows the programmer to assign tags to pieces of code. Tasks can thus be grouped and managed together, which in turn can be used to implement task prioritization. This mechanism helps to avoid conflicts that may occur during access to the physical components of the robot: the programmer can stop and resume tagged fragments of code at any time, and also implement resource preemption. The generation of facial expressions during speech can serve as an example. Generating a smile utilizes all the effectors installed in the robot's head, and the movement of each drive is tagged. Speech has a higher priority, so when the robot speaks, the tag encompassing jaw (or mouth) trajectory generation is suspended for the purpose of generating speech-related mouth movements. When the robot stops speaking, the operation of the trajectory generator resumes.
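The suspend-and-resume pattern described above can be sketched as follows (the tag name and the periodic action are hypothetical):

```
// Run a periodic action under a tag so it can be suspended later.
var Global.mouthTag = Tag.new();
mouthTag: every (1s) echo("mouth trajectory step"),

// Suspend the tagged code while speech takes over the mouth drives...
mouthTag.freeze();
// ...and resume it when the utterance ends.
mouthTag.unfreeze();

// mouthTag.stop();  // would terminate the tagged code entirely
```

The trailing comma after the tagged statement runs it in the background, so the subsequent freeze/unfreeze calls execute while the periodic action is active.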
The designed control system enables accessing the robot hardware and competencies in a unified manner - using a tree structure called robot (Fig. 4). This makes the API more convenient to use and helps to maintain the modularity of the software. Elements of the robot structure are grouped by role; example groups include audio, video, ml (machine learning), body (platform control), arm, hand, head, dialogue, network and appraisal. Thanks to this modularity, robot components can easily be interchanged, e.g. the EMYS head works just as well when mounted on a platform other than FLASH. The software also allows missing or faulty robot components to be disconnected quickly. Moreover, some functions, such as gesture generation, are parameterized (with respect to duration, intensity, mode, etc.) so that their final form can be adjusted to the current situation (e.g. depending on the emotional state of the robot).
Fig. 4 Robot API
Elements of the robot structure can be accessed by chaining the names of the appropriate branches, each preceded by a dot:
robot.identity.name;                              // get robot name
robot.body.arm.hand.Open(1s);                     // open hand for 1 sec.
robot.body.neck.head.Smile(3s);                   // smile for 3 sec.
robot.body.x.speed = 0.5;                         // set longitudinal platform speed to 0.5 m/s
robot.body.laser.getClosest(-20,20);              // get the angle to the nearest object
robot.audio.speech.recognition.result;            // get speech recognition result phrase
robot.video.camera.image;                         // get image from camera
robot.video.humanDetector.position;               // get human xyz position
robot.video.rightHandColorDetector.value;         // get color of the object held in the hand
robot.ml.colorLearning.LearnFromRightHand("red"); // teach the robot colors
robot.network.weather.condition.temperature;      // get the temperature from the weather forecast
robot.network.mail.Check();                       // check if there are new emails
robot.network.facebook.Post("me","Hello world!"); // post a message on Facebook
It is worth mentioning that the interface activates functions in a device-independent way. For example, the call

robot.body.neck.head.Smile(3s);

results in the generation of a smile for 3 s, regardless of the type of head used. Another example is the speech function.
In the case of a robot with lips, the utterance will be accompanied by a movement of the lips, actuated by servomotors, while for EMYS only a motion of the lower disc will be executed. Obviously, if the current robot configuration does not contain any head, the system will report that it is missing. Furthermore, the robot structure allows the use of so-called localizators: the same arm-raising instruction raises both arms within 5 s, while adding a localizator restricts the motion to, e.g., only the left arm. The competencies can also be accessed in a uniform way. For instance, calling the instruction
robot.body.x.speed = 0.5;
sets the linear velocity of the mobile platform to 0.5 m/s, regardless of whether it is the Pioneer platform or FLASH, and even of whether the platform uses software navigation modules based on Aria or Player.
The full robot tree structure can be found in the documentation section.