11. A new paradigm?
The approach of regarding emotions as a characteristic of the architecture of an intelligent system has led to increased interest in this topic in recent years. While in 1997 only two papers dealing with the topic "emotion" were presented at the leading congresses on agents, 1998 already saw the first congress dedicated exclusively to "emotional agents".
A number of researchers have started to develop emotional autonomous agents based on the principles of Simon, Toda, and Sloman. Many of these approaches are still at the stage of theoretical exploration, but some have already been implemented in rudimentary form.
A fundamental assumption shared by all agent-centered approaches is the view of emotions as control signals in the architecture of a system that must move independently in an uncertain environment. The function of emotions is to direct the attention of the system toward an external or internal aspect which is significant for the system's substantial goals or concerns, and to secure processing priority for it.
Velásquez (1997) developed a model based on the "Society of Mind" theory of Minsky (1985). He calls it Cathexis, a term which he defines as "concentration of emotional energy on an object or idea" (Velásquez, 1997, p. 10).
In his model, emotions consist of a variety of subsystems, the proto-specialists:
Each of these proto-specialists has four kinds of sensors responsible for measuring internal and external states: neural sensors, sensorimotor sensors, motivational sensors, and cognitive sensors. In addition, each proto-specialist is characterized by two threshold values which Velásquez calls Alpha and Omega: Alpha is the threshold above which the respective proto-specialist becomes active; Omega is its saturation limit. Finally, each proto-specialist has a decay function which determines how long its activation persists.
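The threshold and decay mechanics just described can be sketched in a few lines of Python. This is a minimal illustration only; the class name, parameter values, and the linear decay rule are assumptions, not Velásquez's actual implementation:

```python
class ProtoSpecialist:
    """Minimal sketch of a Cathexis-style proto-specialist.

    alpha: activation threshold, omega: saturation limit,
    decay_rate: how fast activation fades per update cycle.
    (All parameter values and the linear decay are illustrative.)
    """

    def __init__(self, name, alpha=0.3, omega=1.0, decay_rate=0.1):
        self.name = name
        self.alpha = alpha
        self.omega = omega
        self.decay_rate = decay_rate
        self.level = 0.0  # current activation

    def update(self, sensor_input):
        # Accumulate sensor contributions, then apply decay.
        self.level = max(0.0, self.level + sensor_input - self.decay_rate)
        # Omega caps the activation (saturation).
        self.level = min(self.level, self.omega)

    def is_active(self):
        # Alpha is the threshold above which the emotion becomes active.
        return self.level > self.alpha


fear = ProtoSpecialist("fear")
fear.update(0.5)          # strong stimulus: level rises to 0.4
print(fear.is_active())   # True: 0.4 > alpha (0.3)
fear.update(0.0)          # no further input: decay takes over
fear.update(0.0)
print(fear.is_active())   # False: activation has decayed below alpha
```

In a full model, one such unit would exist per basic emotion, with the sensors feeding `update` each cycle.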
Velásquez differentiates in his model between basic emotions and emotion blends/mixed emotions. For the definition of basic emotions he builds on Ekman and Izard and defines them as follows:
The basic emotions in Cathexis are Anger, Fear, Distress/Sadness, Enjoyment/Happiness, Disgust and Surprise.
Emotion blends, or mixed emotions, are emotional states which arise when several different proto-specialists representing the basic emotions are active without any one of them dominating the others.
Finally, his model contains moods, which differ from emotions only in their level of excitation.
Emotions in Cathexis are caused by cognitive and non-cognitive elicitors which originate from the same categories as the sensors of the system. The cognitive elicitors for the basic emotions are based on a modified version of Roseman's emotion model.
The intensity of an emotion in Velásquez's model is affected by several factors:
The behaviour repertoire of the system comprises three substantial elements: an expressive component (face, body, and voice) with which it communicates its current emotional state; an experiential component which learns from experience and affects the system's motivations and action tendencies; and an action selection mechanism which selects, from the computed behaviour values of the different action alternatives, the one with the highest value.
The system regularly runs through so-called update cycles, each of which comprises the following steps:
Velásquez has implemented Cathexis in a computer model which he calls "Simón the Toddler". The screen shows the face of a baby which is capable of different emotional expressions and rudimentary verbalizations. The user interacts with the system by, for example, changing the parameters of Simón's proto-specialists, varying the level of neurotransmitters, or interacting with it directly by feeding it, stroking it, etc.
At present, the model consists of 5 drive-proto-specialists (hunger, thirst, temperature regulation, fatigue, interest) and a repertoire of 15 behaviour alternatives, among them sleeping, eating, drinking, laughter, crying, kissing, and playing with toys. These are to be extended step by step in the course of the further development of the model.
Yuppy is a robot which represents an emotional pet. It is a further development of the model Simón the Toddler. Yuppy was first developed as a virtual simulation before it received a body. Velásquez calls it an example of a system with emotion-based control.
The model is constructed from a number of computational units which consist of three main components: an input, an assessment mechanism, and outputs. A substantial part of the assessment mechanism are the Releasers. They filter sense data and identify special conditions, and accordingly send excitatory or inhibitory signals to the subsystems connected with them.
Velásquez follows Damasio and LeDoux in differentiating between Natural and Learned Releasers. Natural Releasers are firmly built into the system (hard-wired); Learned Releasers are acquired and represent stimuli which are associated with the occurrence of Natural Releasers or can predict their occurrence. In the language of other models, Natural Releasers correspond to the primary emotions, while Learned Releasers correspond to secondary emotions. The latter require more processing capacity and are more complex, since they are based, among other things, on personal emotional memories which must be activated.
Drives in Yuppy are motivational systems which propel the agent into action. Drive systems are clearly distinct from emotion systems.
Yuppy's emotion systems represent six groups of basic affective reactions: Anger, Fear, Distress/Sadness, Enjoyment/Happiness, Disgust, and Surprise. Velásquez differentiates between cognitive and non-cognitive Releasers of emotions, which fall into four groups:
Yuppy's perception system consists of two color CCD cameras as eyes; a stereo audio system with two microphones as ears; infrared sensors for detecting obstacles; an air pressure sensor to simulate touch; a pyrosensor which notices changes in ambient temperature when humans enter the area; as well as a simple proprioceptive system.
Yuppy's drive system contains four drives: charge regulation, temperature regulation, fatigue, and curiosity. Each of these drives controls an internal variable assigned to it, representing the charge of the battery, the temperature, the quantity of energy, and the degree of the agent's interest, respectively.
Yuppy's emotion production system consists of emotional systems with Natural Releasers for the basic emotions. Velásquez divides the emotional systems into three groups:
Yuppy's behaviour system consists of a distributed net of approximately 19 different kinds of behaviour which predominantly cover the satisfaction of its needs and the interaction with humans. Examples of such behaviour are "search for bone", "approach bone", "recharge battery" or "approach human".
Like the drive systems and the emotional systems, Yuppy's behaviour systems also have their own Releasers.
The user can control Yuppy's affective style by manipulating parameters such as threshold values, inhibitory or excitatory connections, etc. In addition, he can present internal and external stimuli to the robot. Velásquez describes the result as follows:
Furthermore, Yuppy is able to learn secondary emotions, which are stored as new or modified cognitive Releasers. If, for example, a human holds a bone in his hand and makes Yuppy come and get it, he can stroke or discipline it afterwards. Depending on this experience, Yuppy forms a positive or negative emotional memory regarding humans which then affects its subsequent behaviour.
Foliot and Michel (1998) define emotions as an "evaluation system operating automatically either at the perceptual level or at the cognition level, by measuring efficiency and significance" (Foliot and Michel, 1998, p. 5). For them, emotions are the basis of every cognition. With their model they aim to show "how emotion based structures could contribute to the emergence of cognition by creating suitable learning conditions" (Foliot and Michel, 1998, p. 1).
The model was implemented in a virtual Khepera robot. Khepera is a miniature robot which contains a number of sensors and can be extended with further components as required. The Webots simulator does not only simulate a Khepera; programs developed with Webots can also be transferred directly to a Khepera.
The environment of the virtual Khepera consists of a city with buildings, a river and green areas. Each of these elements possesses a specific colour. The robot must move through the city and learn to evade different kinds of obstacles.
Foliot and Michel represent emotions on two levels: the process level can evaluate stimuli and elicit different emotions; the state level can supply information about the system.
For Foliot and Michel, the basis of their first experiment was the assumption that an emotion is characterized by a reaction to a positive or negative signal. The model consists of four components:
The experiment showed that the robot gradually collided with obstacles less and less, but it never learned to move completely error-free. In order to examine whether improving the training system with affective signals would yield better results, the authors performed a second experiment.
The second experiment was based on the emotion theory of Scherer. It consists of five components:
Fig. 17: Controller model by Foliot and Michel (after Foliot and Michel, 1998, p. 4)
The model differentiates between cognitive and emotional processes. Each emotional process is defined by an assessment sequence which classifies stimuli according to the criteria novelty, pleasantness, goal significance, and coping. Each stage of this process uses the results of the preceding stages as input. Coping has two possible outcomes: "reaction possible" and "no reaction possible".
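Such a sequential assessment can be sketched as a pipeline in which each check receives the results of all preceding checks. The check functions, feature names, and return values below are hypothetical illustrations, not Foliot and Michel's code:

```python
def appraise(stimulus, checks):
    """Run appraisal checks in order; each check sees all earlier results.

    `stimulus` is a dict of stimulus features; `checks` is an ordered
    list of (name, function) pairs. All names here are illustrative.
    """
    results = {}
    for name, check in checks:
        results[name] = check(stimulus, results)
    return results


# Hypothetical checks in Scherer's order: novelty, pleasantness,
# goal significance, coping.
checks = [
    ("novelty",      lambda s, r: s.get("change", 0.0) > 0.5),
    ("pleasantness", lambda s, r: s.get("valence", 0.0) > 0.0),
    # Goal significance builds on the novelty result:
    ("goal_significance",
     lambda s, r: r["novelty"] and s.get("blocks_goal", False)),
    # Coping decides between "reaction possible" and "no reaction possible":
    ("coping",
     lambda s, r: "reaction possible"
     if not r["goal_significance"] or s.get("can_react", True)
     else "no reaction possible"),
]

obstacle = {"change": 0.9, "valence": -0.6, "blocks_goal": True, "can_react": True}
result = appraise(obstacle, checks)
print(result["coping"])  # "reaction possible"
```

The ordering of the list enforces the stage-by-stage dependency the authors describe: a later check can read, but not alter, earlier results.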
The cognitive processes have a primary goal (forward movement) and four secondary goals (anti-clockwise rotation, clockwise rotation, left wall follow, right wall follow). Each goal is defined by a value in the body representation.
Learning happens in this model whenever the average state of the system contains a strong displeasure value:
The central component of the model is the mechanism which produces schemata. The experiment showed that during obstacle avoidance this took place either on the sensorimotor level, if an obstacle was detected by the infrared sensors, or on the schematic level, if it was not. The schematic level corresponds to a temporary goal change which the authors interpret as the consequence of a danger signal or of an internal assessment process.
Concerning the learning process, the system exhibited two fundamental instabilities in its behavior: either the robot persisted in a goal once selected, or it changed its goals incessantly. Foliot and Michel conclude, nevertheless, that their approach is correct in principle but requires a more detailed definition of the individual components.
Gadanho and Hallam examined which role emotions play in an autonomous robot which adapts to its environment by reinforcement learning (Gadanho and Hallam, 1998). For this purpose they worked with a simulated Khepera robot.
They modelled their emotion system on the somatic marker hypothesis suggested by Damasio (1994). Damasio assumes that emotions cause special body feelings. These body feelings are the result of experiences with internal preference systems and external events and help to predict the outcomes of certain scenarios. Somatic markers help humans to make fast decisions without requiring much processing capacity or time.
The model developed on this basis by Gadanho and Hallam has four fundamental emotions: Happiness, Sadness, Fear, and Anger. The intensity of each emotion is determined by the internal feelings of the robot. These feelings are: Hunger, Pain, Restlessness, Temperature, Eating, Smell, Warmth, and Proximity. Each emotion is defined by a set of constant feeling dependencies and a bias value. For example, the intensity of Sadness is high if Hunger and Restlessness are high and the robot does not eat.
In the model of Gadanho and Hallam, each emotion tries to influence the body state in such a way that the resulting body state resembles the one which causes that specific emotion. To achieve this, the emotion uses a simple hormonal system. A hormone is associated with each feeling. The intensity of a feeling is derived not directly from the value of the body perception which causes the feeling, but from the sum of the perception and the hormone value:
The hormone values can rise quickly; however, they fade away slowly, so that the emotional state persists for some time even when the emotion-releasing situation is long past.
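A minimal sketch of this hormonal mechanism follows. The rise and decay rates, the cap at 1.0, and the update rule are illustrative assumptions; Gadanho and Hallam's actual equations may differ:

```python
class Feeling:
    """Feeling whose intensity = body perception + associated hormone.

    The hormone rises quickly under emotional influence and decays
    slowly, so the emotional state outlasts the eliciting situation.
    (The rates below are illustrative, not the authors' values.)
    """

    def __init__(self, name, rise=0.8, decay=0.05):
        self.name = name
        self.hormone = 0.0
        self.rise = rise      # fast increase
        self.decay = decay    # slow fade

    def step(self, perception, emotion_influence=0.0):
        # Emotions inject hormone quickly; it then fades slowly.
        self.hormone += self.rise * emotion_influence
        self.hormone = max(0.0, self.hormone - self.decay)
        # Feeling intensity is perception plus hormone, capped at 1.0.
        return min(1.0, perception + self.hormone)


pain = Feeling("pain")
print(pain.step(perception=0.9, emotion_influence=1.0))  # 1.0 (capped)
# Stimulus gone, but the hormone keeps the feeling elevated for a while:
print(pain.step(perception=0.0))  # ≈ 0.7
```

Because `decay` is much smaller than `rise`, repeated steps with zero perception only gradually bring the feeling back to baseline, reproducing the slow fade-out described above.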
The robot equipped with this emotion system has the task of visiting food sources scattered in its environment and taking up energy. The faster it moves, the more energy it uses. The food sources consist of lights which the robot can detect. In order to draw energy from a food source, the robot must push it. This releases energy for a short time, along with a smell which the robot can detect. In order to take up the energy, the robot must turn around and turn its back to the food source. After a short time the food source is empty and needs a certain period of time to regenerate. The robot must thus visit other food sources. If a food source has no energy, its light goes out.
In the context of this task, the emotional dependencies of feelings look as follows:
The system learns by Reinforcement Learning. In order to shorten the learning process, the fundamental behaviours of the robot were programmed from the start, so that the system could concentrate on the learning of behaviour co-ordination. The three fundamental behaviours of the robot are the avoidance of obstacles, approaching sources of light as well as driving along a wall.
The system has a controller with two separate modules. The Associative Memory Module is a neural net which associates the feelings of the robot with the values it expects from each of its three behaviours; the algorithm used here is Q-Learning. The Behaviour Selection Module makes a stochastic selection, based on the information of the other module, of which behaviour is to be executed next.
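The two modules can be sketched as follows. This is a toy illustration: a Q-table stands in for the neural associative memory, the state encoding is made up, and all parameters are assumptions:

```python
import math
import random
from collections import defaultdict

BEHAVIOURS = ["avoid_obstacles", "approach_light", "follow_wall"]


class AdaptiveController:
    """Sketch of the two-module controller: a Q-table plays the
    Associative Memory Module, a softmax over expected values plays
    the Behaviour Selection Module. Parameters are illustrative."""

    def __init__(self, alpha=0.1, gamma=0.9, temperature=0.5):
        self.q = defaultdict(float)  # (feelings, behaviour) -> expected value
        self.alpha = alpha
        self.gamma = gamma
        self.temperature = temperature

    def select(self, feelings):
        # Stochastic selection: higher expected value -> higher probability.
        weights = [math.exp(self.q[(feelings, b)] / self.temperature)
                   for b in BEHAVIOURS]
        return random.choices(BEHAVIOURS, weights=weights)[0]

    def learn(self, feelings, behaviour, reward, next_feelings):
        # One-step Q-learning update of the associative memory.
        best_next = max(self.q[(next_feelings, b)] for b in BEHAVIOURS)
        key = (feelings, behaviour)
        self.q[key] += self.alpha * (reward + self.gamma * best_next - self.q[key])


ctrl = AdaptiveController()
# Feelings are encoded here as a hashable tuple of discretized values:
state = ("hungry", "no_pain")
for _ in range(100):
    ctrl.learn(state, "approach_light", reward=1.0, next_feelings=state)
print(ctrl.select(state))  # almost always "approach_light" after training
```

The softmax keeps selection stochastic, as in the original: even a well-rated behaviour is only chosen with high probability, never deterministically.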
Fig. 18: Adaptive controller (after Gadanho and Hallam, 1998, p. 3)
According to the authors, reward and punishment pose a special problem for an autonomous robot. From moment to moment, the environment or the internal state of the robot changes. Analyzing all information at every transition and adjusting the behaviours accordingly would not only cost enormous processing capacity; it would also give the robot no feedback in cases where a selected behaviour leads to success only after a series of transitions. On the other hand, the robot must be able to change a dysfunctional behaviour quickly. This is where the emotions come into play: their task is to determine these state transitions.
In order to test this hypothesis, the authors developed a controller with emotion-dependent event detection. An event is detected if one of three conditions occurs:
Fig. 19: Emotions and control (after Gadanho and Hallam, 1998, p. 4)
To test the effectiveness of this event-directed controller, the authors developed three further controllers:
Each of these four controllers went through an identical experimental setup. It consisted of thirty different runs with three million learning steps. In each run, a fully charged robot was placed at a randomly selected initial position. For evaluation purposes, units of 50,000 steps each were evaluated and data were collected on the following variables:
The results are as follows:
Tab. 9: Results of the experiments of Gadanho and Hallam (after Gadanho and Hallam, 1998, p. 5)
The results show, according to the authors, that the learning controllers fulfilled their task. Their energy level is, on average, significantly lower, but it does not reach a critical value. The main difference between the two learning controllers lies in the number of collisions; here the event-directed controller is better.
Overall, the event-directed controller is not significantly better than its competitor, but it achieves its learning success with a significantly lower number of events and thereby saves substantial time.
The authors come to the conclusion that the experiments have confirmed their hypothesis about the role of emotions in reinforcement learning.
Staller and Petta developed the TABASCO architecture, an acronym for "Tractable Appraisal-Based Architecture for Situated Cognizers" (Staller and Petta, 1998). TABASCO is based to a large extent on the emotion theory of Scherer and has so far not been implemented in a simulation.
Staller and Petta understand emotions as processes which are related to the interaction of an agent with its environment. "In particular, TABASCO models the appraisal process, the generation of action tendencies, and coping." (Staller and Petta, 1998, p. 3)
The fundamental idea of TABASCO is that the levels of the emotion system postulated by Scherer (sensorimotor, schematic, and conceptual) apply not only to appraisal but also to action generation. The two main components of the architecture, Perception and Appraisal and Action, are therefore constructed as hierarchies with three levels.
Fig. 20: TABASCO architecture (after Staller and Petta, 1998, p. 4)
The Perception and Appraisal component: the sensory layer consists of feature detectors for detecting, for example, sudden, intense stimuli or the quality of a stimulus (e.g. pleasantness). The schematic layer compares the input with patterns, particularly social and self patterns. The conceptual layer can reason abstractly and draw inferences based on propositional knowledge and beliefs.
The Action component: the motor layer contains motor commands. The schematic layer contains action tendencies and what Frijda calls "flexible programs" (Frijda, 1986, p. 83). The conceptual layer is responsible for coping.
Between these two components mediates the Appraisal Register, which goes back to a suggestion by Smith et al. (1996). It detects and combines the appraisal results of the three layers of the Perception and Appraisal component and, on the basis of the appraised state, influences the Action component.
The Action Monitoring component finally observes the planning and execution processes of the Action component and conveys the results to the Perception and Appraisal component, where they are integrated into the appraisal process.
Staller and Petta call their system a situated cognizer. With this term they want to underline the importance of both components for an autonomous system. They define cognizing (a term suggested first by Chomsky) as "having access to knowledge that is not necessarily accessible to consciousness" (Staller and Petta, 1998, p. 5).
Botelho and Coelho define emotion in the context of their Salt & Pepper project as "a process that involves appraisal stages, generation of signals used to regulate the agent's behavior, and emotional responses" (Botelho and Coelho, 1997, p.4). With Salt & Pepper they want to define an architecture containing mechanisms which play the same role for autonomous agents as the mechanisms that make humans so successful.
The starting point of their considerations is the classification of emotions in a multidimensional matrix "that may be used with any set of emotion classification dimensions" (Botelho and Coelho, 1997, p. 4).
Table 10: Dimensions of emotion classification (after Botelho and Coelho, 1997, p. 5)
The authors differentiate between affective and cognitive appraisal and postulate that it is, in principle, possible to differentiate clearly between these two components in any given architecture. They call the respective modules Affective Engine and Cognitive Engine.
The Affective Engine and the Cognitive Engine differ in three respects:
The authors suggest a mechanism which makes it possible for the Affective Engine to react quickly: the reduction of explicit and long comparison chains to short, specific rules. They describe an example of such a process:
if someone risks dying, he or she will feel a lot of fear;
risks_dying(A) -> activate(fear, negative, 15)
if someone risks running out of food, he or she risks dying;
risks_running_out_of_food(A) -> risks_dying(A)
if someone risks running out of money, he or she risks running out of food;
risks_running_out_of_money(A) -> risks_running_out_of_food(A)
if someone loses some amount of money, he or she risks running out of money;
loses_money(A) -> risks_running_out_of_money(A)
loses_money(A) -> activate(fear, negative,15)
(Botelho and Coelho, 1997, p. 11)
These explicit and implicit rules should be organized in a hierarchy in which the longer rules are used only if no suitable short rule is found.
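The collapse of such an inference chain into a single short rule can be sketched as follows. The rule representation (dictionaries mapping conditions to consequences) and the compression routine are illustrative assumptions, not Botelho and Coelho's implementation:

```python
def compress_chain(implications, start, emotion_rules):
    """Follow a chain of implications from `start`; if it reaches a
    condition with an attached emotion rule, return a new short rule
    mapping `start` directly to that emotional response.

    `implications` maps each condition to the condition it entails;
    `emotion_rules` maps conditions to emotion activations.
    """
    condition = start
    seen = {condition}
    while condition in implications:
        condition = implications[condition]
        if condition in seen:  # guard against cyclic rule sets
            return None
        seen.add(condition)
        if condition in emotion_rules:
            return (start, emotion_rules[condition])
    return None


# The example chain from the text:
implications = {
    "loses_money": "risks_running_out_of_money",
    "risks_running_out_of_money": "risks_running_out_of_food",
    "risks_running_out_of_food": "risks_dying",
}
emotion_rules = {"risks_dying": ("fear", "negative", 15)}

# Derive the short rule loses_money -> activate(fear, negative, 15):
print(compress_chain(implications, "loses_money", emotion_rules))
```

Storing such derived short rules ahead of the long chains implements the hierarchy described above: the explicit chain is consulted only when no compressed rule matches.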
The Salt & Pepper architecture consists of three main components: the Affective Engine, the Cognitive Engine and an Interrupt Manager. The Affective Engine possesses Affective Sensors, an Affective Generator, and an Affective Monitor. The latter two initiate the process of emotion production together. All other modules of the system (except the Interrupt Manager) are assigned to the Cognitive Engine.
Fig. 21: Salt & Pepper architecture (after Botelho and Coelho, 1997, p. 12)
The long-term memory is an associative network. Each node of the network possesses an identification, an activation level, a set of associations with other nodes, and a number of symbolic structures which represent motives, plans, actions, and declarative knowledge. The more a node is activated, the more easily it is found by a search process (accessibility).
The Input Buffer and the Affective Generator activate nodes in long-term memory. The Cognitive Monitor and the Affective Monitor suggest certain nodes for the attention of the agent. When such a suggestion is made, the Interrupt Manager decides whether the current cognitive process should be interrupted and the content of the suggested node loaded into working memory for processing. When the contents of a node are processed in working memory, the node receives a certain level of activation and thus greater accessibility.
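A minimal sketch of such an associative memory with activation-dependent accessibility follows. The node structure, the spreading rule, and all names are illustrative assumptions:

```python
class Node:
    """Long-term memory node: identification, activation level,
    associations, and symbolic content (motives, plans, knowledge)."""

    def __init__(self, node_id, content):
        self.node_id = node_id
        self.content = content
        self.activation = 0.0
        self.associations = []  # list of (Node, weight) pairs


class LongTermMemory:
    def __init__(self):
        self.nodes = {}

    def add(self, node_id, content):
        self.nodes[node_id] = Node(node_id, content)
        return self.nodes[node_id]

    def associate(self, a, b, weight=0.5):
        self.nodes[a].associations.append((self.nodes[b], weight))

    def activate(self, node_id, amount):
        # Activate a node and spread part of it to its associates.
        node = self.nodes[node_id]
        node.activation += amount
        for neighbour, weight in node.associations:
            neighbour.activation += amount * weight

    def most_accessible(self):
        # The more activated a node, the sooner a search finds it.
        return max(self.nodes.values(), key=lambda n: n.activation)


ltm = LongTermMemory()
ltm.add("dog_bite", "episodic: was bitten by a dog")
ltm.add("dog", "declarative: dogs")
ltm.associate("dog", "dog_bite", weight=0.8)
ltm.activate("dog", 1.0)  # perceiving a dog spreads activation
print(ltm.nodes["dog_bite"].activation)  # 0.8, raised via the association
```

The episodic node becomes more accessible purely through its association, which is the mechanism by which the monitors can later suggest it for the agent's attention.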
Nodes which are based on certain experiences of the agent are called episodic nodes and form the episodic memory.
Emotions are described in this system by a set of parameters:
The emotion program differs from the emotional reaction in that it is executed by the Affective Generator without interrupting the agent's current cognitive processing.
The Affective Generator undertakes a partial evaluation of the external and internal state of the agent, the so-called affective estimate. If the eliciting conditions of a certain emotion are fulfilled, the Affective Generator produces the label, intensity, and valence of the emotion and executes the emotion program. The Affective Monitor then scans the long-term memory until it finds a node which corresponds to the label of the emotion and possesses the same valence. It activates this node with an activation level which is a function of the generated emotion intensity.
The system contains mechanisms which make emotional learning possible. The authors differentiate between three classes of emotional learning:
The authors specify the conditions under which a system is able to accomplish these learning procedures (Botelho and Coelho, 1998).
Some elements of Salt & Pepper have been implemented so far and, according to the authors, have confirmed the theoretical assumptions (Botelho and Coelho, 1997).
Canamero (1997) also pursues an approach based on Minsky's "Society of Mind" (1985). A two-dimensional world called Gridland is inhabited by the Abbotts, artificial organisms which have a motivational and an emotional system.
An Abbott consists of a number of agents which, viewed individually, are quite "simple" but attain a new quality when they interact with one another. An Abbott possesses three kinds of sensors (somatic, tactile, visual); two kinds of recognizers which react to complex stimuli and can both learn and forget; eight so-called direction nemes which supply information about the Abbott's spatial environment; two categories of maps (tactile and visual) which receive their information from the recognizers and direction nemes and represent it internally; three effectors (hand, foot, mouth); a behaviour repertoire (Attack, Drink, Eat, Play, Rest, Withdraw, etc.) as well as a set of managers (e.g. finder, look-for, go-toward) which correspond to appetitive behaviour. Furthermore, the Abbotts possess a set of physiological variables, e.g. adrenalin, blood sugar, endorphins, body temperature, etc.
The Abbotts move in a world which contains sources of food, obstacles and enemies. They come into this world as "newborns", equipped with a basic set of characteristics, and must then develop in their environment.
What is interesting in Canamero's model is that her creatures are equipped from the outset with motivations and emotions. Following Minsky, these are called proto-specialists, because they are primitive mechanisms responsible for action selection and control functions.
Theoretical basis for the motivations is a homoeostatic approach:
The motivational agents of the Abbotts consist of
Thus the error message "blood sugar too low", for example, calls the motivation hunger, whose goal is to raise the blood sugar level. The activation of a motivation is proportional to the size of the error message (the deviation of a physiological value from the homoeostatic state); from the activation level, the intensity of the motivation is computed. The motivation with the highest activation level tries to organize the behaviour of the Abbott in such a way that the associated drive is satisfied. If the motivation cannot find and call an appropriate behaviour, it activates the finder agent and hands it the intensity value, so that it can be passed on to the other agents it activates. The intensity substantially affects a behaviour: in the case of escape behaviour, for example, the strength of the motor activity; in other behaviours, for example, their duration.
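The homoeostatic core of this mechanism can be sketched as follows. The setpoints, the gain, the linear error-to-activation mapping, and the variable names are illustrative assumptions, not Canamero's actual values:

```python
class Motivation:
    """Motivation driven by the error of a controlled physiological
    variable relative to its homoeostatic setpoint. Gains and
    setpoints below are illustrative assumptions."""

    def __init__(self, name, setpoint, gain=1.0):
        self.name = name
        self.setpoint = setpoint
        self.gain = gain

    def activation(self, value):
        # Activation is proportional to the size of the error signal.
        return self.gain * abs(value - self.setpoint)


def select_motivation(motivations, body):
    """Pick the motivation with the highest activation level."""
    return max(motivations, key=lambda m: m.activation(body[m.name]))


motivations = [
    Motivation("blood_sugar", setpoint=0.7),
    Motivation("temperature", setpoint=0.5),
]
body = {"blood_sugar": 0.2, "temperature": 0.55}  # blood sugar far too low
winner = select_motivation(motivations, body)
print(winner.name)  # "blood_sugar": largest deviation, so hunger wins
```

In the full model, the winner's activation would also yield the intensity value handed to the behaviour (or to the finder agent), modulating motor strength or duration as described above.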
Activation level and intensity of a motivation can now be modified by emotions. In Canamero's system emotions are composed of
Emotional states are activated and differentiated from each other by three kinds of elicitors:
Since Abbott is a primitive system, it is always in a single, unambiguous emotional state. The three elicitors are arranged hierarchically in the order mentioned above. The selected emotion affects the action selection mechanism in two ways: it can lower or raise the intensity of the current motivation and thus also the intensity of the selected behaviour; in addition, it modifies the readings of the sensors which measure the variables underlying the emotion and thereby changes the perceived physical state (happiness agent -> release of endorphin -> less pain perception).
The action selection of an Abbott thus takes place in four stages:
Canamero concedes that her Abbotts still operate on a very primitive level at present and need a number of additional agents in order to develop long-term learning and strategies. Emotions nevertheless play a substantial role in her model:
11.7. Summary and evaluation
As the preceding examples show, there exists a variety of approaches to modelling emotions in the field of autonomous agents. Their connections to psychological theory differ considerably.
It is noticeable that most authors treat their theories quite eclectically. They draw predominantly on theories which lend themselves to operationalization. Frequently, only certain elements are picked out and then extended with components of the authors' own, often without making this explicit.
In order to obtain fast results, only parts of the sketched models are implemented in actual simulations or robots. Pragmatic solutions are used which necessarily reduce complex processes to a few variables. Moreover, these variables are frequently defined arbitrarily in order to make the model realizable at all.
It is remarkable that the majority of the authors regard emotions as a substantial component of the control system of an agent and define them functionally in this regard. Emotions are no longer regarded as appendages of the cognitive system, but rather as an indispensable condition for the reliable functioning of cognition.