Abstract
The ongoing and increasingly important trend in robotics to conceive designs that decentralize control is paralleled by currently active research paradigms in the study of perception and action. James Gibson's ecological approach is one of these paradigms. Gibson's approach emerged in part as a reaction to representationalist and computationalist approaches, which devote the bulk of their resources to the study of internal processes. The ecological approach instead focuses on constraints and ambient energy patterns in the animal-environment coalition. The present article reviews how the emphasis on the environment by ecological psychologists has given rise to the concepts of direct perception, higher order information, active information pick up, information-based control laws, prospective control, and direct learning. Examples are included to illustrate these concepts and to show how they can be applied to the construction of robots. Action is described as emergent and self-organized. It is argued that knowledge about perception, action, and learning as it occurs in living organisms may facilitate the construction of robots, more obviously so if the aim is to construct (to some extent) biologically plausible robots.
1. Introduction
Since the late 1980s, research in robotics and artificial intelligence has witnessed the emergence of a paradigm, called behaviour-based robotics, that reacts against the traditional model of intelligence as deliberative reasoning [1,2]. Instead of focusing on how a computer program can beat a human chess master, or on how to control an industrial robotic arm to perform faster and more accurate movements, proponents of this paradigm argue that a more suitable model for intelligence research would be adaptive behaviour. Behaviour-based robotics emphasizes how agents are engaged in an ongoing interaction with their environment, and how they continuously adjust their behaviour, tuning internal and external processes to realize their goals [3].
As an example of the type of adaptive behaviour that we are referring to, consider the ability of insects to compensate for a lost leg (cf. [4]). An adaptive self-assembly of the system maintains the organism within the functional range to achieve a given goal, in this case, locomotion. Different neurons, muscles, or limbs may be recruited to achieve the final goal [5]. In the same vein, behaviour-based robotics attempts to build robots through networks of simple but functional behaviours, often mapping sensors to actuators without use of a central model [6–8]. This flavour of robotics shares a common theoretical ground and has strong parallels with ecological psychology.
Ecological psychology, as developed by Gibson [9–11], affirms that animal and environment constitute a functional unit, a coalition: “animal and environment make an inseparable pair. Each term implies the other. No animal could exist without an environment surrounding it. Equally, although not so obvious, an environment implies an animal to be surrounded” [11]. In the ecological view, “a coalition is not a system plus context, but the minimal system that carries its own context” [12]. Therefore, if laws of behaviour exist, they are to be found at the ecological scale, that is, in the animal-environment coalition.
Introducing the environment to understand animal behaviour vastly changes the scientific challenge. Because the environment is highly structured due to natural laws, animals do not need explicit knowledge inside their heads to develop meaningful behaviour; they just need to be tightly coupled to their environments. In a nutshell, it is easier to put the organism into the environment than to put the environment inside the organism. The general strategy of ecological psychology is to push explanations of cognition based on natural law as far as possible [13]. As described in [14], this strategy reduces or even replaces the need for more traditional explanations in terms of symbolic and representational processes, which are derived from the digital-computer metaphor.
Both behaviour-based robotics and the ecological approach eschew explanations based on central models in favour of distributed control, and both do so by focusing on the action capabilities of the agent and on the environment. Given these parallels, in the present contribution we set out to review aspects of the ecological approach that we consider relevant to roboticists. Reviewed concepts include higher-order variable, invariant, active information pick up, direct perception, control law, and prospective control. Representative examples of psychological research are included, and the work of Duchon and colleagues [13] is featured as an example of how these concepts can be applied in the construction of robots. In the later part of the contribution, attention is devoted to an ecological approach to learning, referred to as direct learning, and to the dynamics of action.
2. Visual system and direct perception
The ecological theory of visual perception was proposed by Gibson [11], who claimed that perceiving is not about minimizing noise in “sensory channels” in order to build a detailed and reliable representation of the state of affairs in the world. On the contrary, energy arrays in the environment are highly structured due to constraints that limit their possible states, and hence make perception possible. For instance, a necessary condition for visual perception is the existence of the laws of optics, which determine how a ray of light will behave under different circumstances. If the laws of optics did not hold, visual perception would not be possible.
Many different constraints lawfully determine a wide set of relationships among measurable magnitudes in environmental energy arrays, some of which are complex to describe (for scientists at least) and extended over substantial spatial and temporal intervals. In the ecological jargon, more sophisticated properties of ambient energy arrays are called higher order variables, in order to contrast them with more elementary ambient energy properties, or lower order variables. Contrary to elementaristic approaches (see [15]), ecological psychology rests on the assumption that higher order variables can be detected without requiring inferences or intermediate symbolic states. No matter how complex the relationships between magnitudes are, perceptual systems may be endowed with the capacity to detect them. Because no cognitive intermediaries are needed in the detection process, perception is said to be direct [16].
Renowned examples of higher order variables are found in optic flow. Optic flow can informally be defined as the movement of the projections of objects and surfaces as seen from a point of observation. Optic flow can be represented with a vector field, with each vector indicating the direction and speed of the movement. If the vectors point outwards, the point of observation approaches a surface; if the vectors point inwards, the point of observation retreats. Furthermore, if approaching, the node without flow from which the vectors emerge, referred to as the focus of expansion, indicates the location toward which the point of observation is heading. A substantial part of the ecological approach is devoted to the study of higher order variables in optic flow, in static optic arrays, and in other ambient energy arrays (related to audition, touch, etc.).
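As a minimal sketch of how the focus of expansion could be recovered from such a vector field, the following Python fragment fits the point through which all flow vectors pass, using least squares. It assumes a purely radial (translational) flow field; the function names and the 2x2 normal-equation formulation are illustrative choices, not a method taken from the reviewed literature.

```python
def estimate_focus_of_expansion(points, flows):
    """Least-squares focus of expansion for a purely radial flow field.

    Each flow vector (u, v) at image point (x, y) is assumed to lie on a
    line through the focus (ex, ey), which gives one linear constraint:
        v * ex - u * ey = v * x - u * y
    """
    # Accumulate the 2x2 normal equations of the over-determined system.
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (x, y), (u, v) in zip(points, flows):
        rhs = v * x - u * y
        a11 += v * v
        a12 += -u * v
        a22 += u * u
        b1 += v * rhs
        b2 += -u * rhs
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("flow field does not constrain the focus of expansion")
    ex = (a22 * b1 - a12 * b2) / det
    ey = (a11 * b2 - a12 * b1) / det
    return ex, ey


def is_approaching(points, flows, focus):
    """True if the vectors point away from the focus (outward flow)."""
    ex, ey = focus
    outward = sum(u * (x - ex) + v * (y - ey)
                  for (x, y), (u, v) in zip(points, flows))
    return outward > 0.0
```

With flow vectors generated as a scaled copy of each point's offset from a chosen focus, the estimate returns that focus, and reversing every vector flips the result of `is_approaching`, matching the outward/inward distinction made in the text.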
Some of the higher order variables in the energy patterns are specific (one-to-one related) to properties in the environment that are relevant to the control of action. Such specifying higher order variables are called invariants. Whenever invariants are detectable in the energy arrays, they will invariably allow for the perception of the relevant environmental property. A canonical example of an invariant is the above-mentioned focus of expansion, which specifies heading direction. Another canonical example is the optical variable τ, which specifies the time to contact between the point of observation and a surface or an object, given that the current velocities remain unchanged [17] (cf. [18]).
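To make the variable τ concrete, the sketch below uses its common formulation as the optical angle subtended by an object divided by that angle's rate of expansion. This is offered under stated assumptions: a small optical angle, a constant closing velocity, and a finite-difference estimate of the expansion rate. The function names are hypothetical.

```python
import math


def optical_angle(size, distance):
    """Angle (rad) subtended at the point of observation by an object."""
    return 2.0 * math.atan(size / (2.0 * distance))


def tau_from_samples(theta_prev, theta_now, dt):
    """Estimate tau = theta / (d theta / dt) from two successive samples.

    Returns infinity when the image is not expanding (no approach).
    """
    theta_dot = (theta_now - theta_prev) / dt
    if theta_dot <= 0.0:
        return math.inf
    return theta_now / theta_dot
```

For a 0.2 m object approached at 2 m/s and sampled 0.01 s apart at distances of 10.0 m and 9.98 m, the estimate is close to 5 s, which is the remaining distance divided by the speed. Neither the distance nor the speed is needed by the observer; both enter here only to generate the optical samples.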
To illustrate the use of these concepts in robotics, let us briefly consider the work of Franceschini et al. [19–21]. Using knowledge about the physiological basis of elementary motion detectors of the housefly, these authors designed small optic flow sensors. They constructed and tested lightweight (tethered) helicopters that were endowed with flow sensors and simple control systems. In contrast to more traditional autopilots that use knowledge about (or representations of) quantities such as groundspeed and flight altitude, the autopilots successfully control aspects of flight (such as the lift [20,21]) on the basis of the detected flow.
At this point it is important to note that animals are not passive: to make their way in the world, they actively explore their environment, seeking possibilities to achieve their goals through actions, and generating sensory flows that can help to control those actions [10]. Therefore, perceiving is not passively receiving invariants that happen to stimulate the sensory surfaces (this could be considered a degenerate case of normal perception), but an active picking up of the invariants. Rather than complicating the task, active perception brings new constraints into the animal-environment system (cf. [22]). Hence, the notion of active perception allows ecological psychologists more opportunities to identify invariants in the ecology of the studied organism.
Methodologically, an ecological psychologist studies the possible informational invariants that guide particular actions, and the constraints that hold during the task and that guarantee the specificity of perceptual invariants. There are many laws (or law-like regularities) that provide constraints to the perception-action loop, and many ways of classifying them. A classification that may be of use for a designer of artificial adaptive systems is based on their degree of generality. We can distinguish: 1) universal constraints, based on mathematical, physical, chemical or biological laws (e.g., gravity, geometrical and physical optics, or physiological laws valid across taxa); 2) ecological constraints, valid for the members of a certain species, such as the construction of the body and the limbs; 3) task-dependent constraints, only valid for the very specific combination of events and layouts present in a task (e.g., predator-prey interactions, experimental manipulations, or the rules of a game).
In contrast to the computational paradigm, the ecological approach assumes that knowledge of constraints is not explicitly encoded in a set of cognitive symbols and abstract rules, but is implicit in the fine-tuned co-organization of environment and agent – their coalition [14]. Although it is logically possible to conceive a symbolic system that behaves indistinguishably from its ecological counterpart, we consider it misleading to include more components than needed in the explanans (in addition to the epistemological puzzles that symbolic theories imply [23,24]). For ecological psychology, perceptual systems are smart devices [25]: they adaptively rearrange biological components to generate soft-assembled dynamical structures that take advantage of constraints in an opportunistic fashion – that is, without requiring explicit knowledge – in order to sustain an adaptive coupling with the environment.
The technical challenge of designing an artefact analogous to a soft-assembled smart device might seem titanic. This is indeed the case. We must not forget the large number of adaptation scales living systems are subject to. Couplings with the environment are shaped by adaptations from the scale of evolution to the scale of perception-action down to yet smaller scales. Given the overwhelming pool of adaptive resources that life is made of, it is not possible to provide an exhaustive account of behaviour with only one of those scales (e.g., perception-action). A more integrative perspective on the concept of adaptation is required. Recent robotic platforms [26,27] may be a good test-bench for this. After addressing perception-action (Section 3), an example of ecological robotics (Section 4), and the notion of prospective control (Section 5), we therefore consider the issues of calibration and learning (Sections 6 and 7).
3. Perception-action loops as control laws
The previous section described reliance on optic flow invariants as a means to control intended actions. Control laws formalize the operation of action upon detected invariants at a convenient level of abstraction. The specific control laws ecological psychologists are interested in are mappings between task-specific informational variables and action variables that describe the observed behaviour [28] (see also the concept of task compliant control in the engineering literature [29]). If behavioural regularities are observed at the selected level of abstraction, then actions depend in a systematic way on information due to the mutual constraints expressed in the control law.
According to what is controlled, there are several possible formulations for these control laws, each one with its strengths and weaknesses. Kinematic control laws are functions that relate informational variables directly to kinematic movement variables (such as position, velocity, or acceleration) and kinetic control laws relate information to the forces that produce the final movement. Influential examples of kinematic control laws can be found in τ coupling theory [30], a theoretical framework that aims to understand movement trajectories as determined by the continuous coupling of two task-relevant and reciprocal τ variables. Despite the wide theoretical use of kinematic control laws, kinetic control laws appear better suited to roboticists because they avoid the often-problematic inverse kinematics problem. The control laws given in Sections 4 and 5 are kinetic control laws.
Both types of control laws (i.e., kinematic and kinetic) are defined at a level of abstraction that spans the entire action system, neglecting the relevance of the action system in shaping behaviour. Dynamic control laws, in contrast, do not directly determine movement parameters, but modulate the dynamics of the action system [31,32], which in turn determine the movement parameters. A more mathematical portrayal holds that dynamic control laws specify a vector field with attractors and repellors on the low dimensional manifolds where the trajectories of the action system unfold, hence generating the actual movement kinetics and kinematics. This is the more complete but also the more challenging approach to the concept of control law. The approach is more challenging because control laws relating information flows and action-system dynamics cannot be directly measured. At least some hidden dynamical variables must be inferred, for instance from behavioural analysis (cf. [31]).
An example of a dynamic control law can be found in [33]. The authors of this article extend an existing dynamical model about the emergence of locomotion from the interaction of neural, musculo-skeletal, and environmental elements [34]. The extension consists of the online coupling of a model parameter, related to step length, to a higher order informational variable related to the adequacy of the current step length. We return to issues concerning action dynamics in Section 8.
4. A robotic implementation of visually guided locomotion
The work of Duchon et al. [13,35,36] illustrates how robotic systems can be developed on the basis of concepts from ecological psychology. These authors built robots that produce basic locomotive behaviours using kinetic control laws defined with optic flow variables. Their robots avoid obstacles. A simple strategy to regulate such behaviour is to equate the optic flow in the lateral portions of both hemifields. This strategy has been shown to be used by bees flying down a corridor [37]. Several control laws can be given that are related to this strategy. One of these is:
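A normalized form consistent with the symbols defined next is given here. It is reconstructed from those definitions and from the reference to a "ratio", so the exact layout is an assumption:

```latex
\[
\Delta(F_L - F_R) \;=\; k \,
\frac{\sum \lVert \vec{w}_L \rVert - \sum \lVert \vec{w}_R \rVert}
     {\sum \lVert \vec{w}_L \rVert + \sum \lVert \vec{w}_R \rVert}
\]
```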
In this equation, Δ(F_L − F_R) is the difference between the forces applied to the two sides of the agent's body, k is a constant, and Σ‖w_L‖ and Σ‖w_R‖ are the sums of the lengths of the optic flow vectors on each side of the focus of expansion (each side of the middle of the field of view, in the case of [36]). Note that this strategy exploits constraints on optic flow due to the relative distance of objects: closer objects give rise to stronger optic flow, and, for similarly sized objects, they take up more space on the optic array, biasing the ratio towards their associated flow. This causes the robot to turn away from nearby objects, without turning into another object.
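The balance strategy can be sketched in a few lines of Python. The sketch assumes that the law takes the form of a normalized left-right difference of summed flow magnitudes, scaled by k. The function name, the default value of k, and the handling of an empty flow field are illustrative assumptions rather than details of the implementations in [13,36].

```python
import math


def balance_force_difference(left_flow, right_flow, k=1.0):
    """Force difference between the two sides of the agent.

    left_flow, right_flow: iterables of (u, v) optic flow vectors detected
    on each side of the focus of expansion (or of the image midline).
    A positive result means more flow on the left, so more force is applied
    on the left side and the agent turns away from the nearer surface.
    """
    left = sum(math.hypot(u, v) for u, v in left_flow)
    right = sum(math.hypot(u, v) for u, v in right_flow)
    total = left + right
    if total == 0.0:
        return 0.0  # no detectable flow: nothing to balance (assumption)
    return k * (left - right) / total
```

Because the difference is divided by the total flow, the output stays within [-k, k] regardless of the agent's speed, which affects all flow magnitudes alike.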
Duchon et al. [13] describe implementations of this control law on two robots. The smaller robot was endowed with a camera with a horizontal field of view of 60°, with the ability to detect four frames per second of a 128×32 flow field. The larger robot had a field of view of 110° and was able to detect 10 frames per second, of a 128×92 flow field. This allowed the smaller robot a maximum safe speed of 4 cm/s and the larger one a maximum safe speed of 30 cm/s. The robots were tested in poorly lit and cluttered environments, with tables and chairs, people, and wires on the floor. Two “emergency reflexes” were added to the smaller robot because it sometimes navigated in the challenging light conditions under the table tops. An intensity reflex stopped and turned the robot when no flow was detected (either too dark or too bright), and an immediacy reflex produced the same response when a detected τ specified that a collision was imminent (remember that τ is an optic flow variable related to time-to-contact). In the larger robot, in addition to the main control law, a control law that regulated speed as a function of the total magnitude of the optic flow was implemented.
Both robots were able to wander through their environments, demonstrating the utility of the control law. Interesting results were also obtained in simulation experiments, which allow more convenient explorations of the space of possible behaviours, and of the relationships between parameters and behaviours. For instance, in simulations of a single agent moving at a fixed speed in a fixed environment, the emergent behaviour converged toward one of a few attractors, even though the simulations started from random positions and orientations. Adding noise to the system generated transitions between the attractive paths. Variations of the field of view dramatically affected performance. Narrow fields (60°) resulted in less adapted behaviour, with the agent often getting stuck in a corner. Fields wider than 180° also resulted in degraded performance because the agent tended to respond to obstacles that were already avoided. Varying the speed of the agent changed the detection of passability of gaps. The faster the agent, the more conservative with respect to gaps. Duchon et al. [13] also describe the implementation of control laws that use optic flow information to allow the chasing of and fleeing from moving targets (cf. [35]).
It is worthwhile to emphasize that the movement of the robots in the experiments by Duchon et al. emerges on-line from the agent-environment interaction. The robots do not plan paths in advance. They do not apply internal algorithms that optimize the length, energy expenditure, safety, or other aspects of the paths. They do not follow a scheme in which perception, cognition, and action are independent and sequential processes. The robots do not have internal maps, and no internal entities can usefully be interpreted as symbolic representations of, say, objects, other robots, or the robots themselves. These aspects make the systems consistent with ecological psychology and behaviour-based robotics, and set them apart from a substantial number of other theories and approaches.
5. Prospective control
So far in this article we have described the ecological concepts of higher order information, invariant, and control law, and we have seen how such concepts can be applied in robotics. The general ecological argument is that the more useful the detected information, the less the need for hypothetical internal processing. A particular type of internal processing that can be obviated in this way is the (often computationally intensive) compensation of delays through predictive processing. The part of the ecological theory that concerns such issues makes use of the concept of prospective control.
In environments with moving objects, action is often controlled with respect to future states of the environment. For instance, in catching fly balls in baseball, catchers do not run towards the current location of the ball, but towards a future location where the ball may be intercepted (e.g., in many cases they run backwards while the current location of the ball is in front of them). To accommodate such findings, a broad class of theories, referred to as predictive theories, assume that agents detect information about current states, apply internal simulation models to predict future states, and then control action using the internal predictions of the future states [38]. This type of control is thought to be necessary especially for controllers with sensory-motor delays, because, the argument goes, such controllers would otherwise apply forces that are computed on the basis of states of the environment that are in fact past states at the time that the forces are applied.
The theory of prospective control does not accept this argument. Prospective information can be defined as information that specifies future states of the environment, future states of the agent, or future states of the agent-environment relation [39]. An example of prospective information is the above-described focus of expansion of the optic flow. This higher order variable specifies at which point the agent will contact the approached surface if the current movement directions remain unchanged. If the agent detects prospective information, it perceives future states. Perceiving future states, in turn, obviates the need to obtain the future states through predictive internal simulations.
As a second example of prospective information we can consider an approximately-sinusoidal environmental process. The state of such a process 0.1 s into the future can be predicted/approximated from the current position and velocity, using, say, an internal Euler simulation. Alternatively, one may detect the fractional derivative of the sinusoidal signal. A fractional derivative of non-integer order α of a function f(t) is an extension of the usual concept of derivative and can be denoted as f^(α)(t). Although fractional derivatives may perhaps be more difficult to detect than regular first-order derivatives, it can be shown that fractional derivatives specify future states of sinusoidal processes [40]. Detecting fractional derivatives hence allows agents to control action with regard to future states of the environment, without need of Euler-like internal predictions.
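The claim can be illustrated with a short worked equation. It assumes a pure sinusoid of amplitude A and angular frequency ω, and a fractional derivative whose lower terminal lies at minus infinity (the Liouville-Weyl definition). Both are simplifying assumptions introduced here, not details taken from [40].

```latex
\[
f(t) = A\sin(\omega t)
\quad\Longrightarrow\quad
f^{(\alpha)}(t) = A\,\omega^{\alpha}\sin\!\left(\omega t + \frac{\alpha\pi}{2}\right)
= \omega^{\alpha}\, f\!\left(t + \frac{\alpha\pi}{2\omega}\right)
\]
```

Under these assumptions the derivative of order α is a scaled copy of the signal as it will be απ/(2ω) seconds later. For a 1 Hz process (ω = 2π rad/s), a look-ahead of 0.1 s corresponds to α = 2ω(0.1)/π = 0.4.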
Among the more substantial bodies of research on prospective control is the work related to the optical acceleration strategy (e.g., [41–43]) and to the bearing angle strategy (e.g., [44,45]). The bearing angle strategy holds that in order to intercept horizontally moving targets, humans and other species move so as to cancel changes in the bearing angle, which is defined as the angle between the direction of movement of the agent and the current direction of the to-be-intercepted target as seen from the perspective of the agent. It has been suggested, to give two examples, that such a strategy may be applied by dragonflies in the pursuit of prey [46] and by drivers in the crossing of intersections (interpreted as the interception of moving traffic gaps [47]). The bearing angle strategy is ecological (and prospective) because it does not require predictions about where and when the interception takes place. Instead, the agent detects and uses information that specifies whether the current agent-environment relation, in the absence of changes in speeds and directions, leads to a future interception.
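A small numerical sketch may help to show why a constant bearing angle specifies a future interception. It assumes straight-line motion at constant velocity for both agent and target in a horizontal plane. The function names and the sampling scheme are hypothetical.

```python
import math


def bearing_angle(agent_pos, agent_heading, target_pos):
    """Angle between the agent's movement direction and the line of sight
    to the target, wrapped to the interval (-pi, pi]."""
    dx = target_pos[0] - agent_pos[0]
    dy = target_pos[1] - agent_pos[1]
    angle = math.atan2(dy, dx) - agent_heading
    return math.atan2(math.sin(angle), math.cos(angle))


def bearing_history(agent_pos, agent_vel, target_pos, target_vel, dt, steps):
    """Bearing angles sampled while both move at constant velocity."""
    heading = math.atan2(agent_vel[1], agent_vel[0])
    history = []
    for i in range(steps):
        t = i * dt
        agent = (agent_pos[0] + agent_vel[0] * t,
                 agent_pos[1] + agent_vel[1] * t)
        target = (target_pos[0] + target_vel[0] * t,
                  target_pos[1] + target_vel[1] * t)
        history.append(bearing_angle(agent, heading, target))
    return history
```

An agent starting at (0, 0) with velocity (0, 1) and a target starting at (-4, 4) with velocity (1, 0) meet at (0, 4) after 4 s, and the sampled bearing angle stays constant until then. Halving the agent's speed makes the bearing angle drift, which is the change the agent would have to cancel by adjusting its speed.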
The optical acceleration strategy concerns the interception of fly balls. Imagine a fly ball that requires only forward-backward movement of the agent in order to be intercepted (i.e., no lateral movement). Given the constraint that terrestrial gravity is approximately constant, it can be shown that the current speed of the agent leads to interception if and only if the tangent of the angle between the horizontal and the line from the agent to the ball increases at a constant rate [41]. In other words, this variable indicates whether or not the agent needs to change his or her speed: if the optical acceleration is nonzero, the agent needs to accelerate or decelerate. Evidence indicates that human behaviour is indeed consistent with an optical acceleration strategy [42,43]. An example of how this type of research may inspire roboticists can be found in [48]. We now turn to the issue of learning.
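The optical acceleration criterion can be checked numerically. The sketch assumes a drag-free ball launched from the origin, an agent that moves along the same line ahead of the ball at constant speed, and an eye at ground level. The parameter names are illustrative.

```python
def tangent_of_elevation(t, ball_vx, ball_vy, agent_x0, agent_speed, g=9.81):
    """Tangent of the angle between the horizontal and the agent-ball line."""
    ball_x = ball_vx * t
    ball_y = ball_vy * t - 0.5 * g * t * t
    agent_x = agent_x0 + agent_speed * t
    return ball_y / (agent_x - ball_x)


def optical_acceleration(t, dt, **params):
    """Second time derivative of the tangent, by central differences."""
    before = tangent_of_elevation(t - dt, **params)
    now = tangent_of_elevation(t, **params)
    after = tangent_of_elevation(t + dt, **params)
    return (after - 2.0 * now + before) / (dt * dt)


def interception_speed(ball_vx, ball_vy, agent_x0, g=9.81):
    """Constant speed that brings the agent to the landing point on time."""
    flight_time = 2.0 * ball_vy / g
    landing_x = ball_vx * flight_time
    return (landing_x - agent_x0) / flight_time
```

When the agent runs at the speed returned by `interception_speed`, the tangent grows linearly in time and the optical acceleration is zero up to rounding error. When the agent stands still short of the landing point, the optical acceleration is positive, indicating that a change of speed is required.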
6. Learning of visually guided action
A kinetic control law, as described in Section 3, can be formalized as follows: F=f(I), in which F is the exerted force, f is a single-valued function, and I is the operative informational variable. Such control laws should not be considered as hard-wired and unchangeable. On the contrary, empirical evidence indicates that perceivers change in which informational variable they use and in the single-valued function that maps the used variable to force [49–51]. In laboratory tasks, it is possible to impose constraints that turn highly useful variables into less useful ones, and vice versa. In such situations participants tend to converge towards the variables that are useful in their specific experimental environment. In the Gibsonian tradition this kind of adaptation is called the education of attention [10], because participants change with respect to what higher order informational variable they attend to.
For the sake of simplicity, let us assume that F=k*I. In other words, we assume that the exerted force, F, is directly proportional to the detected variable, I. In this portrayal, in addition to I, the value of k is subject to adaptation: one may be attending to a useful informational variable but not using it correctly to achieve a successful action. This kind of adaptation, called calibration, has been observed in several empirical studies (e.g., [50]). The education of attention and calibration are adaptive processes that ensure the specificities required by control laws to generate meaningful behaviour, from birth till death. For novel tasks – which imply new sets of constraints – some invariants are no longer useful, while other invariants appear. The role of the education of attention and calibration is to rearrange the system on a slower timescale in order to maintain the necessary specificities on the faster timescale of the perception-action control laws.
There is evidence that suggests that changes due to the education of attention are continuous rather than discrete. Usually organisms do not jump from the use of one informational variable to the use of a completely different one. Rather, the new variables used by participants tend to produce only slightly different outcomes, which is to say that organisms change to variables that have similar levels of utility. To account for this continuity, the authors of [52] proposed that informational variables can be portrayed as forming a continuous space, referred to as information space (cf. [51]). When appropriately defined, such a space should contain all the variables that individuals exploit for a task. Each point in the space represents a different informational variable, and each variable offers a different level of support to performance.
To illustrate the notion of information space, we consider a recent study on learning to balance a pole attached to a cart that can be moved in one dimension [40]. This cart-pole study also illustrates the concepts of control law and prospective control. The starting point of the work is the assumption that the manual control of the cart depends on visual information, namely, optical correlates of the angle of displacement of the pole from the vertical, θ, and its rate of change, θ̇. To obtain a one-dimensional information space that includes these variables, the notion of fractional derivative is used (see previous section). The fractional derivative of order α of θ is denoted as θ^(α). The closer the value of α to 0 or 1, the more similar the variable to θ or θ̇, respectively. The parameter α can be interpreted as the coordinate of a one-dimensional space, the information space, and changes in which informational variable an individual uses can be tracked as a movement in the space.
This information space, and the additional assumption of a constant perceptuo-motor delay, d, allows the formulation of the following control law: F(t) = k·θ^(α)(t − d), where the force at time t is said to be specific to the α-th fractional derivative of θ d seconds before. Performance of individuals, as described by this control law, is determined by a point in the information space and a calibration constant. Thus, the performance of participants at any time during learning can be described as a point in a joint information-calibration space (α,k).
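One way to sketch such a delayed fractional-derivative control law in code is with the Grünwald-Letnikov approximation, which builds the derivative of order α from weighted past samples. This is an illustrative discretization chosen here, not necessarily the numerical method used in [40]. The function names, the sampling interval h, and the use of the whole available history are assumptions.

```python
def gl_weights(alpha, n_terms):
    """Grünwald-Letnikov weights w_0..w_n for a derivative of order alpha."""
    weights = [1.0]
    for j in range(1, n_terms + 1):
        weights.append(weights[-1] * (1.0 - (alpha + 1.0) / j))
    return weights


def fractional_derivative(samples, alpha, h):
    """Order-alpha derivative at the newest sample.

    samples: values spaced h seconds apart, oldest first.
    alpha = 0 returns the newest sample; alpha = 1 returns the backward
    difference, so alpha moves the variable between theta and its rate.
    """
    weights = gl_weights(alpha, len(samples) - 1)
    total = sum(w * x for w, x in zip(weights, reversed(samples)))
    return total / h ** alpha


def control_force(theta_history, alpha, k, delay_steps, h):
    """F(t) = k * theta^(alpha)(t - d), with d = delay_steps * h."""
    if delay_steps >= len(theta_history):
        raise ValueError("not enough history to cover the delay")
    delayed = theta_history[: len(theta_history) - delay_steps]
    return k * fractional_derivative(delayed, alpha, h)
```

In this sketch a point (α, k) in the joint information-calibration space is simply the pair of arguments passed to `control_force`. The education of attention corresponds to changing `alpha`, and calibration to changing `k`.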
Participants in the experiment reported in [40] practised the pole-balancing task until their performance reached a criterion level. The movement of the cart-pole system was registered, and the equations of motion of the system were used to compute the forces exerted during the balancing. These forces allowed the authors to determine the locus in the information-calibration space that best explained performance, for each individual and at each moment of the learning process. Learning indeed went together with movements in the space, indicating changes in which variables were used (education of attention) and changes in the value of k (calibration). More precisely, this portrayal showed a narrowing of the distributions of the loci of participants in the space. That is, smaller ranges of the parameters were used as the learning progressed. Also shown was a tendency in the distributions to gradually shift toward lower values of α. Simulations confirmed that this movement in the space corresponded with a movement towards the more useful loci (i.e., towards the loci that allowed more stable performance).
These results can be interpreted in terms of prospective control. Controlling a cart-pole system on the basis of θ is difficult because even a small perceptual-motor delay makes the control unstable. Fractional derivatives of θ, however, approximate the value of θ a small time interval into the future. Controlling the cart-pole system with a perceptual-motor delay and on the basis of a fractional derivative is therefore, to some extent, equivalent to a hypothetical control based on a delayless use of θ. This experimental work is hence an illustration of the fact that prospective information may obviate the internal processing that is often supposed to be necessary to compensate perceptual-motor delays.
To summarize, instead of scrutinizing hypothetical internal processes, the ecological approach devotes resources to the search for higher order variables, and to experimental tests of whether or not organisms use these variables. Particularly useful are prospective variables. The ecological research programme may inspire roboticists to devote resources to the development of sensors that detect higher order variables. Furthermore, the psychological research indicates that learning can at least partly be understood as changes in which informational variables are detected. For roboticists this implies that in order to be biologically plausible, it may be necessary that learning be at least partly implemented as change in the sensors that detect higher order variables. The next section provides more detail on how such change is effected.
7. Direct learning
Control laws allow action to unfold without need of internal cognitive processes such as path planning, and without symbolic representations of objects in the environment. This is because the moment-to-moment action, rather than being based on previously planned paths, is a direct function of higher order information. In the previous section it was argued that control laws are continuously adapted due to changes in variable use and calibration. One might wonder, then, whether in contrast to the action itself, these changes in the control laws require internal processes and representations. That is, do education of attention and calibration require cognitive/symbolic mechanisms to decide when and how to change variable use and calibration? The theory of direct learning claims that this is not the case [52,53].
The reasoning behind the theory of direct learning is analogous to the reasoning behind the theory of direct perception, which holds that perception is specific to higher order information instead of being an internal construal achieved by symbolic processes. Hence, the theory of direct learning claims that changes due to learning are specific to higher order information. As do other parts of the ecological theory, this claim implies a methodological doctrine for those who aim to understand learning: rather than devoting research efforts to the elaboration of hypothetical internal processes that supervise learning, the theory of direct learning suggests such efforts be devoted to searching for sources of information that guide learning. We next describe a study by Bruggeman et al. [54] as an example that may be interpreted from the framework of direct learning.
Bruggeman et al. [54] performed their study in a large virtual reality environment. In the normal use of this virtual environment, the actual walking direction of participants is registered and converted without manipulation into a walking direction in the virtual world. In the experiment reported in [54], however, this relation was purposely manipulated by displacing the virtual walking direction 10° to the right. Initially, the walking paths of individuals showed substantial deviations in the direction of the 10° displacement, but, as participants perceived and acted in the manipulated environment, the deviations became progressively smaller. From a direct learning perspective two questions arise. First, what is the change in the operative informational variables (education of attention) and/or in the control laws (calibration) that underlies the reduction of the deviations? And second, what is the information for learning that specifies such change?
The research of Bruggeman et al. [54] relates to the second question. Consider a first hypothesis. Hypothetical individuals that are perfectly adapted are expected to walk straight towards virtual targets. Curved paths are expected, in contrast, for individuals who walk through the manipulated virtual world with a deviation to the right. This is so because, without changing the direction of walking, these individuals would pass the target on the right, and would thus see the target drifting to the left. To cancel the drift such individuals continuously turn to the left. In sum, walking on straight trajectories is indicative of a well-adapted visuo-motor system, and walking on curved trajectories is indicative of an ill-adapted visuo-motor system. As a consequence, establishing an appropriate coupling between the amount and direction of curvature and the change in the visuo-motor system leads to a system that sooner or later converges to being well adapted. In the jargon of the direct learning theory, one would say that the curvature (detectable by vision, proprioception, or both) is a candidate variable that may serve as information for learning.
A second hypothesis considered by Bruggeman et al. [54] relates to the above-described focus of expansion of the optic flow. For a well-adapted individual who walks straight towards the virtual target, the focus of expansion remains on the target. In contrast, for an individual who walks through the virtual world according to the 10° mismatch of the physical and virtual walking directions, the focus of expansion is located 10° to the right of the target. This means that a mismatch between the focus of expansion and the virtual target is indicative of an ill-adapted perceptuo-motor system, and hence that this higher-order flow variable may guide the adaptation. The research reported in [54] provides empirical support for the second hypothesis: the control of walking is adapted based on information related to optic flow more than on information related to trajectory curvature.
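The coupling between an informational variable and the change in the visuo-motor system can be sketched in a few lines. The simulation below is a minimal illustration under stated assumptions, not a model of the data in [54]: the walker is a point moving at constant speed, the focus of expansion is taken to lie in the direction of virtual travel, and the linear learning rule, the `learning_rate` value, and all function and parameter names are hypothetical. Angles are counter-clockwise positive, so a displacement to the right is negative.

```python
import math


def simulate_adaptation(displacement_deg=-10.0, learning_rate=0.5,
                        speed=1.0, dt=0.05, target=(0.0, 20.0),
                        max_steps=10000):
    """Walk to a target while a calibration offset is driven by the
    mismatch between the focus of expansion and the target (assumed rule)."""
    x, y = 0.0, 0.0
    calibration = 0.0                          # learned offset, in radians
    displacement = math.radians(displacement_deg)
    mismatches_deg = []
    for _ in range(max_steps):
        if math.hypot(target[0] - x, target[1] - y) <= 0.5:
            break                              # close enough to the target
        bearing = math.atan2(target[1] - y, target[0] - x)
        # The walker aims at the target using its current calibration;
        # the virtual world then adds the imposed displacement.
        travel = bearing + calibration + displacement
        # Focus of expansion (travel direction) relative to the target.
        mismatch = travel - bearing
        mismatches_deg.append(math.degrees(mismatch))
        # Information for learning: the mismatch itself drives the change.
        calibration -= learning_rate * mismatch * dt
        x += speed * math.cos(travel) * dt
        y += speed * math.sin(travel) * dt
    return mismatches_deg


mismatches = simulate_adaptation()
print(f"initial mismatch: {mismatches[0]:.2f} deg")
print(f"final mismatch:   {mismatches[-1]:.4f} deg")
```

In this sketch no internal process decides when or how to recalibrate: the detected mismatch is coupled directly to the change in the offset, and the mismatch shrinks as a consequence. Replacing the mismatch by a measure of path curvature would give the corresponding sketch for the first hypothesis.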
As interpreted from the direct learning perspective, the study reported in [54] is an example of how researchers may search for information that specifies change due to learning. Other experiments that aim to reveal information for learning can be found in [55] and [56]. For roboticists it is of interest to identify candidate variables that may serve as information for learning. Detecting such variables and coupling them to the change in perceptuo-motor systems may lead to robots that adapt to changes in the constraints in the environment in which they operate (and/or to changes in the robots themselves). In addition, if the aim is to achieve biologically plausible robots, then the detected information for learning should correspond to candidates that have been supported by psychological research. Note again that, instead of being planned by a cognitive/symbolic system (inspired by the computer metaphor [14]), such learning lawfully emerges from the on-line interaction of agent and environment.
8. Dynamics of the action system: the Bernstein perspective
One of the aspects that behaviour-based robotics and ecological psychology have in common is the considerable attention paid to adaptive processes. In Section 6 we described the calibration and the education of attention. With a formulation of a control law as F=f(I), these processes correspond to changes in the single valued function, f, and in the operative informational variable, I, respectively. Although our examples imply definitions of the exerted force, F, we did not consider details about how biological agents generate such forces. Neither did we consider learning processes that concern the way in which the forces are generated. It may be useful for roboticists, however, to be aware that action is currently conceived as being to a large extent emergent and self-organized. In this section we briefly indicate a few aspects of the large and active research field on the dynamic organization of action.
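The distinction between the two adaptive processes can be made explicit by writing the control law F=f(I) as a function assembled from two parts. The sketch below is illustrative only: it assumes a linear f with a single gain k, and the observation keys `theta` and `theta_dot`, the two candidate variables, and all numerical values are hypothetical.

```python
from typing import Callable, Dict

Observation = Dict[str, float]
Variable = Callable[[Observation], float]


def make_control_law(variable: Variable, k: float) -> Callable[[Observation], float]:
    """Return a control law F = f(I), with f assumed linear: F = k * I."""
    def law(obs: Observation) -> float:
        return k * variable(obs)
    return law


# Two candidate informational variables I (hypothetical examples).
def pole_angle(obs: Observation) -> float:
    return obs["theta"]


def angle_plus_velocity(obs: Observation) -> float:
    return obs["theta"] + 0.5 * obs["theta_dot"]


obs = {"theta": 0.1, "theta_dot": 0.4}

initial = make_control_law(pole_angle, k=2.0)
# Calibration: same variable I, different function f (here a new gain k).
recalibrated = make_control_law(pole_angle, k=3.0)
# Education of attention: same f, different operative variable I.
reattuned = make_control_law(angle_plus_velocity, k=2.0)

for name, law in [("initial", initial),
                  ("recalibrated", recalibrated),
                  ("reattuned", reattuned)]:
    print(name, law(obs))
```

Keeping `variable` and `k` as separate arguments mirrors the separation drawn in the text: a learning process may act on either one without touching the other.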
The production of skilled action requires the coordination of large numbers of elements in the musculoskeletal system. Elements typically considered include joints, entire muscles, and smaller muscle units. Nikolai Bernstein [57] formulated this control problem as the now well-known “degrees of freedom” problem: a control system has to select, within the many-dimensional motor system, a specific combination of activations of effectors to produce a desired movement, which is low-dimensional compared to the movement apparatus itself. Because the effectors have many more dimensions than the task, there are many equivalent solutions for each intended trajectory. This redundancy in degrees of freedom is a resource for the simultaneous flexibility and stability achieved by the motor system. The redundancy facilitates motor learning, seen as exploration of adaptive possibilities within the high dimensional action system.
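A toy case makes the redundancy tangible. The sketch below assumes a planar arm with three revolute joints and unit link lengths reaching for a two-dimensional target, which leaves one degree of freedom unconstrained by the task; the arm geometry, the target, and the function names are illustrative assumptions rather than anything taken from [57].

```python
import math

LINKS = (1.0, 1.0, 1.0)      # assumed link lengths
TARGET = (1.5, 1.0)          # assumed fingertip target


def forward(joints):
    """Fingertip position of the planar arm for relative joint angles."""
    x = y = angle = 0.0
    for length, joint in zip(LINKS, joints):
        angle += joint
        x += length * math.cos(angle)
        y += length * math.sin(angle)
    return x, y


def solve_with_free_first_joint(target, q1):
    """Pick the first joint freely, then solve the remaining two joints."""
    l1, l2, l3 = LINKS
    # Position of the end of the first link.
    ex, ey = l1 * math.cos(q1), l1 * math.sin(q1)
    dx, dy = target[0] - ex, target[1] - ey
    dist2 = dx * dx + dy * dy
    cos_q3 = (dist2 - l2 * l2 - l3 * l3) / (2.0 * l2 * l3)
    if abs(cos_q3) > 1.0:
        return None          # target unreachable for this choice of q1
    q3 = math.acos(cos_q3)
    # Absolute orientation of the second link, then its relative angle.
    phi = math.atan2(dy, dx) - math.atan2(l3 * math.sin(q3),
                                          l2 + l3 * math.cos(q3))
    return (q1, phi - q1, q3)


solutions = [solve_with_free_first_joint(TARGET, q1) for q1 in (0.2, 0.6, 1.0)]
for joints in solutions:
    print([round(q, 3) for q in joints], "->",
          tuple(round(c, 3) for c in forward(joints)))
```

Each choice of the first joint angle yields a different posture with the same fingertip position, so the task alone does not determine the joint configuration.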
In Bernstein's view [57], the motor system is organized in levels that emerged at different phases of the phylogenetic process. Bernstein distinguished four levels: A) the level of tone, which is the oldest level in an evolutionary sense, with a particular concern for trunk and neck muscles; B) the level of muscular-articular links, or synergies, which, mainly concerned with the extremities, relies on proprioceptive afference to make collections of degrees of freedom act as a single functional unit; C) the level of space, which concerns the integration of movements with the spatial layout of the environment, relying, for instance, on sensory information from visual and auditory systems; and D) the level of action, which is the youngest level in an evolutionary sense and the most typically human level, concerning the intentional control of movements and the framing of individual movements into chains of goal-directed actions. According to Bernstein, movements are controlled by a leading level (in humans typically level C or D), but the leading level crucially relies on lower background levels without being concerned with the aspects of the movements that are addressed by these lower levels.
The dynamical systems approach has provided insight into coordination as the soft assembly of degrees of freedom into low dimensional manifolds [58]. Rather than specifying all the details of the required force-time profile to be exerted by the effectors, controlling a complex dynamical system implies setting a small number of parameters (typically of much lower dimension than the system itself), after which the movement emerges from the interaction with the environment according to natural laws. The action system thus brings its own constraints and interactions to the support of adaptive behaviour, so that cognition is further constrained by natural laws.
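The idea that a controller sets a few parameters and lets the movement emerge can be sketched with a deliberately simple system. The example below assumes a point mass on a damped spring, where the only control input is the spring's resting position; the model, the parameter values, and the function name are illustrative assumptions and are not drawn from [58].

```python
def simulate_reach(equilibrium, stiffness=20.0, damping=9.0, mass=1.0,
                   dt=0.001, duration=3.0):
    """Integrate a damped mass-spring system towards a set equilibrium.

    The controller specifies `equilibrium` only; no force-time profile
    is planned. Positions over time are returned as a list.
    """
    position, velocity = 0.0, 0.0
    trajectory = []
    for _ in range(round(duration / dt)):
        # Forces arise from the system's own dynamics, not from a plan.
        force = -stiffness * (position - equilibrium) - damping * velocity
        velocity += (force / mass) * dt        # semi-implicit Euler step
        position += velocity * dt
        trajectory.append(position)
    return trajectory


trajectory = simulate_reach(equilibrium=0.3)
print(f"final position: {trajectory[-1]:.4f}")
```

A single number, the resting position, stands in for the low-dimensional parameter set mentioned above, while the full time course of force and position follows from the physics of the simulated system.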
This dynamical systems perspective is consistent with Bernstein's portrayal, according to which the leading level does not plan detailed movements. It does not elaborate a program or script with motor commands for the control of all degrees of freedom. Rather, the final movement emerges from the broad constraints provided by the leading level together with the labour of the lower levels, the respective sensory corrections of the different levels, and the inertial and reactive forces of the body and the environment. The emergent aspect of motor control is also consistent with ecological psychology: those actions that emerge from multilevel interactions with constraints from the motor system (as well as from the environment) need not explicitly be computed by symbolic cognitive processes.
Without being exhaustive, let us mention that the contributions of dynamicist research to the understanding of Bernstein's levels and their interactions include the threshold control hypothesis (also known as equilibrium point theory [59]), the uncontrolled manifold hypothesis [60], the referent configuration hypothesis [61,62], and the tensegrity hypothesis [63,64]. We mention this research because it provides elegant accounts for the amazing complexity of the control of action, and thereby complements the focus of the present article, which gravitated towards the perceptual side of perception-action systems. We also find it interesting to speculate that the direct learning perspective may be elaborated with (and may contribute to) the Bernstein perspective, raising the question of how interaction with the environment generates different types of information for learning that may specify the different changes due to learning that occur, in parallel, at the different levels of the construction of movement.
9. Conclusions
We have outlined how the concepts of information and control as used within ecological psychology may be useful for the design of artificial systems with robust and flexible behaviour, which is to say, behaviour adapted to the environment. This is an objective shared with the approach called behaviour-based robotics. Both ecological psychology and behaviour-based robotics stress that laws of behaviour are to be found in the continuous coupling of agent and environment. This scale of analysis is termed the ecological scale.
Energy in the environment is not randomly distributed but highly structured due to natural laws. Thus, animals do not need explicit knowledge of the state of affairs in the world to develop adaptive couplings. Rather, they need to establish couplings between the action parameters and the higher order invariants in environmental energy fluxes. This type of control is called information-based control, and requires that the agent is capable of detecting the relevant information to control the intended actions. Systems capable of detecting such variables are called smart perceptual systems.
Information-based control can be formalized with control laws in which the controlled action variable and the detected invariants are in a one-to-one mapping. Once a goal is chosen, behaviour unfolds lawfully in a direct loop of perception-action, that is, without meaningful internal states. The work presented in [13] is a relevant example of an application of information-based control to robotics. Another essential feature of information-based control is prospectivity. Computationalist theories often assume that control is predictive to account for goal-directedness. Their decision and control algorithms require future states of the environment as input, typically obtained through predictive internal processes. In contrast, prospective control is based on prospective invariants that specify either future states of the environment or whether the currently ongoing interactions can achieve the desired goal.
The notion of information space implies the consideration of a continuous space that includes informational variables available in the environment. Information spaces may be useful to account for the continuous changes in performance that can be observed in animals' behaviour. As an example, we have described a recent study on the informational basis of learning to balance a cart pole [40].
Adaptivity of information-based control is ensured by processes that update control laws with regard to changes in the constraints that hold in a task, or with regard to the acquisition of new skills. The theory of direct learning affirms that there is information for learning in the consequences of our actions, that this information specifies the changes required in the control laws, and that animals are capable of detecting this information. This information specifies the adaptive directionality for learning in information-calibration spaces, thus relieving the agent of the burden of internal inferential processes that would keep track not only of its online control, but also of the course of its learning.
10. Acknowledgments
This material is based upon work supported by grant FFI2009-13416-C02-02 of the Spanish Ministry of Science and Innovation. The funders had no role in the decision to publish or in the preparation of the manuscript.
