Introduction
In recent years, the International Journal of Advanced Robotic Systems, under the Topic of Vision Systems, has especially welcomed papers that cover any aspect of biologically inspired vision in robots. 1 As Guest Editors of the Special Issue on “Biologically Inspired Vision Systems in Robotics,” we feel that living beings still have much to tell us about the design and development of robotics.
In fact, biologically inspired robotics is an area of growing research and development. Drawing on nature’s successful strategies, robotics researchers seek to understand the sensory aspects required to mimic nature’s designs with engineering solutions. 2,3
Regarding the relation between Robotics and Biology, let us highlight that mammalian nervous systems exhibit complex computational functions, including sensory functions, motor functions, and, in humans, abstract thought. In particular, the pattern recognition exhibited in our olfactory, visual, and auditory functions is of great interest to the electronic and computing communities. Meanwhile, several approaches attempt to mimic or substitute sensory or neural elements (missing congenitally or because of pathological processes) in order to enable or restore function by establishing neuro-electronic interfaces.
Thus, biologically inspired vision systems can provide considerably more computational power and efficiency to artificial vision systems and autonomous robot guidance. Furthermore, this extraordinary biological connectivity is coupled with natural unsupervised learning based on varying connective efficiency.
This Special Issue has attracted a number of papers that show the benefits of biologically inspired vision systems in advanced robotics. The topics of this Special Issue explicitly included (but were not limited to) the following aspects: early vision in robotics; bioinspired computer vision; bioinspired hardware/software design for robotics; bioinspired segmentation and grouping; bioinspired motion and tracking; bioinspired recognition and classification; bioinspired scene analysis and understanding; bioinspired stereo and active vision; and bioinspired visual navigation and simultaneous localization and mapping.
The papers
In the end, six papers were accepted for this Special Issue on “Biologically Inspired Vision Systems in Robotics.” This section offers an overview of each of them.
The first paper 4 introduces a proposal for pedestrian detection in traffic scenes based on a weakly supervised hierarchical deep model. Firstly, a traditional one-dimensional deep belief network is extended to two dimensions (2-D), which allows the image matrix to be loaded directly and so preserves more information about the sample space. Then, a determination regularization term with a small weight is added to the traditional unsupervised training objective function. With this modification, the original unsupervised training becomes weakly supervised training, which in turn gives the extracted features discriminative ability. Multiple sets of comparative experiments show that the proposed algorithm performs better than other deep learning algorithms in terms of recognition rate. It outperforms most existing state-of-the-art methods on the nonocclusion pedestrian data set while performing fairly on the weakly and heavily occluded data sets.
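As a schematic illustration only, the modification described above can be written as an unsupervised objective plus a lightly weighted supervised penalty. The symbols below (the parameters \theta, the unsupervised objective J_unsup, the regularization term R, the inputs X, the labels Y, and the weight \lambda) are placeholders introduced here and are not the notation of the paper.

```latex
\[
J_{\text{weak}}(\theta) \;=\; J_{\text{unsup}}(\theta) \;+\; \lambda\, R(\theta;\, X, Y),
\qquad 0 < \lambda \ll 1
\]
```

Setting \lambda = 0 recovers purely unsupervised training; a small positive \lambda lets the label information nudge the learned features without dominating the objective.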
The second article 5 presents an approach to human-inspired visual tracking. The authors propose a tracker inspired by the cognitive psychological memory mechanism, which, as in humans, decomposes the tracking task into a sensory memory register, a short-term memory tracker, and a long-term memory tracker. The sensory memory register captures information with three-dimensional (3-D) perception; the short-term memory tracker builds the highly plastic observation model via memory rehearsal; and the long-term memory tracker builds the highly stable observation model via memory encoding and retrieval. With these cooperative models, the tracker handles various tracking scenarios. In addition, an appearance-shape learning method is proposed to appropriately update the 2-D appearance model and the 3-D shape model. Extensive experimental results on a large-scale benchmark data set demonstrate that the proposed method outperforms state-of-the-art 2-D and 3-D trackers in terms of efficiency, accuracy, and robustness.
Then, a proposal on panoramic camera tracking on future planetary rovers using feedforward control is offered. 6 An essential component of the search for salient features is the locating and tracking of targets at the camera control level. The rover visual system must be able to follow quantified information gradients for smooth tracking in the visual field, with limited information from images and delayed positional feedback caused by the long communication delays inherent to planetary exploration. This paper proposes a control algorithm based on vestibulo-ocular reflexes employed by the human cerebellum. The controller uses a feedback error learning model, which is able to track targets by compensating for the rover motion at the pan–tilt using a network-trained prediction of the pan–tilt dynamics. The feedforward controller proves capable of tracking objects in the visual field, as demonstrated both in simulation and on the Barrett whole arm manipulator.
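To make the idea of feedback error learning more concrete, the following minimal sketch trains a feedforward term using the feedback command as its error signal. It is a toy illustration, not the controller of the paper: the plant is assumed to be a static gain, the feedforward model is a single scalar weight rather than a network, and all names and parameter values are hypothetical.

```python
import math


def train_feedforward_gain(plant_gain=2.0, kp=1.0, learning_rate=0.1, steps=2000):
    """Toy feedback error learning on an assumed static plant y = plant_gain * u.

    The learner never reads plant_gain directly; it only sees the feedback
    command, which it uses as the training error for the feedforward weight.
    """
    w = 0.0  # feedforward weight; the ideal inverse model is 1 / plant_gain
    for k in range(steps):
        r = math.sin(0.1 * k)  # reference signal to be tracked
        u_ff = w * r           # feedforward command from the learned inverse model

        # Simulated plant: the total command is u = u_ff + kp * (r - y) and
        # y = plant_gain * u. Solving this algebraic loop for y gives:
        y = plant_gain * (u_ff + kp * r) / (1.0 + plant_gain * kp)

        u_fb = kp * (r - y)    # feedback command, nonzero while tracking is imperfect

        # Feedback error learning rule: adapt the feedforward weight so that
        # the feedback command is driven towards zero.
        w += learning_rate * u_fb * r
    return w


learned = train_feedforward_gain()
print(f"learned feedforward weight: {learned:.4f}")
```

With the default arguments, the weight approaches 1 / plant_gain = 0.5, at which point the feedforward term alone tracks the reference and the feedback command vanishes.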
The next paper 7 introduces a self-learning method of robotic experience for building an episodic cognitive map using biologically inspired episodic memory. The episodic cognitive map is used for robot navigation under uncertainty. Two main challenges are addressed: high computational complexity and perceptual aliasing. An episodic memory-driving Markov decision process is proposed to simulate the organization of episodic memory by introducing a neuron activation and stimulation mechanism. Uncertain information is considered to improve mapping performance. The presented method performs real-time storage, incremental accumulation, integration, and updating of robotic memory. Based on the episodic cognitive map, the predicted episodic trajectory is computed simply by activation spreading of state neurons. The experimental results for a mobile robot indicate that the method efficiently performs learning, localization, mapping, and navigation in real-life office environments.
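The trajectory prediction by activation spreading can be pictured with a small graph example. The sketch below is a generic spreading-activation planner, not the episodic memory-driving Markov decision process of the paper: it assumes an undirected map given as a symmetric adjacency dictionary, a fixed decay per hop, and hypothetical function and state names.

```python
from collections import deque


def spread_activation(graph, goal, decay=0.8):
    """Spread activation outwards from the goal state, decaying at each hop."""
    activation = {goal: 1.0}
    queue = deque([goal])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            value = activation[node] * decay
            # Keep only the strongest activation that reaches each state.
            if value > activation.get(neighbour, 0.0):
                activation[neighbour] = value
                queue.append(neighbour)
    return activation


def predict_trajectory(graph, start, goal, decay=0.8):
    """Follow increasing activation from the start state to the goal state."""
    activation = spread_activation(graph, goal, decay)
    if start not in activation:
        return []  # the goal's activation never reached the start state
    path = [start]
    while path[-1] != goal:
        neighbours = graph[path[-1]]
        path.append(max(neighbours, key=lambda n: activation.get(n, 0.0)))
    return path


# Hypothetical map: A-B-C-D is a three-hop route, A-E-D a two-hop route.
office_map = {
    "A": ["B", "E"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["A", "D"],
}
print(predict_trajectory(office_map, "A", "D"))  # → ['A', 'E', 'D']
```

Because activation decays with every hop, climbing the activation gradient from the start state yields a shortest route in this unweighted setting.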
The fifth article of the Special Issue proposes an approach to vision-based people detection using depth information for social robots. 8 In the context of interaction between social robots and people, it is crucial that robots are aware of the presence of people around them. Traditionally, people detection has been performed using a flow of 2-D images. However, in nature, animals perceive their surroundings by sight using colour and depth information. In this work, the authors present new people detectors that make use of the data provided by depth sensors and red–green–blue images to deal with the characteristics of human–robot interaction scenarios. The disparity in the input and output data used by these types of algorithms usually complicates their integration into robot control architectures. This article proposes a common interface to be used by any people detector, resulting in numerous advantages. According to the results offered, a clever combination of several algorithms appears to be a promising solution for achieving a flexible, reliable people detector.
The last paper offers a proposition for autonomous positioning control of a manipulator. 9 In order to solve the challenges of real-time calculation of the positioning error, error correction, and state analysis in the process of autonomous positioning, a Kinect depth imaging device is used, and a particle filter based on three-frame subtraction is proposed to capture the end-effector’s motion. Further, a backpropagation neural network is adopted to recognize targets. Point cloud library technology is used to collect the space coordinates of the end-effector and the target. Finally, a 3-D mesh simplification algorithm based on density analysis and the average distance between points is proposed to carry out data compression. Accordingly, the target point cloud is fitted quickly. The experiments conducted in the paper demonstrate that the proposed algorithm detects and tracks the end-effector in real time. Furthermore, the gradual convergence of the end-effector center to the target centroid shows that the autonomous positioning is successful. Compared with traditional algorithms, both the moving end-effector and a stationary object can be extracted from image frames. The computational complexity is reduced and the need for camera calibration is eliminated.
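For readers unfamiliar with three-frame subtraction, the sketch below shows the basic operation on grey-level frames stored as nested lists. It covers only the differencing step, not the particle filter built on top of it in the paper; the function name and the threshold value are assumptions chosen for illustration.

```python
def three_frame_difference(prev_frame, curr_frame, next_frame, threshold=20):
    """Return a binary motion mask for the middle of three consecutive frames.

    A pixel is marked as moving only if it changed both between the previous
    and current frames and between the current and next frames.
    """
    rows, cols = len(curr_frame), len(curr_frame[0])
    mask = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            changed_before = abs(curr_frame[r][c] - prev_frame[r][c]) > threshold
            changed_after = abs(next_frame[r][c] - curr_frame[r][c]) > threshold
            mask[r][c] = 1 if (changed_before and changed_after) else 0
    return mask


# A bright pixel moving one column to the right per frame.
frame_a = [[0, 200, 0, 0, 0]]
frame_b = [[0, 0, 200, 0, 0]]
frame_c = [[0, 0, 0, 200, 0]]
print(three_frame_difference(frame_a, frame_b, frame_c))  # → [[0, 0, 1, 0, 0]]
```

Combining the two difference images with a logical AND suppresses the trailing and leading "ghost" regions that plain two-frame differencing leaves behind, so the mask marks the object's position in the middle frame.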
