A neuromorphic approach to obstacle avoidance in robot manipulation

Abstract

Neuromorphic computing mimics computational principles of the brain in silico and motivates research into event-based vision and spiking neural networks (SNNs). Event cameras (ECs) exclusively capture local intensity changes and offer superior power consumption, response latencies, and dynamic ranges. SNNs replicate biological neuronal dynamics and have demonstrated potential as alternatives to conventional artificial neural networks (ANNs), such as in reducing energy expenditure and inference time in visual classification. Nevertheless, these novel paradigms remain scarcely explored outside the domain of aerial robots. To investigate the utility of brain-inspired sensing and data processing, we developed a neuromorphic approach to obstacle avoidance on a camera-equipped manipulator. Our approach adapts high-level trajectory plans with reactive maneuvers by processing emulated event data in a convolutional SNN, decoding neural activations into avoidance motions, and adjusting plans using a dynamic motion primitive. We conducted experiments with a Kinova Gen3 arm performing simple reaching tasks that involve obstacles in sets of distinct task scenarios and in comparison to a non-adaptive baseline. Our neuromorphic approach facilitated reliable avoidance of imminent collisions in simulated and real-world experiments, where the baseline consistently failed. Trajectory adaptations had low impacts on safety and predictability criteria. Among the notable SNN properties were the correlation of computations with the magnitude of perceived motions and a robustness to different event emulation methods. Tests with a DAVIS346 EC showed similar performance, validating our experimental event emulation. Our results motivate incorporating SNN learning, utilizing neuromorphic processors, and further exploring the potential of neuromorphic methods.

Keywords

Event-based vision, spiking neural networks, obstacle avoidance, manipulation, neuromorphic

1. Introduction

Modern autonomous systems excel at specific tasks, but lack capabilities that the average human exemplifies, including rapidly learning and accomplishing a variety of unstructured tasks. A potential reason is the fundamental differences between human and artificial intelligence (AI) due to their respective physical substrates (biological vs silicon-based) (Korteling et al., 2021). AI systems can have superior information propagation speeds, communication bandwidths, and raw computational power. Though this may imply a greater capacity for multi-sensory processing, robotic agents embodying AI still struggle with everyday tasks that our brains facilitate with relative ease. This poses the question of whether the information propagation mechanisms and computational architectures present in our brains could be key factors in the advancement of reliable autonomy.

The human brain maintains multiple, complex cognitive processes while being more energy-efficient than contemporary computers. It simultaneously regulates essential processes and controls numerous high-level functions while consuming ∼20 W of power (Drubach, 2000). Comparable and less-capable computers require power inputs that are many orders of magnitude higher. This is evident when simulating cortical neural networks on conventional computers; at the scale of a mouse, such a simulation could consume 40,000 times more power and run 9000 times slower, and projections from the Human Brain Project underscore the colossal power requirement for simulating a human brain (Thakur et al., 2018). Such neuroscientific studies indicate the vastly superior performance-to-efficiency ratio of the brain and motivate the consideration of biologically-inspired circuitry, sensors, and algorithms in intelligent robot design, which is the aim of this work.

Biological inspiration has frequently driven practical innovations, such as IR detectors and gyroscopes (Wicaksono, 2008), learning paradigms such as evolutionary algorithms and reinforcement learning (RL) (Sutton et al., 1998), and locomotion control using elementary neural circuits (Ijspeert, 2008). Artificial neural networks (ANNs) are based on their biological counterparts: the earliest models consist of interconnected layers of simple computational units with adjustable synaptic connections, while later convolutional neural networks (CNNs) aim to mimic connectivity patterns observed in animal visual cortices.

CNNs have been largely successful in some visual processing tasks (see Goodfellow et al., 2016) and are commonly deployed on robots. Nevertheless, these models are crude approximations at best. Biological neurons asynchronously aggregate inputs over time and propagate discrete, sparse spikes (action potentials), whose precise timings are thought to encode useful information. Conversely, ANN neurons synchronously propagate real-valued signals, abandoning an additional temporal dimension afforded by relative spike timings. These discrepancies have become relevant following observations that deep neural networks (DNNs), despite their notable successes, still suffer from ever-growing numbers of parameters, correspondingly sizable data requirements, poor generalization to unobserved yet similar inputs, and catastrophic failures in response to minor perturbations (Serre, 2019). Our brains seem better-equipped to handle an extraordinary range of problems while exhibiting greater generalization and robustness than present AI models.

1.1. Neuromorphic computing

Neuromorphic engineering/computing reproduces characteristics of the brain in hardware. The field was established in the 1980s by Carver Mead, who suggested analogies between neuronal dynamics and the physics of sub-threshold regions of transistor operation (Mead, 1990). This has given rise to various models of spiking neural networks, neuromorphic processors, and neuromorphic sensors. This research is motivated by the pursuit of brain-like computation to improve efficiency, parallelization, and energy consumption (Thakur et al., 2018) through a fundamental paradigm shift, which could benefit various applications including autonomous vehicles, wearable devices, and IoT¹ sensors (Rajendran et al., 2019). Naturally, it holds promise for robotic systems as well.

Various studies provide empirical evidence of the advantages of neuromorphic computing: 10-fold and 1000-fold increases in speed and energy efficiency, respectively, when learning on a neuro-processor compared to a conventional processor (Wunderlich et al., 2019), a four-fold increase in the energy efficiency of a spiking network versus a DNN for speech recognition (Blouw and Eliasmith, 2020), better energy-per-classification ratios in deep learning problems when compared to a Tesla P100 GPU (Göltz et al., 2021), and speed and efficiency gains in various classical problems (Davies et al., 2021). These results have coincided with, or perhaps fostered, an ongoing interest in the field, typified by the recent establishment of the “Neuromorphic Computing and Engineering” journal (Indiveri, 2021).

1.2. Event cameras

Event cameras (ECs) are neuromorphic sensors that are modelled after biological retinas. ECs exclusively record per-pixel events at which the change in intensity crosses a threshold, mimicking retinal photoreceptor cells (Posch et al., 2014). Consequently, they only capture significant intensity changes, often due to motion (Figure 1), in contrast to frame-based cameras, whose pixels synchronously and continuously transmit absolute values, much of which is often redundant information.

Figure 1.

Images captured with a DAVIS346 EC. 1a shows an RGB image of a fast-moving object and 1b visualizes the captured events. 1c and 1d show events superimposed on images captured while moving the camera.

ECs offer lower power consumption, lower transmission latencies, higher dynamic ranges, and more robustness to motion blur than traditional cameras (Gallego et al., 2022; Chen et al., 2020a), which have been experimentally validated (Sun et al., 2021). These properties can address limitations of frame-based vision in power consumption and bandwidth due to transmitting larger amounts of data (Dubeau et al., 2020). External clock-driven data acquisition naturally leads to redundant image information and the potential loss of inter-frame information (Risi et al., 2020). Instead, ECs selectively acquire data based on scene dynamics. ECs have most often been deployed in applications requiring rapid reaction speeds, such as drone flight, due to characteristically high temporal resolutions.

Event-based data necessitates correspondingly novel methods and algorithms; the artificial analogs of visual cortical cells, spiking neurons, are prime candidates.

1.3. Spiking neural networks

Spiking neural networks (SNNs) are a neuromorphic alternative to ANNs in which computational units propagate sparse sequences of spikes (Figure 2). Input spikes contribute to a neuron’s decaying internal aggregate of past inputs: its membrane potential. When a threshold is exceeded, the neuron emits a spike (or action potential) which resets its potential. Spiking neurons operate asynchronously, such that information flows through the network via trains of distinctly-timed spikes. These neuronal dynamics match those observed in biology (particularly in the primary visual cortex; Chen et al., 2020a), but raise challenges concerning the encoding/decoding and processing of spiking data as well as learning methods, which remain open areas of research.

Figure 2.

Neuronal dynamics in conventional ANNs (left) and SNNs (right). Gray/black borders signify inactivity/activity; only the bottom SNN neuron is active here, since it had just spiked.

Similar to the event/frame-based camera dichotomy, SNNs have stimulated interest due to their potential advantages over ANNs. SNNs can be as, or potentially more, expressive than ANNs (Maass, 1997), and successful applications in common visual tasks have shown that they can consume less power and exhibit faster classification inferences (Neil et al., 2016), as well as outperform ANNs in energy-delay-product² (Davies et al., 2021). This energy-efficiency is due to neurons emitting outputs only when significantly stimulated, as opposed to constant computations, though this advantage is fully realized with dedicated neuromorphic hardware. This is also reflected in response latencies; in classification problems, SNNs could produce a correct inference before a frame-based approach could fully process all input pixels (Neil et al., 2016).

Although often associated with embodied agents, SNNs have been applied in other areas of research, such as for detecting Covid-19 from CT scans (Garain et al., 2021).

1.4. Contribution

In this paper, we investigate the viability and utility of neurologically-inspired sensors and algorithms in the context of robotics by implementing and experimentally evaluating a neuromorphic approach to a common problem. We specifically address real-time, online obstacle avoidance on a camera-equipped manipulator by designing a neuromorphic pipeline that enables end-to-end processing of visual data into motion trajectory adaptations, and which relies on event-based vision and an SNN.

In our pipeline, an EC emulator transforms RGB data into event data, which is in turn processed by a convolutional SNN. The SNN output is then decoded into avoidance velocities by a dedicated component that employs a potential fields (PF) method. The result is used to adapt a pre-planned end-effector trajectory using a dynamic motion primitive (DMP) formulation in a motion control component. These components, collectively referred to as an SNN-based obstacle avoidance module, operate in a closed-loop, online procedure for transforming a trajectory into an adapted version that avoids obstacles through SNN feedback.

ECs excel at capturing rapid motions, while SNNs are best-suited to process event data, thus motivating their utilization for deriving corrective avoidance motions. A significant amount of research on obstacle avoidance deals with UAVs and mobile robots, while relatively few works address manipulation and take a similar neuromorphic approach. This work is unique in addressing manipulator obstacle avoidance by utilizing event data from an onboard camera, SNN processing, and an adaptive trajectory representation. Most solutions rely on external cameras and classical methods for filtering images, removing backgrounds, object segmentation, etc. (refer to Section 3.4 for examples of classical approaches). Our approach may show that event data and SNN processing could eliminate the necessity of manual operations that preclude generalization over environments, lighting conditions, and platforms.

Due to the scarcity of comparable works, proposed evaluation methodologies, and performance metrics, we formalize a set of quantitative metrics and qualitative criteria with which we evaluate our implementation. We conduct experiments first in simulations and then on a real Kinova Gen3 arm. These experiments constitute repeated executions in a range of obstacle scenarios drawn from a task distribution. During simulation tests, we use different task scenarios to tune, validate, and finally test candidate sets of parameter values. The results demonstrate consistent success in avoiding obstacles, which a non-adaptive baseline is incapable of. We also analyze statistical performance across many trials, showing that the adapted trajectories do not drastically increase execution times, trajectory lengths, and velocities. Moreover, we assess how reliable, predictable, and safe the trajectories are in our qualitative evaluation, and find at least moderately positive results in each.

In further analyses, we additionally explore certain properties of the neuromorphic elements. We implement and compare different event emulation strategies, based on their event outputs, SNN responses, and ultimate task performances; this yields the insight that the SNN exhibits robustness to variations in event data. To validate its utility, we test the exclusion of the SNN and the derivation of obstacle avoidance accelerations directly from raw events, which we show adversely affects the success of obstacle avoidance. We also investigate the effect of varying SNN weights, showing a slight variance in performance. Finally, we show the successful integration of a real EC: a DAVIS346, and present results of preliminary experiments. Most notably, we find that the resultant performance is fairly similar, validating our experimental emulation method and the compatibility of the pipeline with a real camera.

The paper is structured as follows. Section 2 provides additional background information on event-based vision, SNNs, and DMPs. Section 3 contains a review of related work on the relevant areas of research. In Section 4, we present our proposed approach, elaborating on the design principles and the pipeline components. Section 5 is dedicated to a detailed description of our evaluation methodology, including evaluation tasks, metrics and criteria, and experiment design and procedures for both the simulation and real experiments. Section 6 discusses the results of these experiments. Section 7 presents the analyses concerning the event emulation and SNN components, as well as the real EC tests. We conclude in Section 8.

2. Preliminaries

2.1. Event-based vision

Silicon retinas were developed to imitate the neural architectures of biological retinas in analog VLSI circuits, giving rise to neuromorphic computing (Mahowald, 1994). Contemporary models are known as neuromorphic retinas, dynamic vision sensors (DVS), or event cameras (ECs). EC pixels mimic retinal ganglion cells by asynchronously emitting a binary signal, i.e. an event, only when the incident light intensity significantly changes. Using an address event representation (AER), the pixel arrays provide streams of spatio-temporally-registered events which encode typically interesting information, such as motion.

An event can be represented as a tuple of pixel position, emission timestamp, and polarity: e_k = (x_k, t_k, p_k). Pixels are independently monitored to compute a measure of the difference in successive intensities, the simplest example of which is the difference in raw intensities:

Δ L (x_{k}, t_{k}) = L (x_{k}, t_{k}) - L (x_{k}, t_{k - 1})

(1)

When this difference crosses threshold θ, event e_k is emitted and designated an ON (+1) or OFF (−1) event:

p_{k} = \begin{cases} + 1, & \text{if } Δ L (x_{k}, t_{k}) > θ \\ - 1, & \text{if } Δ L (x_{k}, t_{k}) < - θ \end{cases}

(2)

An EC transmits a stream or vector of such e_k tuples.

It follows that event data can be emulated from conventional camera data, a method often used due to the scarce availability of or relative expense of acquiring ECs (García et al., 2016; Hu et al., 2021; Rebecq et al., 2018).
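As an illustration of equations (1) and (2), the following is a minimal sketch of how events could be emulated from two successive grayscale frames. It is not the emulator developed in this work (see Section 4.2); the function name `emulate_events`, the default threshold value, and the choice of raw intensity differences on normalized frames are assumptions made for the example.

```python
import numpy as np


def emulate_events(prev_frame, curr_frame, t, theta=0.1):
    """Emulate events between two grayscale frames (hypothetical helper).

    Returns a list of (x, y, t, p) tuples, one per pixel whose intensity
    difference crosses the threshold theta, following Eqs. (1) and (2).
    """
    prev = np.asarray(prev_frame, dtype=np.float64)
    curr = np.asarray(curr_frame, dtype=np.float64)

    # Eq. (1): per-pixel difference in successive raw intensities.
    delta = curr - prev

    # Eq. (2): ON (+1) above theta, OFF (-1) below -theta, no event otherwise.
    polarity = np.zeros(delta.shape, dtype=np.int8)
    polarity[delta > theta] = 1
    polarity[delta < -theta] = -1

    # Collect events in row-major order; x is the column, y is the row.
    rows, cols = np.nonzero(polarity)
    return [(int(c), int(r), t, int(polarity[r, c])) for r, c in zip(rows, cols)]


if __name__ == "__main__":
    previous = np.full((3, 3), 0.5)
    current = previous.copy()
    current[0, 1] = 1.0  # pixel gets brighter: expect an ON event
    current[2, 0] = 0.0  # pixel gets darker: expect an OFF event
    for event in emulate_events(previous, current, t=0.01):
        print(event)
```

Unlike a real EC, this sketch assigns all events the timestamp of the current frame, so its temporal resolution is bounded by the frame rate of the source camera.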

2.2. Spiking neural networks

Spiking neural networks (SNNs) are more biologically-plausible models in which neurons communicate via asynchronous spikes to replicate the complex neuronal dynamics observed in the brain. Figure 3 illustrates these spiking dynamics.

Figure 3.

An illustration of spiking neuron dynamics.

Neurons propagate sparse, binary spike trains across their synapses. Each pre-synaptic spike contributes to the post-synaptic potential (PSP) or membrane potential, v, of a post-synaptic neuron: an internal analog representation of neuronal activation that decays over time, and an approximation of ionic concentrations across a nerve cell’s membrane. A neuron whose potential exceeds v_thresh emits a spike and enters a short refractory period in which it is inhibited from spiking (not depicted). Following a spike, the potential is reset to a baseline value, v_reset. Individual neurons therefore fire or are active only in response to a significant aggregate of recent inputs. Information can be contained in average spiking rates and/or the relative timings of spikes, a concept drawn from neuroscientific evidence (Fairhall et al., 2001). The time-varying potential signal inherent to every neuron and the independent spiking latencies create an additional temporal dimension in SNNs.

Various mathematical approximations and abstractions of neuronal dynamics have been employed in SNN research to produce a variety of neuron models, including the Hodgkin-and-Huxley model (Hodgkin and Huxley, 1952), Spike Response Model (SRM) (Gerstner, 1995), Izhikevich model (Izhikevich, 2003), and various probabilistic models (Jang et al., 2019). The most commonly used model is the leaky integrate-and-fire (LIF) model (Rajendran et al., 2019; Lee et al., 2020; Dupeyroux et al., 2021).

A LIF neuron represents a leaky integrator modeled as an RC circuit with an equation describing membrane voltage:

τ_{v} \frac{d v}{d t} = - (v - v_{rest}) + I (t)

(3)

where v_rest is a resting potential, τ_v is a potential decay constant, and I(t) is the sum of input currents. I(t) is a sum of input spikes, S(t), arriving from pre-synaptic neurons indexed by i, multiplied by synaptic weights, w:

I (t) = \sum_{i = 1}^{n_{l}} w_{i} S_{i} (t)

(4)

where S_i(t) is 1 if a spike occurs at time t and 0 otherwise. If v exceeds v_thresh, a spike is emitted and v is reset to v_reset:

v (t) \leftarrow v_{reset}, if v (t) > v_{thresh}

(5)

The neuron is prevented from firing again for a refractory period, T_refrac. The LIF neuron thus maintains a decaying memory of past inputs and implements fundamental spiking and refractoriness properties. The model is popular for its computational simplicity. In our implementation, we use a variant that was presented in Diehl and Cook (2015).
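The dynamics in equations (3) to (5) can be sketched with a simple forward-Euler simulation of a single neuron. This is an illustrative approximation only, not the Diehl and Cook (2015) variant used in our implementation; the function names, parameter values, time step, and the choice to hold the potential at v_reset during the refractory period are assumptions made for the example.

```python
import numpy as np


def input_current(spike_trains, weights):
    """Eq. (4): weighted sum of binary pre-synaptic spikes at each time step.

    spike_trains has shape (T, n) with entries in {0, 1}; weights has shape (n,).
    """
    return np.asarray(spike_trains, dtype=np.float64) @ np.asarray(weights, dtype=np.float64)


def simulate_lif(current, dt=1.0, tau_v=20.0, v_rest=0.0,
                 v_reset=0.0, v_thresh=1.0, t_refrac=2.0):
    """Forward-Euler simulation of one LIF neuron (illustrative parameters).

    Returns the membrane potential and the binary output spike train.
    """
    v = v_rest
    refrac_left = 0.0
    potentials, spikes = [], []
    for i_t in current:
        spiked = False
        if refrac_left > 0.0:
            # Refractory period: inputs are ignored and v is held at v_reset
            # (an assumption of this sketch).
            refrac_left -= dt
            v = v_reset
        else:
            # Eq. (3): tau_v * dv/dt = -(v - v_rest) + I(t)
            v += dt / tau_v * (-(v - v_rest) + i_t)
            if v > v_thresh:
                # Eq. (5): emit a spike and reset the potential.
                spiked = True
                v = v_reset
                refrac_left = t_refrac
        potentials.append(v)
        spikes.append(int(spiked))
    return np.array(potentials), np.array(spikes)


if __name__ == "__main__":
    # Three pre-synaptic neurons that spike at every step (hypothetical input).
    pre_spikes = np.ones((200, 3))
    weights = np.array([1.0, 0.6, 0.4])
    v, s = simulate_lif(input_current(pre_spikes, weights))
    print("output spikes:", int(s.sum()))
```

With a constant input whose steady-state potential lies below v_thresh, the neuron in this sketch never fires, which reflects the property that neurons are active only in response to a significant aggregate of recent inputs.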

2.3. Dynamic motion primitives (DMP)

Dynamic motion primitives capture and reproduce discrete or rhythmic motions using a set of differential equations that produce stable global attractor dynamics (Ijspeert et al., 2013). For discrete motions, DMPs model the evolution of a position variable, y, from an initial to a goal position, g, through equations describing a linear spring-damper system augmented with an additional non-linear term:

τ \ddot{y} = α_{y} (β_{y} (g - y) - \dot{y}) + f (s)

(6)

τ \dot{s} = - α_{s} s

(7)

where α_y and β_y control spring and damping characteristics and τ is a time scaling constant. f(s) represents a forcing term which describes trajectory shape through a superposition of basis functions and depends on a phase variable, s:

f (s) = \frac{\sum_{i} w_{i} ψ_{i} (s) s}{\sum_{i} ψ_{i} (s)}

(8)

where ψ_i are basis functions (usually of a Gaussian kernel) and w_i are learnable weights that capture the desired trajectory shape. A demonstrated trajectory can be learned by sampling y, \dot{y}, and \ddot{y}, then solving for a least-squares solution w_i using linear optimization. Equation (7) describes the evolution of s from 1 to 0 (the start to the end of the motion). Function f depends on s instead of a time variable, which enables scaling the trajectory in time by modifying τ.
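To make equations (6) to (8) concrete, the following is a minimal sketch of a one-dimensional discrete DMP rollout using Euler integration. It is not the motion control component of our pipeline; the function names, the gains (with β_y = α_y/4, which makes the unforced system critically damped for τ = 1), the basis function placement, and the integration step are assumptions chosen for illustration, and weight learning from demonstrations is omitted.

```python
import numpy as np


def forcing_term(s, weights, centers, widths):
    """Eq. (8): normalized, phase-scaled superposition of Gaussian basis functions."""
    psi = np.exp(-widths * (s - centers) ** 2)
    return float(np.dot(weights, psi) * s / (np.sum(psi) + 1e-10))


def rollout_dmp(y0, g, weights, centers, widths, tau=1.0, dt=0.001,
                alpha_y=25.0, beta_y=6.25, alpha_s=4.0, duration=1.0):
    """Integrate Eqs. (6) and (7) from y0 towards the goal g.

    Returns the position trajectory and the final value of the phase s.
    """
    y, yd, s = float(y0), 0.0, 1.0
    trajectory = [y]
    for _ in range(int(round(duration / dt))):
        f = forcing_term(s, weights, centers, widths)
        # Eq. (6): tau * ydd = alpha_y * (beta_y * (g - y) - yd) + f(s)
        ydd = (alpha_y * (beta_y * (g - y) - yd) + f) / tau
        yd += ydd * dt
        y += yd * dt
        # Eq. (7): tau * sd = -alpha_s * s, so s decays from 1 towards 0.
        s += (-alpha_s * s / tau) * dt
        trajectory.append(y)
    return np.array(trajectory), s


if __name__ == "__main__":
    centers = np.linspace(1.0, 0.0, 10)   # basis centers along the phase
    widths = np.full(10, 50.0)            # shared basis width (assumed)
    plain, _ = rollout_dmp(0.0, 1.0, np.zeros(10), centers, widths, duration=3.0)
    shaped, _ = rollout_dmp(0.0, 1.0, np.full(10, 50.0), centers, widths, duration=3.0)
    print("final positions:", plain[-1], shaped[-1])
```

Because f(s) is multiplied by s, the forcing term vanishes as the phase decays, so both the unforced and the shaped rollouts in this sketch settle at the goal; the weights only alter the path taken to reach it.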

3. Related work

3.1. Event-based vision

Event cameras (ECs) are inspired by biological retinas’ superior efficiency compared to conventional, frame-based cameras. The silicon retina (Mahowald, 1994) paved the way for modern ECs including the DVS (Lichtsteiner et al., 2008), ATIS (Posch et al., 2010), and DAVIS (Brandli et al., 2014)³. Refer to Gallego et al. (2022) for an extensive survey of event-based vision.

Recent publications highlight applications of event-based vision in various domains. A survey of bio-inspired sensing for autonomous driving, presented in Chen et al. (2020a), provides a review of EC signal processing techniques and successful implementations for segmentation, recognition, optical flow (OF) estimation, image reconstruction, visual odometry, and drowsiness detection. In a more resource-constrained scenario, event-based vision has been employed in a gesture recognition smartphone application in Maro et al. (2020). In Dubeau et al. (2020), a combination of RGB-D and EC data for DNN-based 6-DOF object tracking is shown to outperform an RGB-D-only DNN.

In the context of robotics, ECs have most often been utilized on aerial robots, such as for high-speed obstacle avoidance (Falanga et al., 2020), powerline tracking (Dietsche et al., 2021), and fault-tolerant control (Sun et al., 2021). UAVs particularly benefit from the advantages of ECs on account of frequent high-speed motions and motion blur, energy consumption constraints, and rapid changes in illumination. We focus here on applications that do not involve flight, where ECs may similarly provide opportunities for improving robot capabilities.

Bečanović et al. used precursory optical analog VLSI (aVLSI) sensors in soccer robots, harnessing the faster reactivity properties to improve ball control and goalkeeping (Bečanović et al., 2002a, 2002b). Simultaneous localization and mapping (SLAM) has been addressed with stereo visual odometry using DAVIS346 ECs in Zhou et al. (2021b), facilitating scene mapping and ego-motion estimation and matching the performance of mature frame-based approaches on benchmark datasets, while demonstrating robustness in difficult lighting conditions. In Arakawa and Shiba (2020), EC data was used for visual RL of tracking and avoidance policies. The authors train in simulations using emulated data and deploy the model on a robot equipped with a DAVIS240. In Chen et al. (2019), event data was used to train DNNs for Atari gameplay and action recognition, outperforming RGB image-based networks. While these works adapt DNNs to process event frames, our approach utilizes SNNs, which are naturally suited to processing the spike-like event data.

For this work, we implemented a software component that emulates event data from a stream of conventional images in a live camera feed or a ROS topic (see Section 4.2). pydvs is a Python-based DVS emulator that can similarly convert image intensity differences to rate- or time-encoded spikes/events (García et al., 2016), but does not natively support ROS. The prominent ESIM simulator provides a framework for simulating 3D scenes (in OpenGL and the Unreal rendering engines) and user-defined camera motions as well as event outputs (Rebecq et al., 2018). Microsoft’s AirSim offers an EC emulation component which is similarly coupled to its rendering engine. Naturally, such a coupling limits applicability to the associated simulation. The v2e toolbox (Hu et al., 2021) was designed to address assumptions of the ESIM simulator that deviate from real cameras, but is limited to video files. In Joubert et al. (2021), the ICNS emulator for video files and Blender scenes was qualitatively compared to ESIM, v2e, and a real DVS. Despite some limitations (coupling to rendering engines and the absence of out-of-the-box support for live feeds and ROS data), all reviewed emulators are publicly available and provide useful tools for research and development in event-based vision.

3.2. Spiking neural networks

Research into SNNs is motivated by attempts to approach the computational capability coupled with energy efficiency of the brain (as expressed by most works reviewed in this section), compared to the more specialized yet drastically less efficient computer systems of today (Roy et al., 2019).

The so-called third generation of neural networks can be potentially more expressive than first and second-generation NNs in addition to requiring significantly fewer neurons to represent some functions (Maass, 1997). The authors of Neil et al. (2016) converted ANNs pre-trained for MNIST digit recognition into SNNs and analyzed the comparative performance. The SNNs achieved comparable accuracy using significantly fewer computational operations (42-58% fewer) and in less time. These and similar results demonstrate that SNNs could be as expressive/accurate as ANNs, while consuming less power and exhibiting faster inference.

The prevalence of deep learning with ANNs has ignited research into the same in the spiking domain, not least because of prospective improvements in energy efficiency. Pfeiffer and Pfeil (2018) and Tavanaei et al. (2019) provide extensive overviews of SNNs and focus on methods for training deep SNNs, reviewing spiking analogs of CNNs, RNNs, LSTMs⁴, and echo state networks. The authors of Bouvier et al. (2019) reviewed hardware implementations that leverage SNN characteristics and associated challenges. Jang et al. (2019) presents a probabilistic view of SNNs, the main advantage of which is to facilitate gradient-based learning and other well-known statistical methods.

SNN implementations have been demonstrated for solving common AI problems, particularly involving vision. Diehl and Cook trained two-layer SNNs with STDP⁵ for MNIST digit recognition and achieved state-of-the-art classification accuracy among unsupervised methods (95%) (Diehl and Cook, 2015). In Mirsadeghi et al. (2021), a supervised learning algorithm, STiDi-BP, is shown to achieve an accuracy of 99.2% on MNIST. Spike-YOLO was created by converting a pre-trained Tiny-YOLO model to an SNN, achieving comparable results on the PASCAL and COCO datasets, while being 2000 times more energy efficient (Kim et al., 2020). Similarly, the supervised STBP-tdBN algorithm achieved state-of-the-art results on the CIFAR and ImageNet datasets and was among the few to successfully train relatively deep SNNs (of 50+ layers) (Zheng et al., 2020). Zhou et al. presented object recognition in datasets of DVS (N-MNIST, DVS-CIFAR10, etc.) and LIDAR (KITTI) data, demonstrating the applicability of SNNs to different data modalities (Zhou et al., 2021a).

Various works have successfully applied SNNs in robotics. An extensive survey of SNNs for robot control is presented in Bing et al. (2018b), underscoring significant potential for improving speeds, energy efficiency, and computational capabilities. In Bing et al. (2018a), SNNs were trained using R-STDP on DVS event data for lane-keeping on a mobile robot, outperforming a conventional Braitenberg controller. Zahra et al. utilized shallow SNNs to learn a differential sensorimotor mapping for a UR3 robot that supports reliable Cartesian control (Zahra et al., 2021). In a fully-embedded application of SNNs (on an Intel Loihi neuro-processor), Dupeyroux et al. designed a neuromorphic vertical thrust controller for landing a quadrotor (Dupeyroux et al., 2021). The input is a spike representation of OF divergence that is estimated using data from an onboard CMOS camera. The SNN was trained (with an evolutionary algorithm) and evaluated on a neural simulator, PySNN, achieving consistent landing behaviour. A limitation on the generality of this approach is the controller’s dependence on a pre-set visual pattern for OF estimation. In the present work, we similarly utilize a neural simulation tool.

An event-driven, SNN-based PD thrust controller was designed to achieve high-speed orientation adjustment on a dual copter in Vitale et al. (2021). The authors demonstrated superior control speeds and reductions in latencies, particularly when running on a neuro-processor (vs a CPU). In Risi et al. (2020), reliable stereo matching of event data from two DAVIS sensors was achieved using an SNN architecture designed with neuronal populations implementing coincidence and disparity detectors, and running on a DYNAP neuro-processor. This approach was particularly favoured for the temporal dimension of the asynchronous event and SNN spike data, which enables exploiting temporal coincidences and thus improves stereo matching. Similar to our approach, no learning is involved; the SNN architecture’s inherent properties are shown to be beneficial in realizing the desired behaviour.

Most publications in robotics address relatively constrained navigation and flight tasks (e.g. lane-keeping and 1D thrust control). Our work demonstrates a less common application in obstacle avoidance for manipulation.

3.3. Neuromorphic computing

Neuromorphic computing/engineering mimics the fundamental neural architectures and dynamics of the brain in silico, aiming to replicate its superior energy efficiency, compute, and robust learning capabilities in computer architectures and engineered systems. Common approaches incorporate asynchronous event-driven communication, spike-based neural processing, analog neuronal dynamics, and local synaptic adjustments. This research serves a dual purpose: enhancing AI systems with lessons from neuroscientific research, and advancing our understanding of the brain by experimenting with neurologically-inspired platforms. Among the most prominent results are ECs and neuro-processors designed to run SNN architectures. A review of neuromorphic hardware and applications can be found in Rajendran et al. (2019), which highlights the role of neuro-processors in realizing the full potential of SNNs for exploiting event-based sensing, learning and inference.

Neuro-processors model membrane potential evolution using voltages across capacitors or transistor sub-/supra-threshold dynamics, and transfer spikes via an AER. Furber (2016) provides a survey of pioneering neuro-processors, namely IBM’s TrueNorth, Stanford’s Neurogrid, BrainScaleS and SpiNNaker, and compares their performances. Other notable surveys provide similar statistical comparisons of these processors, in addition to the more recent Intel Loihi, DYNAP, PARCA, Braindrop, ODIN, and Deepsouth (Bouvier et al., 2019; Thakur et al., 2018; Rajendran et al., 2019). The Loihi and its applications have been extensively discussed in Davies et al. (2021). The variety of domains in which the Loihi has been successfully applied demonstrates the general applicability of SNNs, while the quantified gains in energy efficiency validate the benefits of the neuro-processor. In addition, the authors find that conventional DNNs exhibit little to no benefit when run on the Loihi, but SNNs achieve orders of magnitude lower energy consumption and latency in some applications.

Several publications investigated the effects of neuro-processors on speed and energy efficiency. SNNs were trained with R-STDP on a BrainScaleS2 processor to control an agent playing the game of Pong, achieving improvements of one and three orders of magnitude in speed and energy efficiency, respectively, when compared to a CPU simulation (Wunderlich et al., 2019). The authors of Ceolini et al. (2020) achieved hand-gesture recognition with a neuromorphic sensor fusion approach, where DVS streams and EMG signals (converted to spikes/events) were used to train SNNs running on a Loihi or ODIN. With respect to a GPU-based implementation, this was at least 30 times more energy-efficient, though inference was 20% slower. Taunyazov et al. (2020) presented a visual-tactile SNN (VT-SNN) which fuses data from an EC and a spike-based tactile sensor to accomplish robot manipulation tasks requiring object classification and slip detection. The SNN classifiers running on a Loihi performed similarly to state-of-the-art DNNs run on GPUs while consuming 1900 times less power and exhibiting lower latency. In Göltz et al. (2021), SNNs were trained with backpropagation for MNIST classification on a BrainScaleS, and their performance was compared to that of a conventional CNN running on an NVIDIA Tesla P100 GPU. The authors show that the neuromorphic implementation is approximately 100 times more energy-efficient, at the cost of a slight drop in accuracy and in the number of classifications per second (since the GPU implementation utilizes parallelization, while individual images must be processed sequentially on the SNN).

The related field of neurorobotics studies the design of computational structures that are inspired by the human and animal nervous systems in robots (Van Der Smagt et al., 2016) and has led to various interesting applications (Dumesnil et al., 2016; Lobov et al., 2020; Falotico et al., 2017). A review in Chen et al. (2020b) demonstrated the utility of neurorobotics for explaining how neural activity gives rise to intelligence, as a form of computational neuroethology. Robotics could thus similarly benefit from ongoing research and development efforts aimed at drawing inspiration from the brain.

In the present work, we do not run SNNs on neuromorphic hardware, instead aiming to investigate the utility of an event-based SNN approach on conventional hardware. Nevertheless, the reviewed research motivates exploiting the potential gains in energy efficiency and latency in a subsequent study. On a broader scale, this may additionally inspire the pursuit of neuroethology-based robot designs that advance further into the realm of biological realism.

3.4. Obstacle/collision avoidance

Obstacle avoidance is a critical feature for planning robot motions in evolving, dynamic environments, where obstacles may appear during task execution and invalidate motion plans. We discuss seminal works in this domain, focussing on relevant approaches and especially those that address manipulation problems and employ camera sensors.

Research on the rudimentary issue of reactively computing collision-free paths has led to established methods such as vector field histograms (VFH) (Borenstein and Koren, 1991), the Dynamic Window Approach (DWA) (Fox et al., 1997), and the elastic strips framework (Brock and Khatib, 2002). The artificial potential fields (PF) method represents task criteria in the form of attractive and repulsive forces acting on an agent moving within a virtual force field (Khatib, 1986). PF techniques have been extensively applied and improved upon, and are utilized in the present work. A helpful review of classical methods is provided in Minguez et al. (2016).

Optical flow (OF) estimation, a bio-inspired computer vision technique that is applicable to obstacle avoidance, shares similarities with the methods proposed here. OF quantifies the motion of light intensity patterns observed on a sensor as it moves relative to observable objects, and is used by organisms, such as honeybees, for navigation (Van Der Smagt et al., 2016). This can provide estimates of ego- or object motion, which facilitate tracking and collision avoidance. In Schaub et al. (2016), OF is computed from an autonomous car’s monocular camera data and used to optimize maneuvers to avoid obstacles. OF estimation approaches are split into two categories: in estimating object motion, dense OF tracks changes at every pixel between consecutive frames, whereas sparse OF relies on tracking identified features. The former imposes high computational and memory costs, while the latter depends on the reliability of feature matching algorithms and may not generalize if object models have to be specified a priori (Lee et al., 2021). In comparison, our approach is designed to eliminate unnecessary computations, by virtue of the event-based processing, and to be independent of specific obstacle features.

Compared to 3D sensors, IMUs, and laser sensors, cameras are less frequently used for obstacle avoidance. Lee et al. (2021) demonstrated DNN-based obstacle recognition and avoidance on a UAV navigating a plantation, where trees are recognized, distances are estimated, and free regions are determined for simple heading adjustments. Limitations include a restriction to obstacles that the DNN is trained to recognize and the entailed computational costs. In Hua et al. (2019), semantic segmentation DNNs recognize roads and obstacles that a mobile robot encounters, which are incorporated in PF-based local path planning. More rudimentary approaches involve classical methods such as detecting contours for obstacle detection (Martins et al., 2018) or using feature extractors like SURF to recognize known obstacles (Aguilar et al., 2017), followed by searching for free regions and applying corrective motions. These approaches may be susceptible to changes in lighting conditions, where ECs are expected to perform better. In addition, purely reactive approaches require rectifying velocity commands that return the agent to its original path. In our work, we propose using DMPs for adaptive path plans that obviate the need for extra corrective computations.

For robot manipulation, obstacle detection and avoidance could be crucial in safety-critical human-robot collaboration settings. The authors of Chiriatti et al. (2021) designed a control law for a UR5 manipulator that incorporates collision cylinders instantiated from estimates of obstacle geometries, positions and velocities, demonstrating constrained avoidance behaviours. The method was tested in simulation with prior obstacle information; extending this to real environments requires dedicated sensors and algorithms to estimate the pose of every person/object. In Safeea et al. (2019), collision bounds on a person are incorporated in a PF approach to controlling a KUKA LBR iiwa. These are obtained using IMUs placed on persons in an industrial workspace. Another approach relies on proximity sensors placed on the manipulator, which provide time-of-flight, IMU, and gyroscope readings (Escobedo et al., 2021). While reliable behaviours are achievable by deploying arrays of sensors, this may limit generalization to different scenarios, especially when a robot is not confined to a controlled workspace or when arbitrary sensor placement is not possible. In contrast, our approach relies on a single onboard camera.

Other notable implementations integrate vision-based sensing, leveraging RGB-D cameras in particular. Mronga et al. used pointcloud data to extract convex hulls of obstacles that are incorporated as constraints in an optimization problem whose solution leads to avoidance motions on a dual-arm system (Mronga et al., 2020). The optimizer leads to task-compliant avoidance, but depends on cameras covering the workspace and several pointcloud processing steps. A similar strategy is applied in Song et al. (2019), where obstacles in a bin-picking task are avoided by detecting moving objects in a pointcloud and accordingly adjusting motions through a PF algorithm. These implementations benefit from depth perception that monocular RGB or event cameras do not provide. However, event-based processing could impose a significantly lower computational overhead than pointcloud processing, which may be a concern in applications that require rapid reactivity.

A few publications are particularly relevant for their similar approaches and thus merit mentions in the remainder of this section. In Park et al. (2008), a DMP obstacle avoidance term was first introduced and used within a PF formulation. We similarly utilize DMPs and draw insights from the authors’ mathematical integration of obstacle avoidance information for the method presented in Section 4. Scoccia et al. presented offline planning of trajectories along with online adjustments using a formulation of PFs that operates on the Jacobian matrices for null space control (Scoccia et al., 2021).

Another implementation of manipulator obstacle avoidance combined PFs and elastic bands for adaptive trajectory planning (Tulbure and Khatib, 2020). Here, an RGB-D camera capturing the workspace provided pointcloud data which is processed to estimate obstacle positions that affect the PF. The authors augment a PF algorithm with an elastic bands planner, which enables adjusting a global plan with minimum deviations, thus addressing the susceptibility of PFs to local minima. This resembles our application of PFs for local velocity corrections, while our DMP maintains a high-level plan. Again, pointcloud processing introduces a computational expense and a dependence on the camera position and/or robot platform, thus placing theoretical limits on generality and applicability to different environments.

The utility of event-based vision for obstacle avoidance has been demonstrated in the past. Milde et al. (2015) present event-based collision avoidance on a mobile robot by computing OF from DVS data and deriving velocity commands. Here, the usage of ECs is motivated by the redundancy in data and wastage of computations associated with processing conventional camera images, particularly when the robot is stationary. Furthermore, the authors suggest extending their work with a “neuromorphic circuit” and SNNs to address limitations, including the significant amount of data required by their PCA-based method for computing OF. In Sanket et al. (2020), dynamic avoidance on a quadrotor was achieved by training CNNs on event data to estimate the OF of moving objects, while placing priors on obstacle shape (sphere) for tractability. Our work differs in employing SNNs and in addressing tasks where the robot must avoid obstacles whilst actively moving towards a goal. In a navigation scenario, Yasin et al. utilized a DVS for car obstacle avoidance in low-light settings, demonstrating superior reaction times when compared to standard cameras (Yasin et al., 2020). Objects in the event image are obtained through denoising, corner detection, segmentation, and filtering procedures, and used to recompute plans. While these efforts present viable applications of event data, much of the requisite pre-processing may be obviated by utilizing SNNs, the natural complement to event-based vision.

This usage of SNNs has nevertheless also been shown in recent years. Salvatore et al. (2020) demonstrated neuro-inspired UAV collision avoidance by running event data on an SNN that was converted from a trained deep Q-Learning (DQN) ANN. Successful behaviours were achieved in AirSim simulations after training the DQN agent on emulated event data, transferring weights to an equivalent SNN, and further training the SNN with data from successful trials. Of greatest similarity to our work is the feasibility study of a neuromorphic approach to obstacle avoidance presented in Milde et al. (2017). Their method involves processing event data from a DVS mounted on a mobile robot in SNNs that are implemented on a ROLLS neuro-processor, whose output is decoded into avoidance and target-following behaviours by aggregating responses of neuron populations. As in our approach, SNN connections are non-plastic (i.e. not adjusted through learning); the inherent properties of the SNN architecture are shown to facilitate viable navigation behaviours. The present work proposes a similar approach in the context of manipulation, which presents a different set of challenges.

4. Proposed approach

We present a neuromorphic approach that incorporates event-based vision and SNNs for adaptive motion execution.

In this approach, we use DMPs to generate trajectory plans for reaching tasks. The DMP formulation supports additive acceleration terms, which we utilize to inject obstacle avoidance information and thereby adapt the plan to guide the robot’s motion away from perceived obstacles. This information is obtained by continuously processing visual, event-based data within an SNN, then decoding neural activation maps into avoidance accelerations. The latter procedure involves a potential fields (PF) method for computing the most favourable avoidance direction. By utilizing events induced by relative motion and the spatio-temporal filtering properties of SNNs, we can extract reactive motions that modify high-level motion plans to account for obstacles while maintaining progress towards the task goal, thus achieving real-time, online trajectory adaptation.

A core aspect of our approach is the synergy between global planning and local corrections that enables goal-directed obstacle avoidance. The common alternative, completely re-planning whenever obstacles are encountered, can impose higher computational expenses and latencies (D’Silva and Miikkulainen, 2009; Feng et al., 2020), which are particularly undesirable in dynamic environments. Instead, we utilize the DMP as an adaptive planner, where perceived obstacles are handled by appropriately adjusting the next waypoint during execution and the global attractor dynamics ensure a graceful return to the original path.

The choice of SNNs is justified by their natural compatibility with and direct applicability to event-based vision, unlike conventional computer vision algorithms, such as CNNs (Chen et al., 2020a; Vitale et al., 2021). SNNs are designed to process discrete, asynchronous signals and may be key in achieving compelling real-world applications of ECs. The combination of events and spiking neurons particularly holds potential for capturing temporal information relating to obstacle avoidance, such as through the decaying influence (neural activation) of an obstacle that has just been observed. Furthermore, the analog SNN dynamics can inherently induce temporal filtering properties that negate effects of insignificant events (see Section 4.3).

4.1. Overview

We designed a modular pipeline of specialized components that enable end-to-end processing of visual data into motion trajectory adaptations:

1. EC/EC Emulator: Produces event data. Either an EC or an emulator which transforms RGB data.

2. Convolutional SNN (C-SNN): Processes event data in a network of spiking neurons.

3. Obstacle Avoidance Component: Decodes SNN outputs into obstacle avoidance velocities.

4. Adaptive Motion Planner: Generates end-effector positions of a pre-planned trajectory while adjusting the plan according to the obstacle avoidance component’s output; a DMP. Positions set by the DMP are followed using a PID controller.

These components form an SNN-based obstacle avoidance module which is designed to easily integrate within an existing robot software stack.

Figure 4 depicts the processing stages and the data flow (top arrows) within the pipeline, which we describe in the following sub-sections.

Figure 4.

The main components and processing stages of our neuromorphic pipeline. The plot on the right shows a pre-planned trajectory (grey) and a resultant trajectory (green) adapted to avoid an obstacle (blue).

4.2. Event camera/emulator

The sensory input is event data that is derived from RGB camera data through an emulator. Following the fundamental EC operating principles, events can be generated from the thresholded intensity difference at every pixel between consecutive timesteps. Section 3.1 contains a review of existing emulators and their deficits, which motivate developing our event_camera_emulation⁶ component.

An event e_k = (x_k, t_k, p_k) is emitted at pixel position x_k at time t_k with polarity p_k ∈ { + 1, − 1} if the difference in intensities, $Δ L (x_{k}, t_{k})$ ⁷, exceeds a threshold, θ.

We form an event image representation by placing p_k of every event at the respective pixel location. The resulting event image, I_e, is a single-channel analog of the source RGB images that contains values, i_k ∈ { + 1, 0, − 1}. An event image derived from two consecutive RGB images is shown in Figure 5. The distinction of event polarity is often of little importance and OFF (−1) events are either ignored or treated the same as ON (+1) events, as we do here (and similar to Dubeau et al., 2020 and Maro et al., 2020).

Figure 5.

Emulated events due to object motion on the right. Here, any event (ON or OFF) is indicated by a blue pixel on the event image. This was captured as the camera moved forward, inducing some events on the edges of distant objects. (a) RGB Image 1. (b) RGB Image 2. (c) Event “Image”.
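The event-generation rule of this section can be sketched in a few lines of Python. This is only an illustrative sketch and not the emulator used in this work: the function name, the linear grayscale conversion, and the default threshold are assumptions (real ECs respond to changes in log intensity).

```python
import numpy as np

def emulate_events(prev_rgb, curr_rgb, theta=0.1):
    """Return a single-channel event image with values in {+1, 0, -1}.

    Assumption: intensity is the mean of the RGB channels, scaled to [0, 1].
    """
    prev_l = prev_rgb.astype(np.float64).mean(axis=2) / 255.0
    curr_l = curr_rgb.astype(np.float64).mean(axis=2) / 255.0
    delta = curr_l - prev_l                 # per-pixel intensity difference
    events = np.zeros(delta.shape, dtype=np.int8)
    events[delta > theta] = 1               # ON event: brightness increased
    events[delta < -theta] = -1             # OFF event: brightness decreased
    return events
```

Calling `emulate_events` on two consecutive H × W × 3 frames yields an H × W event image of the kind shown in Figure 5(c); pixels whose intensity change stays within ±θ remain zero.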

4.2.1. Limitations of emulation

Note that deriving events from differences in RGB frames places an upper bound on the event generation rate: the camera frame rate. This effectively negates the advantages of event asynchrony and the theoretically higher transmission rates in comparison to conventional camera pixels. Nevertheless, it provides a reasonable approximation for demonstrating and presenting elementary arguments for our approach. We verify this by comparing the output and consequent task performance of our emulator to those of a real EC in Section 7.4.

4.2.2. Filtering event noise: binary erosion

We have found that applying a binary erosion filter produces cleaner event images in cases where certain textures or surfaces (e.g. rough carpets) induce too many insignificant background events that may lead to over-reactive responses. The filter effectively removes events at which the local region does not contain sufficiently many other events.
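As a rough illustration of this filter, the sketch below erodes the binary event mask with a full 3 × 3 structuring element, so that an event survives only if all of its neighbours are events as well. The structuring element and the function name are assumptions; the element actually used in our component may differ.

```python
import numpy as np

def binary_erosion_3x3(event_image):
    """Keep an event only if every pixel in its 3x3 neighbourhood is also an event."""
    mask = event_image != 0                        # ON and OFF events treated alike
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    h, w = mask.shape
    keep = np.ones_like(mask)
    for dr in range(3):                            # AND together the nine shifted copies
        for dc in range(3):
            keep &= padded[dr:dr + h, dc:dc + w]
    return np.where(keep, event_image, 0)
```

Isolated events, such as those induced by fine textures, are removed, while the interior of a dense cluster of events is retained.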

4.3. Convolutional spiking neural network

The resulting event data is processed by a C-SNN.

4.3.1. Input data encoding

We utilize a Poisson process spike generation model to induce spikes in the input layer from incoming event images. A Poisson process presents a plausible stochastic approximation of biological neuron firing activity, in which the generation of each spike is assumed to depend on some firing rate, r, and be independent of all other spikes (Heeger, 2000).

Given a w × h event image I_e, we assign rate values according to:

I_{e}^{'}(x) = \begin{cases} r_{ON}, & \text{if } I_{e}(x) = +1 \\ r_{OFF}, & \text{if } I_{e}(x) = -1 \\ 0, & \text{otherwise} \end{cases}

(9)

where r_ON and r_OFF represent firing rates in Hz. In our experiments, these parameters are set such that OFF events induce half the stimulation that ON events do, which is based on insights from similar works and observed performance. In particular, the same ratio is applied in Salvatore et al. (2020), while the authors of Maro et al. (2020) and Dubeau et al. (2020) do not consider event polarity. Similarly, we do not require a strict distinction between positive and negative brightness changes for the purpose of detecting a moving obstacle, and these values have been adequate in practice.

Subsequently, the spike train entering each input neuron across T timesteps is drawn from a Poisson process, resulting in sequences of spikes that follow an average firing rate but exhibit random spike timings. Figure 6 illustrates an example of a 3 × 3 event image patch and the spike trains generated at every pixel/input neuron. The result is a w × h × T binary matrix S, where each element $s_{t}^{i j}$ defines whether neuron ij receives an input spike at timestep t⁸.

Figure 6.

Poisson spike trains generated from events at 9 input neurons. Positive and negative events are shown in blue and red. Note the lower spiking frequency at negative events.
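The encoding of equation (9) and the subsequent spike generation can be sketched as follows. The sketch uses the common Bernoulli approximation of a Poisson process, in which a spike occurs in a timestep of length dt with probability r · dt; the rate values, the timestep, and the function name are assumptions chosen for illustration, with only the 2:1 ratio between ON and OFF rates taken from the text.

```python
import numpy as np

def poisson_encode(event_image, r_on=100.0, r_off=50.0, n_steps=50, dt=1e-3, rng=None):
    """Return a boolean spike tensor with one spike train of length n_steps per pixel."""
    rng = np.random.default_rng() if rng is None else rng
    rates = np.zeros(event_image.shape, dtype=np.float64)   # equation (9), rates in Hz
    rates[event_image == 1] = r_on
    rates[event_image == -1] = r_off
    # Bernoulli approximation: P(spike within dt) ~= r * dt, independently per timestep.
    p_spike = np.clip(rates * dt, 0.0, 1.0)
    return rng.random(event_image.shape + (n_steps,)) < p_spike[..., None]
```

Pixels without events never spike, and OFF pixels spike on average half as often as ON pixels, as in Figure 6.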

4.3.2. Convolutional network topology

The C-SNN resembles a CNN in architecture and weight-sharing principles; neurons are arranged in two-dimensional layers and information is propagated through convolution operations. Here, the input to each neuron is a pre-activation computed by convolving the spiking output of neurons in the previous layer that occupy the target neuron’s receptive field with a kernel matrix of shared, real-valued weight parameters, K. This is encapsulated in the following adaptation of the convolution equation provided in Goodfellow et al. (2016), where the pre-activation of neuron (i, j) in layer k + 1, at time-step t, is computed by convolving a kernel with spike trains arriving from the preceding layer, k:

\begin{align} \mathrm{act}_{t}^{i, j, k + 1} = (K * S)(i, j) \end{align}

(10)

\begin{align} = \sum_{m} \sum_{n} S(i - m, j - n) K(m, n) \end{align}

(11)

The pre-activation value adds to the neuron’s membrane potential. Figure 7 depicts a convolution operation.

Figure 7.

An illustration of convolving a 2 × 2 kernel with a 3 × 3 layer (k) of spiking neurons, whose output spike trains are depicted within each cell. Assuming a stride of 1 × 1, the result is the pre-activations of neurons in layer k + 1 (see equation (11)). The summation resulting in $\mathrm{act}_{t}^{0,0,k+1}$ is shown above the cell.
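For a single timestep, the summation in equation (11) can be written out explicitly. The sketch below assumes a stride of 1 and no padding, and shifts the indices so that every access stays inside the layer; the function name and the example values are hypothetical. Note that it follows the equation literally, that is, with a flipped kernel, whereas many deep learning libraries implement cross-correlation instead; with randomly sampled weights the distinction is immaterial.

```python
import numpy as np

def pre_activation(spikes_t, kernel):
    """Pre-activations of layer k+1 from one timestep of layer-k spikes."""
    kh, kw = kernel.shape
    out_h = spikes_t.shape[0] - kh + 1
    out_w = spikes_t.shape[1] - kw + 1
    act = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            for m in range(kh):
                for n in range(kw):
                    # S(i - m, j - n) K(m, n), offset by the kernel size minus one
                    act[i, j] += spikes_t[i + kh - 1 - m, j + kw - 1 - n] * kernel[m, n]
    return act

spikes_t = np.array([[1, 0, 1],
                     [0, 1, 0],
                     [1, 1, 0]], dtype=float)   # which neurons of layer k spike at time t
kernel = np.array([[0.2, 0.5],
                   [0.1, 0.4]])
act = pre_activation(spikes_t, kernel)          # 2 x 2 array of pre-activations
```

In this example, the top-left pre-activation receives contributions only from the two spiking neurons in its receptive field, giving 0.2 + 0.4 = 0.6.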

Spatio-temporal distributions of spikes within an SNN lead to novel dynamics from which interesting properties may arise. The convolutional operator and the analog dynamics of the neurons create a form of spatio-temporal filter, where signals that are particularly persistent in space and time are selectively propagated. Moreover, spiking neurons possess a form of memory in the decaying potential, which reflects recent levels of stimulation. The spike trains emitted at the output neurons are used to derive obstacle avoidance in the next component of the pipeline.

We set the synaptic weights to fixed, random values. Non-plastic SNN connections have often been employed in SNN applications, such as in Risi et al. (2020) (see Section 3.2) and Milde et al. (2017) (see Section 3.4), which demonstrated the achievement of desired behaviours solely due to the properties of spiking dynamics. Similarly, we presently involve no learning, and instead investigate how robust task performance is to different random, but fixed, “features” that manifest through the randomly sampled weights. Future extensions will incorporate weight adjustment strategies through, for example, STDP and supervised variants or RL. The weight values are initialized by sampling from a standard uniform distribution, scaled by a weight factor, w_c:

W \sim U (0, w_{c})

(12)
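To illustrate how fixed random weights and spiking dynamics combine into a spatio-temporal filter, the following sketch propagates a spike tensor through a single convolutional layer. It assumes a simple leaky integrate-and-fire neuron with a multiplicative decay, a fixed threshold, and a reset to zero; these dynamics, all parameter values, and the function name are assumptions for illustration and do not reproduce our network.

```python
import numpy as np

def run_csnn_layer(spikes, w_c=0.5, kernel_size=3, decay=0.9, v_thresh=1.0, rng=None):
    """Propagate an (H, W, T) boolean spike tensor through one convolutional LIF layer."""
    rng = np.random.default_rng() if rng is None else rng
    kernel = rng.uniform(0.0, w_c, size=(kernel_size, kernel_size))   # equation (12)
    flipped = kernel[::-1, ::-1]            # flipping turns correlation into convolution
    h, w, n_steps = spikes.shape
    out_h, out_w = h - kernel_size + 1, w - kernel_size + 1
    v = np.zeros((out_h, out_w))            # membrane potentials
    out = np.zeros((out_h, out_w, n_steps), dtype=bool)
    for t in range(n_steps):
        frame = spikes[:, :, t].astype(np.float64)
        act = np.zeros_like(v)
        for m in range(kernel_size):
            for n in range(kernel_size):
                act += flipped[m, n] * frame[m:m + out_h, n:n + out_w]
        v = decay * v + act                 # leak, then add the pre-activation
        fired = v >= v_thresh
        out[:, :, t] = fired
        v[fired] = 0.0                      # reset neurons that spiked
    return out
```

With these hypothetical values, a single isolated input spike can raise a potential by at most w_c, which is below the threshold, so it produces no output; input that persists in space and time accumulates and is propagated.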

4.4. Obstacle avoidance component

The obstacle avoidance component decodes the SNN’s output into meaningful avoidance signals. This step involves extracting indications of obstacle presence or motion from the spiking activity and deriving velocities/accelerations that can adapt the planned motion trajectory. To that end, we utilize a first-spike-time representation of spiking activity and a PF method.

4.4.1. SNN output representation

Spike trains can be represented in various temporal and rate coding schemes. We use first-spike-time (FST) temporal coding to interpret the output spiking activity (Tuckwell and Wan, 2005; Göltz et al., 2021; Liu et al., 2021). Within this scheme, the time until a neuron’s first spike after stimulus presentation fully determines the magnitude of stimulation: the earlier a neuron first spikes, the more stimulated it is. When applied to the output spike trains, the FST code provides a neural activation map, as illustrated in Figure 8. Significantly intense neural activation at a given neuron is expected to indicate the persistent presence and (relative) motion of a perceived object in regions of the input image for which that neuron is in the effective receptive field. This provides indications of avoidance directions.

Figure 8.

A depiction of the FST code applied to 25 output neurons. On the right, cell brightness corresponds to each neuron’s FST and thus activation magnitude. Note: the top-right neuron has a higher activation due to an early first spike, although the neuron just under exhibits a higher spike count.

Other common codes include the absolute or average number of spikes. As with both of these, the time to the first spike can indicate a neuron’s level of stimulation; however, FST necessitates flagging only the first spike, as opposed to waiting until all spikes in a given time window are accumulated. Consequently, the FST code could reduce time-to-solution or energy-to-solution and thus be more efficient (Göltz et al., 2021). Note that the FSTs are calculated with respect to the time point at which the last event image was presented.
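A minimal sketch of FST decoding is given below. The linear mapping from first-spike time to activation magnitude, and the choice of zero activation for silent neurons, are assumptions made for illustration; any monotonically decreasing mapping would express the same idea.

```python
import numpy as np

def first_spike_activation(out_spikes):
    """Map an (H, W, T) boolean spike tensor to an (H, W) activation map in [0, 1]."""
    n_steps = out_spikes.shape[2]
    has_spike = out_spikes.any(axis=2)
    # argmax on a boolean axis returns the index of the first True entry;
    # silent neurons are assigned the full window length T.
    fst = np.where(has_spike, out_spikes.argmax(axis=2), n_steps)
    # Assumed linear decoding: earliest possible spike -> 1, no spike -> 0.
    return 1.0 - fst / n_steps
```

As in Figure 8, a neuron that spikes once but early receives a higher activation than a neuron that spikes several times but later.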

4.4.2. Computing obstacle avoidance direction using potential fields

The neural activation map can be interpreted as a downscaled version of the input event image, filtered in the space and time dimensions to indicate approximate locations of persistent obstacles (while removing potential noise). Therefore, we regard high activation regions as obstacle points in this space. In order to derive an avoidance motion vector, we utilize a method that can aggregate the obstacle points’ spatial influences and compute a direction that maximizes movement away from these points.

Artificial potential fields (PFs) are fields of attractive and repulsive forces overlaid on a robot’s environment to drive goal reaching and avoidance behaviours, respectively. We compute a PF from the neural activation map, with obstacle points set to exert repulsive forces, using the formulation presented in Park et al. (2008). For an arbitrary point on the field, x, the potential is determined by the distance to an obstacle point, p(x), according to:

U(x) = \begin{cases} \frac{η}{2} \left(\frac{1}{p(x)} - \frac{1}{p_{0}}\right), & \text{if } p(x) \leq p_{0} \\ 0, & \text{if } p(x) > p_{0} \end{cases}

(13)

where p₀ denotes the obstacles’ radius of influence and η is a constant gain. If multiple obstacle points are perceived, the potential at point x is simply the aggregate of their contributions, that is, for n obstacle points: $U(x) = \sum_{i = 1}^{n} U_{i}(x)$. The negative potential gradient, −∇U(x), provides estimates of directions leading away from high potential regions. Figure 9 shows a PF derived from a neural activation map, yielding a gradient field (black arrows) which points away from the aggregated obstacle points (red). The mean negative potential gradient, $\tilde{ϕ} = \overline{- \nabla U(x)}$, then provides an average motion vector that incorporates all potential across the field and estimates an optimal direction for avoiding the perceived obstacles. In Figure 9(a), this is visualized as a blue arrow. (Note: it is not necessary to map the neural activation map back to the input image space; in practice, the resulting motion vector leads to effective avoidance motions).

Figure 9.

The PF computed from an SNN’s output neural activation map, visualized in 2D (9a) and 3D (9b), the latter showing the slope that determines the motion vector (blue). (a) Neural activation map to potential field. (b) 3D potential field.

The SNN output is thus decoded into motion vector $\tilde{ϕ}$ , which provides the information needed to most adequately adapt the current motion plan.
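The decoding steps of this section can be sketched as follows, assuming that obstacle points are the neurons whose activation exceeds a threshold and that distances are clamped to a small minimum to avoid a division by zero at the obstacle points themselves. The threshold, the clamping, the parameter values, and the function name are assumptions; the potential follows equation (13) as written.

```python
import numpy as np

def avoidance_vector(activation, act_thresh=0.5, eta=1.0, p0=3.0, d_min=0.5):
    """Return the mean negative potential gradient as (d_col, d_row) in map coordinates."""
    rows, cols = np.indices(activation.shape)
    potential = np.zeros(activation.shape)
    for o_r, o_c in np.argwhere(activation >= act_thresh):
        dist = np.hypot(rows - o_r, cols - o_c)
        dist = np.maximum(dist, d_min)              # assumed clamp near the obstacle point
        u = 0.5 * eta * (1.0 / dist - 1.0 / p0)     # equation (13)
        potential += np.where(dist <= p0, u, 0.0)   # contributions are summed over points
    grad_r, grad_c = np.gradient(potential)         # finite-difference gradient of U
    return np.array([-grad_c.mean(), -grad_r.mean()])
```

For an activation map with a single obstacle point at the left edge, the returned vector points to the right, that is, away from the obstacle.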

4.5. Motion planning and control

The aforementioned components operate alongside the planning and control component, which generates and executes trajectories. It also utilizes feedback from the obstacle avoidance component to adapt motion plans online by deviating to avoid obstacles while maintaining progress towards the goal; this is accomplished through a DMP. The robot follows the positions specified by the DMP planner through velocity commands generated by a PID controller.

4.5.1. Dynamic motion primitive (DMP)

Dynamic motion primitives (DMPs) can model the evolution of a point’s position over time in a set of differential equations that produce stable global attractor dynamics (see Section 2.3). A particularly useful property is the extensibility of the transformation system equations with task-related acceleration terms.

A secondary objective can be accomplished by adding appropriate acceleration values to equation (6) during the evolution of variable y. The stable attractor dynamics guarantee eventual convergence to the goal, thus enabling the plan to be adjusted through deliberate perturbations. We utilize this property to adapt DMP-planned trajectories through the obstacle avoidance component’s output: $\tilde{ϕ}$ .

Here, the DMP controls the end-effector’s position, y = [x, y, z]^T as it progresses towards goal g. The trajectory shape, that is, $w_{i}$ , is fixed, because (i) the capability of generalizing any trajectory shape to arbitrary initial and goal positions provides sufficient adaptivity and (ii) adaptation is externally achieved by incorporating obstacle avoidance feedback. Specifically, we augment equation (6) with an additive obstacle avoidance acceleration term, ϕ:

τ \ddot{y} = α_{y} (β_{y} (g - y) - \dot{y}) + f (s) + ϕ

(14)

The instantaneous value of ϕ is directly derived from $\tilde{ϕ}$. The latter expresses a motion vector in the image space, which is transformed to the end-effector’s operational space:

ϕ = T_{camera}^{e e} \tilde{ϕ}

(15)

Note that $\tilde{ϕ}$ is computed from a camera image and is thus strictly two-dimensional; as a result, ϕ inherits the same constraint. For a camera mounted at the end-effector, such that the image plane is parallel to the end-effector’s y-z plane and perpendicular to the forward-facing x-axis, the consequence is that avoidance vectors are constrained to the y-z plane. This reflects the fact that the sensor lacks depth perception and thus cannot be used to compute avoidance motions that are perpendicular to the image plane. Nevertheless, avoidance behaviours resulting from this approach are sufficient for the class of tasks that we target, where the end-effector moves forwards towards a goal and the robot is expected to avoid obstacles within the field of view (FOV) of a forward-facing camera.

The resulting obstacle-avoiding trajectory is finally executed by following positions integrated from equation (14).
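Equations (14) and (15) can be illustrated with a simple Euler integration. In this sketch the forcing term f(s) is set to zero, so that only the attractor dynamics and the avoidance term act; the gains, the timestep, the mapping of image axes onto the end-effector's y-z plane (whose signs depend on the camera mounting), and all function names are assumptions.

```python
import numpy as np

def image_to_ee(phi_tilde):
    """Equation (15) for a hypothetical mounting: image axes map onto the y-z plane."""
    t_camera_ee = np.array([[0.0, 0.0],
                            [1.0, 0.0],
                            [0.0, 1.0]])
    return t_camera_ee @ np.asarray(phi_tilde, dtype=np.float64)

def rollout(y0, goal, phi_fn, tau=1.0, alpha_y=25.0, beta_y=6.25, dt=0.005, n_steps=1000):
    """Euler-integrate equation (14) with f(s) = 0; returns an (n_steps + 1, 3) path."""
    y = np.asarray(y0, dtype=np.float64).copy()
    goal = np.asarray(goal, dtype=np.float64)
    yd = np.zeros(3)
    path = [y.copy()]
    for step in range(n_steps):
        phi = image_to_ee(phi_fn(step * dt))        # avoidance acceleration at time t
        ydd = (alpha_y * (beta_y * (goal - y) - yd) + phi) / tau
        yd = yd + ydd * dt
        y = y + yd * dt
        path.append(y.copy())
    return np.array(path)

# A transient avoidance signal pushes the path sideways before it returns to the goal.
path = rollout([0.0, 0.0, 0.0], [0.5, 0.0, 0.0],
               lambda t: [20.0, 0.0] if 0.2 <= t < 0.6 else [0.0, 0.0])
```

The forward (x) component is unaffected by the perturbation, reflecting the constraint of avoidance vectors to the y-z plane, and the attractor dynamics bring the end-effector back to the goal once the avoidance signal vanishes.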

4.5.2. PID controller

We use a PID controller to compute velocity commands that move the robot’s end-effector between DMP positions during task execution. Given current and target positions, y(t) and y_target, the error is:

e (t) = y_{target} - y (t)

(16)

The PID velocity control is then described by:

v (t) = K_{p} e (t) + K_{i} \int e (t) d t + K_{d} \frac{d e}{d t}

(17)

K_p, K_d, and K_i represent constant gains. Figure 10 depicts a block diagram of the control system.

Figure 10.

A block diagram depicting the PID controller.

By tuning the respective gains, we can optimize for properties such as the smoothness and stability of motions. In addition, the tight integration between motion and sensor-based obstacle avoidance corrections requires motions to be responsive and reliable; otherwise, the feedback provided by the obstacle avoidance component may not be adequately utilized.
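A discrete-time version of equations (16) and (17) might look as follows. The rectangular integration, the backward-difference derivative, and the class and method names are assumptions of this sketch rather than a description of our controller implementation.

```python
import numpy as np

class PID:
    """Discrete PID controller producing velocity commands from position errors."""

    def __init__(self, kp, ki, kd, dim=3):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = np.zeros(dim)
        self.prev_error = None

    def step(self, y_target, y, dt):
        error = np.asarray(y_target, dtype=np.float64) - np.asarray(y, dtype=np.float64)  # eq. (16)
        self.integral += error * dt                     # rectangular integration
        if self.prev_error is None:
            derivative = np.zeros_like(error)           # no history on the first call
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative           # eq. (17)
```

At each control cycle, `step` would be called with the latest DMP position as `y_target` and the measured end-effector position as `y`, and the returned vector sent as a velocity command.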

The neuromorphic pipeline described in this section was implemented using the ROS framework. Refer to Appendix D for implementation details.

5. Evaluation methodology

Our implementation was evaluated in experiments of simple reaching tasks on a Kinova Gen3 arm (shown in Figure 11) that involve static and dynamic obstacles. These were designed to compare outcomes of task executions with and without the presented SNN-based obstacle avoidance module and thus verify the merits of our neuromorphic approach. We initially tune and evaluate performance in simulations before running experiments on the robot for a final validation of the module and how well it transfers to the real world. In the following, we describe these experiments, including the formalized task scenarios, metrics, and performance criteria.

Figure 11.

The Kinova Gen3 arm configuration used in the experiments. (a) In simulation. (b) The robot platform.

5.1. Simulation experiments

We use an adapted version of the Gen3 Gazebo simulation provided in the ros_kortex⁹ software package (see Appendix D). By varying task variables such as the visual background and obstacle properties, we utilize sets of distinct task scenarios in a tuning → validation → testing procedure:

1. Tuning: manually tune pipeline parameters to establish candidate sets of parameter values.

2. Validation: evaluate tuned parameter sets to identify the best-performing set.

3. Testing: perform final experiments with the best-performing set.

5.1.1. Evaluation tasks

We formulated four goal-directed tasks for evaluating obstacle avoidance capabilities. In each task, a regular course of action leads to an imminent collision, unless a deliberate avoidance action is taken. Therefore, we address situations in which naïve motion planning is certain to fail and investigate how well our obstacle-aware manipulation trajectories solve the problem.

The set of tasks in our simulation experiments consists of:

• Task 1: The arm must reach a goal position that lies behind a static obstacle.

• Task 2: The arm must reach a goal position as a dynamic obstacle enters the FOV and crosses the end-effector’s path.

• Task 3: The arm must reach for an object on a table that is partially occluded by a static obstacle.

• Task 4: The arm must maintain its initial position as a dynamic obstacle moves directly towards it; the arm is free to move but must constantly minimize distance to the initial position.

Refer to Figures 23–26 in Appendix A for visual illustrations of the tasks.

Within these tasks, we vary the background and the obstacle type, color, and speed. These task variables and their values are listed in Table 1 and visualized in Appendix A. We define a scenario as a task instance that is characterized by a unique set of variable values. Apart from examining the robustness of our method, testing in various scenarios is useful in identifying limitations and can provide a validation of applicability in various environmental conditions before transferring to a real robot.

Table 1.

Task variables that define experimental scenarios. Example scenario (gray cells): {Task 4, Bookstore, Box, Brick Pattern, Medium}. Approximate obstacle speeds: Low = 0.09 m/s, Medium = 0.17 m/s, High = 0.36 m/s.

5.1.2. Tuning, validation and testing

The task variables form a distribution of 416 task scenarios which we utilize for parameter tuning in addition to the final evaluation. Inspired by a common methodology in machine learning (ML) research, we draw disjoint sets of scenarios that we use to initially optimize and ultimately evaluate through a tuning, validation, and testing strategy. The tuning set is used to tune sets of parameter values to achieve adequate performance, while the validation set is used to evaluate the degree to which a given set (or “model”) generalizes to unseen conditions and to select a candidate set. Our experiments involve testing with the selected values on the testing set and comparing to executions that do not utilize the avoidance module. This eliminates biases that could result from optimizing and evaluating on the same data and ensures that we do not “overfit” to specific environmental conditions.

The tunable parameters that we consider govern the behaviour of each pipeline component outlined in Section 4 and are listed in Table 10 of Appendix E.

From the scenario distribution, we allocated 3 in the tuning set, 8 in the validation set, and 20 in the testing set. The first two were manually selected to guarantee some variation in scenario properties (with the validation set containing more variations, some unobserved during tuning) and to ensure that at least one task and one background are never observed in either phase. Testing scenarios were sampled randomly from the remaining pool, with the only restriction being at least N_s = 5 scenarios per task, resulting in a set that includes a novel task and a background/environment. In this case, these were Task 2 and the Kitchen environment (which is characterized by dim ambient lighting).
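The random draw of testing scenarios can be sketched as a per-task sample without replacement. With 20 testing scenarios and four tasks, the restriction of at least N_s = 5 scenarios per task implies exactly five per task. The function name, the seed, and the placeholder pool below are hypothetical and stand in for the actual pool of remaining scenarios.

```python
import random

def sample_testing_set(pool, per_task, seed=0):
    """Draw `per_task` scenarios for every task, uniformly and without replacement."""
    rng = random.Random(seed)
    testing = []
    for task in sorted(pool):
        testing.extend(rng.sample(pool[task], per_task))
    return testing

# Hypothetical pool: ten placeholder scenario IDs per task (not the real scenarios).
pool = {task: [(task, i) for i in range(10)] for task in (1, 2, 3, 4)}
testing_set = sample_testing_set(pool, per_task=5)
```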

Table 7 in Appendix B lists the scenarios of each set. Note the inclusion of an additional scenario at the top (Task 1, Empty, Cracker Box), which was used to establish baseline parameter values in initial tests, termed a pre-tuning phase.

5.1.3. Evaluation metrics and criteria

The relevant literature presents scarce results of obstacle avoidance performance and limited consensus on the appropriate quantitative metrics thereof. We address this deficit by establishing a set of quantitative metrics and qualitative criteria for our evaluation.

The quantitative metrics we use to analyze the performances of parameter sets and to ultimately evaluate our approach are:

• Task execution time, T

• Trajectory length, l_Y

• Number of collisions, N_collisions

• Final distance to goal, d_G

• Success (reaching the goal and never colliding), S

• End-effector velocities and accelerations, $\dot{y}$ and $\ddot{y}$

Refer to Appendix C for detailed descriptions of each. These metrics facilitate a holistic evaluation of the SNN-based obstacle avoidance module and comparisons to baseline task executions on the basis of task success and along other dimensions that describe trajectory properties.
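A minimal sketch of how such metrics could be computed from a logged end-effector trajectory is given below. It assumes that trajectory length is the sum of Euclidean distances between consecutive positions, that velocities are estimated by finite differences, and that success requires both zero collisions and a final distance within a goal tolerance; the function name and the tolerance value are hypothetical, and the authoritative definitions are those in Appendix C.

```python
import numpy as np

def trajectory_metrics(positions, timestamps, goal, n_collisions, goal_tolerance):
    positions = np.asarray(positions, dtype=float)    # shape (N, 3)
    timestamps = np.asarray(timestamps, dtype=float)  # shape (N,)
    steps = np.diff(positions, axis=0)                # displacement between samples
    length = float(np.linalg.norm(steps, axis=1).sum())                  # l_Y
    exec_time = float(timestamps[-1] - timestamps[0])                    # T
    dist_to_goal = float(np.linalg.norm(np.asarray(goal) - positions[-1]))  # d_G
    velocities = steps / np.diff(timestamps)[:, None]  # finite-difference estimate
    success = dist_to_goal <= goal_tolerance and n_collisions == 0       # S
    return {"T": exec_time, "l_Y": length, "d_G": dist_to_goal,
            "S": success, "velocities": velocities}

# Toy trajectory; the 0.03 m tolerance is a placeholder value.
metrics = trajectory_metrics(
    positions=[[0.0, 0.0, 0.0], [0.3, 0.0, 0.0], [0.3, 0.4, 0.0]],
    timestamps=[0.0, 1.0, 2.0],
    goal=[0.3, 0.4, 0.02],
    n_collisions=0,
    goal_tolerance=0.03,
)
```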

In addition, we define qualitative criteria to describe trajectory properties to examine the extent to which they deviate from the baseline. Ideally, adapted trajectories would be successful while remaining qualitatively similar. In order to conduct a formal evaluation, we ground each criterion in terms of quantitative measures, where possible, to avoid subjective assessments. These criteria are presented in Table 2 and described in more detail in Appendix C.

Table 2.

Qualitative performance criteria.

Criterion	Evaluated through
Reliability	Ratio of successful trials to total no. of executions in imminent collision cases
Predictability	Frequency/magnitudes of heading changes
Safety	Magnitudes of velocities and accelerations

5.1.4. Experiment procedure

Our experiments involve running N_trials executions in all testing scenarios in each of two cases:

1. Baseline executions that do not involve the obstacle avoidance module.

2. Executions that incorporate the obstacle avoidance module, using the selected parameter set.

We then analyse and compare performance using the described metrics and criteria.

5.2. Real experiments

The real experiments are conducted on a Kinova Gen3 arm attached to a mobile platform (see Figure 11(b)), to determine how well the tuned parameters transfer to the real world and for a more concrete validation of our approach. These experiments contain fewer variations of scenarios from a subset of the tasks defined in Section 5.1.1 but share the same evaluation metrics/criteria and experimental procedure.

5.2.1. Evaluation tasks

We consider two of the tasks defined in Section 5.1.1:

• Task 1: The arm must reach a goal position that lies behind a static obstacle.

• Task 2: The arm must reach a goal position as a dynamic obstacle enters the FOV and crosses the end-effector’s path.

The task setups are shown in Figures 12 and 13.

Figure 12.

Real experiment Task 1 setup. A baseline end-effector trajectory is illustrated in green. (a) Start; (b) end.

Figure 13.

Real experiment Task 2 setup. The obstacle’s trajectory and a baseline trajectory are illustrated in red and green. (a) Start. (b) End.

These tasks are equivalent to the simulated versions with a few exceptions. Firstly, the object in Task 1 was suspended but is now placed on a surface. For Task 2, the simulation offers accurate control of the obstacle’s trajectory and thus a high degree of consistency across trials. On the other hand, the object’s trajectory is controlled manually in the real experiments, which may introduce inter-trial inconsistencies, although preventative steps such as guiding markers on surfaces and periodic measurements to verify the object’s initial position were taken. While the simulated experiments were designed to gather statistical evidence from highly repeatable trials, which rarely occur in the real world, the primary objective here is to transition into real-life conditions and investigate how well the implementation adapts. Furthermore, for reasonably small variations in conditions, results from multiple trials can eliminate any variance due to these imprecisions.

The real experiment task scenarios are varied exclusively along the “Background” and “Obstacle Type” variables. “Lab Background 1” is situated in an area near a window providing a natural but dim light (see Figure 12) while “Lab Background 2” has a much brighter artificial lighting and contains a large table and other objects (see Figure 13). The obstacles are a wooden block, a metal bar, and a person’s hand. The latter is a particularly relevant case for human-robot collaborative scenarios¹⁰. Table 8 in Appendix B lists the four scenarios we conduct our real robot experiments in.

5.2.2. Evaluation metrics and criteria

The performance is evaluated using the same metrics and criteria of the simulation experiments. However, the N_collisions metric was excluded since it was only possible to evaluate in simulations by disabling collision dynamics and counting the number of instances in which the end-effector intersects an obstacle model. Instead, collisions are only reflected in the execution success, as before, which is true only if the end-effector never collides and also reaches its goal.

6. Results and discussion

6.1. Simulation experiments

Parameter tuning, validation, and testing in the simulation experiments were preceded by a pre-tuning phase, where an initial set of values was established. Next, we tuned several variants of this set on the tuning scenarios, validated their performance on the validation scenarios, and finally selected the best-performing set to run on the testing scenarios.

Figure 14 illustrates the evolution of executed trajectories during this process in a Task 1 scenario.

Figure 14.

Trajectories executed during pre-tuning and tuning trials. (a) Without SNN feedback. (b) With pre-tuning parameter set. (c) With post-tuning parameter set (12).

6.1.1. Initial parameterization (pre-tuning phase)

This phase consisted of iterative testing to search for a region in the parameter space that leads to acceptable performance. The main targets of this optimization were motion control parameters, such as the PID gains, distance tolerances, and motion loop frequency; PF parameters; SNN parameters, including the architecture, weight initialization, and SNN dynamics variables; and event emulation variables, such as RGB versus grayscale inputs and the event emission threshold (θ). These tests were conducted in a pre-tuning scenario: a variant of Task 1 containing a “Cracker Box” object that is excluded from the task distribution defined in Table 1.

For the SNN architecture, we chose a two-layer network of the LIF neurons presented in Diehl and Cook (2015). It consists of a layer that propagates input spike trains and two layers of spiking neurons, whose specifications are shown in Table 3. We set v_thresh, v_reset, v_rest, T_refrac, and τ_v to the default values of the original paper, which are intended to match biologically plausible ranges. However, we increased v_reset and v_rest (from −65.0 to −62.0) to encourage more frequent spiking. The simulation time (T_sim), which controls the time period of a single pass, was set to 20 ms, while the weight initialization factor, w_c, was set to 7.0.
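To make the role of these variables concrete, the following sketch simulates a single LIF neuron with Euler integration over a 20 ms window. Only v_rest = v_reset = −62.0 and the 20 ms window are taken from the text; the threshold (−52.0), time constant (100 ms), refractory period (5 ms), 1 ms step, and input drive are assumptions chosen for illustration, and the update rule is a generic LIF formulation rather than the exact model used in the pipeline.

```python
def simulate_lif(input_drive, v_rest=-62.0, v_reset=-62.0, v_thresh=-52.0,
                 tau_v=100.0, t_refrac=5.0, dt=1.0):
    """Return a list of 0/1 spikes, one per time step of length dt (ms)."""
    v = v_rest
    refrac_steps_left = 0
    refrac_steps = int(round(t_refrac / dt))
    spikes = []
    for drive in input_drive:
        if refrac_steps_left > 0:
            # The neuron ignores input while refractory.
            refrac_steps_left -= 1
            spikes.append(0)
            continue
        # Leak towards the resting potential, then add the weighted input.
        v += (v_rest - v) * (dt / tau_v) + drive
        if v >= v_thresh:
            spikes.append(1)
            v = v_reset
            refrac_steps_left = refrac_steps
        else:
            spikes.append(0)
    return spikes

t_sim_steps = 20  # 20 ms at dt = 1 ms
quiet = simulate_lif([0.0] * t_sim_steps)  # no input events
busy = simulate_lif([6.0] * t_sim_steps)   # constant placeholder drive
```

Without input the membrane potential stays at rest and no spikes are emitted, which is the property that later makes the computational load depend on the number of input events.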

Table 3.

The SNN architecture that was selected during the pre-tuning phase.

Layer	Type	Kernel size	Stride size	Input size	Output size
Input	-	-	-	-	120 × 160
Layer 1	LIF (conv)	8 × 8	4 × 4	120 × 160	29 × 39
Layer 2 (Output)	LIF (conv)	4 × 4	2 × 2	29 × 39	13 × 18
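The layer sizes in Table 3 can be checked with the standard output-size formula for a convolution, floor((n − k) / s) + 1 per dimension. The sketch assumes no padding, which is not stated explicitly; under that assumption, the 13 × 18 output of Layer 2 corresponds to a 4 × 4 kernel applied with a stride of 2 to the 29 × 39 map.

```python
def conv_output_size(input_size, kernel, stride):
    """Output size of an unpadded convolution along each spatial dimension."""
    return tuple((n - k) // s + 1 for n, k, s in zip(input_size, kernel, stride))

layer1 = conv_output_size((120, 160), kernel=(8, 8), stride=(4, 4))
layer2 = conv_output_size(layer1, kernel=(4, 4), stride=(2, 2))
```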

The pre-tuning parameters were evaluated by running N_trials = 30 trials in both testing cases. As expected, the trajectories executed in the baseline case all fail by colliding with the obstacle (see Figure 14(a)). On the other hand, utilizing the module leads to trajectories that are adapted to avoid the obstacle while moving towards the goal (Figure 14(b)), which were 80% successful in these trials.

The obstacle avoiding trajectories tended to have higher execution times, T, and trajectory lengths, l_Y, approximately by factors of 2 and 1.5, respectively. In addition, these values had higher variances, reflecting variations in the executed trajectories across trials. The average number of collisions per trial, N_collisions, was reduced from approximately 7.8 to 0.6. A significant fraction of failures occurred due to executions ending before the end-effector had reached the goal, that is, $d_{G} > δ_{g}$ ,¹¹ because the applied velocities occasionally caused the arm to move into singular positions that complicate returning back to the intended path.

6.1.2. Tuning results

Using the tuning scenarios, we derived and iteratively optimized 12 parameter sets that were expected to improve results. Furthermore, we extended our implementation and parameters to address some failures that we had observed in pre-tuning and initial tuning trials.

In order to address the problem of reaching singular arm configurations, which could lead to failures and potentially unsafe executions, we implemented a safety strategy that discourages excessive motions away from the pre-planned trajectory by reducing velocities and accelerations that move the end-effector further beyond a definite safety boundary. This method is described in the box below. Note that for Task 4, the safety strategy penalizes motions that deviate a certain distance from the initial position.

Safety Strategy: If the distance of the current end-effector position from the nearest point on the pre-planned trajectory exceeds a safety threshold:

‖ y (t) - {\hat{y}}_{ref} ‖_{2} > δ_{safety,1}

(18)

and the current position is further away from the trajectory point than the last recorded position:

‖ y (t) - y_{ref} ‖_{2} > ‖ y (t - 1) - y_{ref} ‖_{2}

(19)

slow down by reducing the current commanded velocity and the ϕ values that define the next acceleration values:

v (t) = γ_{v, 1} v (t)

(20)

ϕ (t + 1) = γ_{a, 1} ϕ (t)

(21)
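The safety strategy of equations (18) to (21) can be sketched as follows. The function name and all numeric values are hypothetical, and the sketch assumes that a single reference point (the nearest point on the pre-planned trajectory) is used in both distance checks.

```python
import numpy as np

def apply_safety_strategy(y, y_prev, y_ref, v, phi, delta_safety, gamma_v, gamma_a):
    """Scale down velocity and avoidance terms when drifting past the safety boundary."""
    dist_now = np.linalg.norm(y - y_ref)
    dist_prev = np.linalg.norm(y_prev - y_ref)
    beyond_boundary = dist_now > delta_safety  # condition (18)
    moving_away = dist_now > dist_prev         # condition (19)
    if beyond_boundary and moving_away:
        return gamma_v * v, gamma_a * phi      # equations (20) and (21)
    return v, phi

# Placeholder values: 0.2 m boundary, both scaling factors 0.5.
y_ref = np.zeros(3)
v = np.array([0.0, 0.1, 0.0])
phi = np.array([0.0, 0.2, 0.0])
v_slow, phi_slow = apply_safety_strategy(
    np.array([0.0, 0.30, 0.0]), np.array([0.0, 0.25, 0.0]), y_ref, v, phi, 0.2, 0.5, 0.5)
v_same, phi_same = apply_safety_strategy(
    np.array([0.0, 0.25, 0.0]), np.array([0.0, 0.30, 0.0]), y_ref, v, phi, 0.2, 0.5, 0.5)
```

The second call leaves the commands unchanged because the end-effector, although beyond the boundary, is already moving back towards the reference point.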

We also added optional workspace boundaries: positional limits beyond which a planned DMP position is rectified by clipping its value:

y_{i} = \min (δ_{pos,i}^{+}, \max (δ_{pos,i}^{-}, y_{i} (t))), \quad \forall i \in \{x, y, z\}

(22)

Here, $δ_{pos,i}^{-}$ and $δ_{pos,i}^{+}$ represent lower and upper positional limits along dimension i (set to −∞ and ∞ by default). This addresses safety concerns due to the end-effector colliding with the edge or moving under the table in Task 3, but could be set for different task conditions. Otherwise, various tunable parameters (see Table 10 of Appendix E) were optimized to form the 12 candidate parameter sets.
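The clipping in equation (22) amounts to an element-wise minimum and maximum. In the sketch below, the lower limit on z is a placeholder standing in for a table height; the remaining limits keep their infinite defaults.

```python
import numpy as np

def clip_to_workspace(y, lower, upper):
    """Element-wise clipping of a planned position to positional limits (eq. 22)."""
    return np.minimum(upper, np.maximum(lower, y))

lower = np.array([-np.inf, -np.inf, 0.05])  # hypothetical limit: stay above z = 0.05 m
upper = np.array([np.inf, np.inf, np.inf])  # default: unbounded
y_planned = np.array([0.40, 0.10, -0.02])
y_clipped = clip_to_workspace(y_planned, lower, upper)
```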

We ran batches of trials without SNN feedback (baseline) and with SNN feedback, the latter with each of the parameter sets, on each of the three tuning scenarios (1, 2, and 3). The baseline batches consisted of N_trials = 30 trials, while the rest consisted of N_trials = 40 each for a total of 1530 trials.

The trajectories executed in scenario 1 are shown in Appendix H in Figure 34.

Scenario 1 trials confirmed that the safety strategy reduced instances of terminal arm configurations. Some parameter sets (particularly 6-9) exhibited less reactive motions due to stronger event erosion filter parameterizations, which led to more collisions. Similar failures were observed in scenario 2 trials for the first nine parameter sets, which were also attributed to a weak response due to parameters that control sensitivity to inputs, including the event emission and filter thresholds, neuronal spiking thresholds, and ϕ acceleration parameters. Sets 10-12 were tuned to address this limitation and were successful in improving results. Scenario 3 trials revealed that earlier parameter sets suffered from the opposite: a higher reactivity led to unstable motions. These observations point to a trade-off: increasing sensitivity to inputs at the perceptual, motion, or intermediate levels in the pipeline may lead to excessive, dangerous or oscillatory motions, while decreasing sensitivity may risk not reacting fast enough to avoid collisions.

6.1.3. Validation results

Next, we selected a subset of the parameter sets on the basis of best average performance and trajectory properties: 5, 8, 10, and 12.

Due to initial observations and unsatisfactory performance in a challenging, high-speed validation scenario (8), we instantiated two additional sets: 13 and 14. These were aimed at exploring quicker responses and faster real-time performance, mainly by testing a single-layer SNN and tuning motion loop frequency parameters.

We ran N_trials = 30 baseline trials and N_trials = 40 trials with the avoidance module parameterized by each of the six parameter sets. All trials were repeated in each of the eight validation set scenarios (4-11): a total of 2160 trials.

Figure 36 in Appendix H contains the metrics plots summarizing the quantitative performance in all scenarios.

Results from Task 1 scenarios (4-6) indicated that the module performed worse in the “Office” environment (scenario 4) than in “Store” or “Empty”. A likely cause is the relatively lower illumination, which decreases contrasts between the background and the obstacle, thus leading to fewer events being generated and in turn more latency in or lower magnitudes of avoidance velocities. Another potential reason is the relative background clutter which, despite event filtering, may produce more background events that saturate the overall response of the SNN.

Results from Task 4 scenarios (7-9) varied significantly. Most sets performed well with medium-speed obstacles (scenario 7) but considerably worse in response to the high-speed obstacle of scenario 8 (traveling about twice as fast). While the module would react to the obstacle, which was visible for a shorter amount of time, the eventual response would not be sufficiently effective. This was a product of fewer events, a resulting lower SNN activation, and the limited speed of the arm. Sets 13 and 14 yielded more positive results through stronger responses, but these also led to less predictable and potentially unsafe trajectories. In addition, these improvements did not extend to the low-speed case.

All selected parameter sets performed at least better than the baseline and reasonably well in most scenarios. However, the last findings indicate a limit on how fast we can command the arm to move before approaching dangerous speeds. While sets 13 and 14 improved performance, they exhibited significantly higher accelerations that enable the necessary sudden reactions. This is characteristic of unsafe trajectories and presents an undesirable compromise. Instead, it is reasonable to acknowledge an inability to reliably react to and avoid obstacles whose speeds exceed a certain threshold.

6.1.4. Testing results

Parameter set 12 was selected for the experiments based on its superior average performance. Following the same procedure, we ran batches of N_trials = 30 trials without SNN feedback (baseline) and N_trials = 40 trials with SNN feedback (set 12), on all testing scenarios (12-31) for a total of 1400 trials. Figure 37 in Appendix H shows the quantitative results from all scenarios. Table 4 contains the mean metric values in each scenario for the case with SNN feedback, along with the task averages (across all scenarios of the same task).

Table 4.

Quantitative results of the testing phase runs, averaged over the N_trials trials of each scenario.

	Scenario No.	Results (averages over N_trials trials)
	Scenario No.	Success, S (%)	Collisions, N_collisions	Dist. to Goal, d_G (m)	Traj. Length, l_Y (m)	Execution Time, T (s)
Task 1	12	95.0	0.625	0.026	0.965	12.200
	13	95.0	0.175	0.027	1.008	12.875
	14	95.0	0.125	0.027	0.995	12.997
	15	90.0	0.225	0.027	0.936	12.297
	16	87.5	0.450	0.027	0.926	12.260
	Task Average:	92.5	0.320	0.027	0.966	12.526
Task 2	17	42.5	1.175	0.027	0.665	9.524
	18	32.5	1.650	0.027	0.655	9.590
	19	77.5	0.425	0.076	1.061	14.546
	20	85.0	0.250	0.028	1.157	14.734
	21	67.5	1.500	0.027	0.713	10.063
	Task Average:	61.0	1.000	0.037	0.850	11.691
Task 3	22	70.0	3.400	0.028	1.034	13.924
	23	57.5	3.825	0.028	1.010	13.670
	24	82.5	1.450	0.028	1.317	16.500
	25	85.0	1.725	0.028	1.247	15.690
	26	77.5	1.775	0.029	1.154	14.674
	Task Average:	74.5	2.435	0.028	1.152	14.892
Task 4	27	95.0	0.250	0.016	0.901	17.547
	28	0.0	2.375	0.027	0.421	7.780
	29	47.5	1.875	0.028	0.359	6.865
	30	95.0	0.200	0.008	0.420	10.320
	31	92.5	0.425	0.007	0.433	10.179
	Task Average:	66.0	1.025	0.017	0.507	10.538

The avoidance module succeeded in 92.5% of all Task 1 trials. The success rate, distance-to-goal, trajectory length, and execution time exhibited low variances, indicating a high level of consistency over the five environments, including the novel, low-light setting of scenario 14.

Performance was less consistent in the novel Task 2, where success rate had an average and standard deviation of 61% and 22.6%, respectively. Most failures were observed in the “Office” scenarios 17 and 18, where the relatively low mean trajectory length and execution time indicated less movement, possibly due to lower neural activations. The avoidance behaviour succeeded more often in the remaining scenarios and was not affected by obstacle speed.

Success in Task 3 scenarios averaged at 74.5% and had a standard deviation of 11%. While the end-effector always reached the goal, N_collisions varied significantly across trials of a given scenario and we observed no correlations between the different task variable values and the frequency of failures. Occasionally, initial trajectory adaptations moved the end-effector to a region closer to the obstacle from which subsequent corrections were unlikely to effectively steer it away. These occurrences indicate some uncertainty in the resultant trajectory which may be attributed to the cascade of non-linear operations performed within the pipeline.

The reported mean success rate in Task 4 (66%) is skewed due to failures in the high-speed obstacle scenario 28 (the median is 92.5%). This reinforces the validation phase findings concerning very fast obstacles.

We computed the average time that elapses in a single iteration of each pipeline stage and present the results in Table 5. The computation times of the first two stages were also measured when the arm was stationary and no motion was induced in the image. We observed that SNN computation time decreased in the latter case, that is, when no input spikes are induced. This indicates a positive correlation between the amount of SNN computations and the number of input events: a measure of new perceptual information. Considering only visual changes to be relevant for the avoidance behaviour, the SNN thus expends computations only to process salient information in the context of the task. Events efficiently encode this information and the SNN avoids unnecessary computations (i.e. computing and propagating spikes) in the absence of input spikes. This input-dependent computational property distinguishes SNNs from conventional DNNs and has potential for better power and time efficiency in some applications.

Table 5.

Mean computation times (in seconds).

Stage	Computation time (s)
Stage	During executions	No stimuli
Event Emulation	0.025 ± 0.001	0.025 ± 0.001
SNN Simulation	0.123 ± 0.014	0.086 ± 0.007
Obs. Avoidance Computations	0.002 ± 0.001	-
Total	0.150 ± 0.016	0.111 ± 0.009

Finally, we evaluated executions based on our qualitative criteria (refer to Appendix C for descriptions of each).

We correlate reliability with the consistency of positive results (i.e. success rates) in consistent task conditions. Each row in Table 4 represents the success rate in a given set of controlled conditions, the average of which is 74%, including the scenario 28 outlier. The median, which is less influenced by the outlier, is 84%, and success rates were higher in half of the scenarios. Overall, this indicated the moderate reliability of the implementation and chosen parameter values.
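The summary statistics quoted in this section can be reproduced from the per-scenario success rates in Table 4. The sketch below assumes the sample standard deviation (n − 1 denominator), which matches the 22.6% reported for Task 2.

```python
import statistics

# Per-scenario success rates (%) from Table 4, scenarios 12-31.
success_rates = {
    "Task 1": [95.0, 95.0, 95.0, 90.0, 87.5],
    "Task 2": [42.5, 32.5, 77.5, 85.0, 67.5],
    "Task 3": [70.0, 57.5, 82.5, 85.0, 77.5],
    "Task 4": [95.0, 0.0, 47.5, 95.0, 92.5],
}

all_rates = [rate for rates in success_rates.values() for rate in rates]
overall_mean = statistics.mean(all_rates)      # reported as 74%
overall_median = statistics.median(all_rates)  # reported as 84%
task_means = {task: statistics.mean(rates) for task, rates in success_rates.items()}
task2_std = statistics.stdev(success_rates["Task 2"])
task4_median = statistics.median(success_rates["Task 4"])
```

The gap between the overall mean and median is driven largely by the single 0% entry of scenario 28, which is why the median is the more representative summary here.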

We observed a high level of predictability in motions by analyzing the magnitudes of directional changes. The estimated values of heading angular velocities, $\dot{ζ}$ , had similar distributions in both cases and fell in the [−2,2] deg/s range. (A plot of these distributions for three scenarios is provided in Appendix G in Figure 32.) This suggests that the evolution of the trajectory is easy to predict from observations.

For evaluating the safety of obstacle-avoiding trajectories, we measured the overall end-effector speeds, which averaged at $\sim 0.07$ m/s. According to the ISO standard for industrial robot safety requirements (ISO 10218-1:2011, 2011), this is well below a tool speed threshold under which a robot can be considered to operate in a safety-rated, “reduced speed control” mode (Beckert et al., 2017): 0.25 m/s.

6.2. Real experiments

The pipeline implementation and tuned parameter values (12) were directly transferred to a real Kinova Gen3. Preliminary tests showed acceptable performance except for a slight degradation in motion smoothness, which necessitated minor adjustments of 5 of the 26 parameters:

• K_p : 5.0 → 2.0; K_d: 10.0 → 5.0; K_i: 0.0 → 5.0

• δ_y : 0.01 → 0.02

• θ : 28 → 45

The controller gains were a primary cause, which verifies expected discrepancies between the simulated and real-world dynamics (mainly in the actuators). Transitions between trajectory positions were less abrupt after slightly increasing the position reaching tolerance, δ_y, contributing to smoother motions, which similarly highlighted a discrepancy in actuator dynamics. The emulator generated more events for similar motions, visual conditions, and distances to objects, indicating that real-world images contained more colour variations, contrasts and possibly noise. By increasing the emission threshold, θ, we effectively offset this naturally larger variation in RGB data, thus producing event densities and, by extension, SNN activations and avoidance motions that were more similar to those observed in the simulation.

We ran N_trials = 30 trials in each scenario, which are listed in Table 8 in Appendix B, with and without the module.

Figure 15 contains the quantitative results of these experiments from each scenario (R1-R4) in addition to a single batch of baseline trials¹². Table 6 contains the mean values of each metric and the averages for each task.

Figure 15.

Quantitative metric results without versus with SNN feedback: real experiment scenarios R1-R4.

Table 6.

Quantitative results of the real experiment runs, averaged over the N_trials trials of each scenario.

	Scenario ID	Results (averages over N_trials trials)
	Scenario ID	Success, S (%)	Dist. to Goal, d_G (m)	Traj. Length, l_Y (m)	Ex. Time, T (s)
Task 1	R1	90.0	0.026	1.151	7.885
Task 2	R2	93.3	0.024	0.915	6.482
	R3	100.0	0.026	0.908	6.374
	R4	66.7	0.027	0.915	6.492
	Average (all scenarios):	87.5	0.026	0.972	6.808

The avoidance module succeeded in 87.5% of all 120 trials. In the Task 1 scenario, the arm was 90% successful in avoiding the wooden block. In Task 2, the arm was most successful with the metal bar obstacle (R3), as it never failed the task, followed by the hand (R2) with a success rate of 93% and the wooden block (R4) with 67%. Since the goal was always reached, all failures were due to collisions.

The higher failure rate in R4 was due to the visual properties of the wooden block, which blended with the background (see Figure 16(a)). As the obstacle comes into view, it generates fewer events and less neural activation than the more contrasting objects, causing trajectory adaptations to occasionally be too weak or late to effectively avoid the obstacle. The perfect success in R3 could be explained by the significantly higher color contrast (see Figure 16(b)). This suggests that obstacles whose colors are similar to the background induce lower intensity differences, which lead to a less effective response.

Figure 16.

Images captured by the onboard camera during an execution of scenarios R4 and R3 (Task 2). (a) Scenario R4. (b) Scenario R3.

When comparing the baseline and adaptive cases, we observed similar distributions of accelerations and velocities, which are shown in Figure 17. However, magnitudes in y were marginally higher in the second case. This indicates a preference for sideways motions, that is, in the y direction (left-right axis, relative to the camera), which is expected due to the avoidance velocity vectors being computed from the camera’s image space, ruling out motions in x (front-back axis), as expressed in Section 4.5.1. Accelerations and velocities in x tended to be lower with the avoidance module, indicating that the trajectory adaptations naturally slow down forward motion when avoiding perceived obstacles, which is a desirable effect when aiming to safely clear an obstacle while continuing progress towards a goal.

Figure 17.

Distributions of instantaneous velocities and accelerations in each spatial dimension, measured during real robot experiments. The figure contains data from nominal executions (without SNN Feedback) and scenarios R1-R4. (a) Velocities. (b) Accelerations.

The trajectories executed in these experiments (plotted in Figure 35 in Appendix H) were qualitatively similar to the simulation counterparts and led to similar conclusions from the qualitative evaluation.

The average success rate was 87.5% (the median was 91.7%), indicating a moderately high level of reliability.

The distribution of $\dot{ζ}$ spanned low values, similar to the baseline, indicating that the end-effector did not exhibit sudden, large changes in direction. (The values for R2 are plotted in Figure 33 in Appendix G.) Therefore, the module generated predictable motions on the real robot as well.

The same safety comparison to the reference velocity threshold (0.25 m/s) indicated that trajectories were fairly safe, albeit to a lesser degree than in the simulation. While the overall end-effector velocity averaged at 0.13 m/s, the Task 1 and Task 2 maximums approached 0.5 m/s and 0.55 m/s. However, these values lay beyond the third quartile of the distribution (upper plot in Figure 17(a)). We can assume that this is not an effect of the avoidance maneuvers themselves, but rather the controller parameterization (since similar values are observed in the Without SNN Feedback case).

Overall, the SNN-based obstacle avoidance module demonstrated 84% and 92% median success rates in simulated and real experiments. Execution times, trajectory lengths, and velocity and acceleration magnitudes indicated that the adaptive trajectories were quantitatively similar to baseline executions, but with the added capability of consistent obstacle avoidance. The qualitative assessment suggested that these trajectories were adequately reliable, predictable, and safe, although they were directly optimized only for obstacle avoidance. Finally, we demonstrated comparable performance in the real domain.

7. Further analyses

We additionally explored properties of our neuromorphic components through further experimentation. We first compared different event emulation methods in terms of output events, spiking responses, and effects on performance. To validate the SNN’s utility, we excluded the component from the pipeline and tested the decoding of avoidance behaviour from raw events. Furthermore, we investigated the effects of random SNN weight values on performance by repeating scenario trials with different initializations. Finally, we substituted our event emulator with a real EC and repeated experiments on the robot.

7.1. Comparing event emulation methods

For this analysis, we compared five emulation strategies:

• M1: Multi-C RGB absolute differences

• M2: Multi-C RGB absolute differences, with blur

• M3: Multi-C RGB log differences

• M4: “Salvatore” method

• M5: “pydvs” method

Our default strategy, M1, involves computing the differences in absolute intensities at every pixel, ΔL_abs(x_k, t_k), and emitting an event wherever this exceeds θ in all three color channels (which we term a multi-channel condition).
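The multi-channel condition can be illustrated with a minimal sketch. It assumes RGB frames stored as nested lists and treats ΔL_abs as the magnitude of the per-channel intensity difference, ignoring event polarity. The function name `emulate_events_m1`, the data layout, and the example threshold are illustrative assumptions rather than the implementation used in the experiments.

```python
from typing import List, Sequence

Pixel = Sequence[int]        # (R, G, B) intensities
Frame = List[List[Pixel]]    # rows of pixels


def emulate_events_m1(prev: Frame, curr: Frame, theta: float) -> List[List[int]]:
    """Return a binary event image: 1 where the intensity change exceeds
    theta in all three color channels (multi-channel condition), else 0."""
    events = []
    for row_prev, row_curr in zip(prev, curr):
        events.append([
            int(all(abs(c - p) > theta for p, c in zip(px_prev, px_curr)))
            for px_prev, px_curr in zip(row_prev, row_curr)
        ])
    return events


# Two 1 x 3 frames: only the middle pixel changes strongly in every channel.
prev_frame = [[(10, 10, 10), (10, 10, 10), (10, 10, 10)]]
curr_frame = [[(12, 11, 10), (90, 80, 70), (90, 10, 10)]]
event_image = emulate_events_m1(prev_frame, curr_frame, theta=30)
```

In this example the third pixel changes strongly in the red channel only, so the multi-channel condition suppresses it.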

For the second method (M2), we blur the source images before computing ΔL_abs(x_k, t_k). This is similarly applied in Zahra et al. (2021) for removing high-frequency noise; here, we aim to investigate its effects on derived events. This is accomplished by convolving images with a low-pass filter represented by a normalized 5 × 5 box kernel.
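As a rough illustration of this low-pass step, the sketch below convolves a single-channel image with a normalized 5 × 5 box kernel. The border handling (clamping coordinates to the image) and the function name `box_blur` are assumptions made for this example; for RGB input the same operation would be applied per channel.

```python
from typing import List


def box_blur(image: List[List[float]], size: int = 5) -> List[List[float]]:
    """Convolve a single-channel image with a normalized size x size box kernel.
    Borders are handled by clamping coordinates to the image (an assumption)."""
    height, width = len(image), len(image[0])
    half = size // 2
    weight = 1.0 / (size * size)
    blurred = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for dy in range(-half, half + 1):
                yy = min(max(y + dy, 0), height - 1)
                for dx in range(-half, half + 1):
                    xx = min(max(x + dx, 0), width - 1)
                    total += image[yy][xx]
            blurred[y][x] = total * weight
    return blurred


# A single bright pixel (isolated noise) in a 7 x 7 dark image.
noisy = [[0.0] * 7 for _ in range(7)]
noisy[3][3] = 100.0
smoothed = box_blur(noisy)
```

The isolated intensity spike is spread over 25 pixels, so its peak falls to 1/25 of the original value. This is consistent with blurring suppressing isolated, high-frequency changes before the threshold θ is applied.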

M3 emits events according to differences in log intensities:

Δ L_{\log} (x_{k}, t_{k}) = \log (L (x_{k}, t_{k})) - \log (L (x_{k}, t_{k - 1}))

(23)

This mimics how most ECs measure irradiance changes, resulting in their characteristically high dynamic ranges (Rebecq et al., 2019; Gallego et al., 2022), and is often used in event emulation (Rebecq et al., 2018).
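A short worked example of Equation (23) for a single pixel and channel follows. The small offset `eps`, which avoids taking the logarithm of zero, and the intensity values are assumptions introduced for illustration.

```python
import math


def log_intensity_change(curr: float, prev: float, eps: float = 1.0) -> float:
    """Equation (23) for one pixel and channel.
    eps avoids log(0) for black pixels (an assumption of this sketch)."""
    return math.log(curr + eps) - math.log(prev + eps)


# The same absolute step of 10 intensity levels in a dark and a bright region.
dark_change = log_intensity_change(20.0, 10.0)
bright_change = log_intensity_change(210.0, 200.0)
```

Here the dark-region change is more than ten times larger than the bright-region change, although both correspond to the same absolute intensity step. For a fixed θ, a log-based measure therefore emits events more readily in darker regions.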

The fourth method (M4) is based on a strategy employed in Salvatore et al. (2020), where the log of a sum of weighted channel intensities is used to compute the difference:

\begin{align} Δ L (x_{k}, t_{k}) = {} & \log (0.299 L_{R} (x_{k}, t_{k}) + 0.587 L_{G} (x_{k}, t_{k}) + 0.114 L_{B} (x_{k}, t_{k})) \\ & - \log (0.299 L_{R} (x_{k}, t_{k - 1}) + 0.587 L_{G} (x_{k}, t_{k - 1}) + 0.114 L_{B} (x_{k}, t_{k - 1})) \end{align}

(24)
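Equation (24) can be sketched for a single pixel as follows. The channel weights are those given in the equation; the offset `eps` (to avoid the logarithm of zero) and the function name are assumptions of this example.

```python
import math
from typing import Sequence

LUMA_WEIGHTS = (0.299, 0.587, 0.114)  # R, G, B weights from Equation (24)


def weighted_log_change(curr_rgb: Sequence[float], prev_rgb: Sequence[float],
                        eps: float = 1.0) -> float:
    """Log difference of the weighted channel sums for one pixel.
    eps avoids log(0) for black pixels (an assumption of this sketch)."""
    luma_curr = sum(w * c for w, c in zip(LUMA_WEIGHTS, curr_rgb))
    luma_prev = sum(w * p for w, p in zip(LUMA_WEIGHTS, prev_rgb))
    return math.log(luma_curr + eps) - math.log(luma_prev + eps)


# Equal intensity steps in different channels produce different responses.
green_only = weighted_log_change((0, 100, 0), (0, 0, 0))
blue_only = weighted_log_change((0, 0, 100), (0, 0, 0))
unchanged = weighted_log_change((50, 50, 50), (50, 50, 50))
```

Because the channels are combined before the logarithm, a single difference is thresholded per pixel rather than one per channel, and changes in the green channel weigh more heavily than equal changes in the blue channel.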

Finally, M5 mimics the pyDVS emulator (García et al., 2016), which encodes pixel intensities using Gamma functions, instead of absolute or log values.

While all methods could be run interchangeably, it was necessary to adjust some θ thresholds.

Firstly, we visually compared events and SNN responses. Three pairs of consecutive images that were captured during a scenario R3 trial were used to generate event images using each method. These were then input to the SNN, from which we recorded the spike trains output in a period of T_sim = 40 ms. Appendix F contains the event images (Figure 30) and plots that depict the resulting spike trains and output neuron spike counts across T_sim (Figure 31).

We observed subtle variations in the emulated event data. The blurring applied in M2 leads to similar images but additionally eliminates seemingly spurious events. The event distributions appear notably different in M3; for instance, darker regions produce significantly more events than brighter regions. This is likely due to the higher sensitivity of the log-based measure to darker (lower intensity) colors. The other log-based method, M4, showed smaller differences between dark and bright regions, and its event data was denser in the most relevant regions (containing obstacle motion). Results from M5 look fairly similar to M2’s, except for a slight reduction in apparent noise. Similar to the first two methods, M5 reacts strongly to reflective surfaces, such as one near the center of the frame, which produce high and unstable intensities.

The SNN responses largely matched the input events. For example, M1-M4 caused significant spike counts for the first sample in the first few neurons (by index); these neurons correspond to upper regions of the image, which contained strong event activity. Although the blurring in M2 produced less noisy events than M1, both resulted in very similar spike distributions. A likely reason is that the SNN’s dynamics filter out spurious inputs anyway, which indicates that the SNN could obviate the need for such a denoising operation. Generally, the SNN’s responses were similar across methods (except for M5), showing a degree of robustness to variations in events (see in Figure 31, for example, the spike trains concentrated in the top and bottom regions of the plots for sample 3 across methods). The ultimate obstacle avoidance response would also be similar due to our FST decoding strategy: we consider neurons that fire before t_act (depicted as a yellow line in the plots) active and indicative of obstacle points, and neurons that spike first are generally around similar positions across methods.
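The first-spike-time (FST) criterion described above can be sketched as follows. The sketch assumes that spike trains are available as lists of spike times per output neuron and that t_act is given as an absolute time in milliseconds; the spike times are hypothetical values chosen for illustration, not recorded data.

```python
from typing import Dict, List


def decode_active_neurons(spike_trains: Dict[int, List[float]],
                          t_act: float) -> List[int]:
    """Return indices of neurons whose first spike occurs before t_act,
    ordered by first-spike time (earliest first)."""
    first_spikes = {
        neuron: min(times) for neuron, times in spike_trains.items() if times
    }
    active = [n for n, t in first_spikes.items() if t < t_act]
    return sorted(active, key=lambda n: first_spikes[n])


# Hypothetical spike times (ms) for four output neurons within T_sim = 40 ms.
spike_trains = {0: [3.0, 9.0], 1: [25.0, 31.0], 2: [], 3: [12.0, 13.5]}
active = decode_active_neurons(spike_trains, t_act=15.0)
```

Only the time of each neuron's first spike matters in this scheme, so two emulation methods that produce different spike counts can still yield the same set of active neurons.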

We repeated N_trials = 40 trials in scenario 30 for each method to compare the resultant obstacle avoidance performance through our quantitative metrics, which are plotted in Figure 18. The results were not significantly affected by the emulation method; M5’s success rate of 85% deviated most notably from the mean (94%). This verifies that the resultant behaviours are fairly similar, as we would expect from the similarity in SNN responses.

Figure 18.

Quantitative metric results for testing scenario 30: without SNN feedback and four batches, each with a different event emulation method: M1 (default) to M5.

The spiking responses and task performance we observed indicate that our SNN is robust to differences in the emulated event data, since SNNs are inherently tolerant to some noise or variance. A different SNN output decoding strategy that, for example, considers more than neurons’ first spike times may result in more nuanced and varied responses and behaviours between the EC emulation methods and is thus worth exploring in future work.

7.2. Decoding avoidance behaviour from raw event data

We experimented with decoding trajectory adaptations directly from raw events, instead of spikes output by the SNN in response to events. Additionally, we tested decoding random events, in order to verify that the presented results are indeed due to the information contained in the event data and the subsequent SNN processing.

In our approach, the output first-spike-times define the neural activation map which contains the obstacle points in the output feature space. To decode raw events directly, we resized the event images to match the SNN’s output feature space (using bilinear interpolation) and designated events as the obstacle points. The PF-based decoding procedure that follows is otherwise identical, resulting in the ϕ accelerations that in turn dictate trajectory adaptations.
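A minimal sketch of this substitution is given below: a binary event image is resized with bilinear interpolation and the remaining event locations are treated as obstacle points. The pixel-center coordinate mapping, the cut-off `min_value` applied after interpolation, and the function names are assumptions made for this example.

```python
from typing import List, Tuple


def resize_bilinear(image: List[List[float]], out_h: int, out_w: int) -> List[List[float]]:
    """Resize a single-channel image using bilinear interpolation."""
    in_h, in_w = len(image), len(image[0])
    out = []
    for i in range(out_h):
        # Map the output pixel center back into input coordinates.
        y = min(max((i + 0.5) * in_h / out_h - 0.5, 0.0), in_h - 1.0)
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = min(max((j + 0.5) * in_w / out_w - 0.5, 0.0), in_w - 1.0)
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            top = (1 - wx) * image[y0][x0] + wx * image[y0][x1]
            bottom = (1 - wx) * image[y1][x0] + wx * image[y1][x1]
            row.append((1 - wy) * top + wy * bottom)
        out.append(row)
    return out


def obstacle_points(event_image: List[List[float]], out_h: int, out_w: int,
                    min_value: float = 0.5) -> List[Tuple[int, int]]:
    """Designate resized event locations as obstacle points (row, column)."""
    resized = resize_bilinear(event_image, out_h, out_w)
    return [(r, c) for r in range(out_h) for c in range(out_w)
            if resized[r][c] >= min_value]


# A 4 x 4 event image with events in the top-left quadrant, reduced to 2 x 2.
events = [[1, 1, 0, 0],
          [1, 1, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
points = obstacle_points(events, out_h=2, out_w=2)
```

Unlike the SNN output, this mapping has no temporal dynamics: every event that survives the resizing becomes an obstacle point, regardless of whether it is spurious.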

To quantitatively evaluate effects on task performance, we repeated N_trials = 40 trials in scenario 25, while (i) decoding from raw events and (ii) decoding from random events.

In both cases, the robot failed in all trials and would often oscillate due to the accelerations induced by the raw events and never reach the goal, despite subsequent efforts to re-tune parameters. We illustrate the metric results in Figure 19.

Figure 19.

Quantitative metric results for testing scenario 25: without SNN feedback, with SNN Feedback, and decoding obstacle avoidance behaviour from raw or random events.

The inadequacy of raw event data indicates that the neural dynamics of the SNN are integral for processing that data to achieve successful trajectory adaptations. The similarly negative results obtained from either random events or the raw events further confirm the insufficiency of the event data in our approach and the importance of the SNN.

7.3. Testing effects of SNN weight initializations

In all experiments, we constrained the SNN weights to a set of random values to eliminate the variance in results. To investigate their influences, we analyzed performance with different random weight initializations.

We ran eight batches of N_trials = 60 trials in scenario 31, initializing the SNN with a different set of weights in each.

Figure 20 shows the metrics plot (note that “Seed 1” refers to the batch from the original experiments). The success rate varied around 90.5% with a standard deviation of 3.5%, indicating that the weight values have a non-negligible, but not excessive, effect on performance. Except for occasional outliers, the other metrics’ distributions were similar across weight initializations. No differences between the resultant trajectories were perceptible from visual observations.

Figure 20.

Quantitative metric results for testing scenario 31, repeated with eight random SNN weight initializations.

The limited variation in the results was not significant enough to invalidate previously established conclusions and, in fact, has positive implications for applications of learning. In particular, the observed relevance of the weights motivates searching for values that could improve performance, for example, through optimization and learning algorithms to tune SNN weights, which is an interesting avenue for future work.

7.4. Testing real event camera data

We had utilized event emulation within our pipeline with the hypothesis that our conclusions can be extended to a system equipped with a real EC. As an initial validation step, we integrated and conducted preliminary tests with an EC.

We used a DAVIS346 EC, which contains a dynamic vision sensor (DVS) and active pixel sensor (APS) that enable capturing event and RGB data, respectively. The DAVIS346 has a relatively large pixel size of 18.5 μm² and a resolution of 346 × 260. By comparison, the Omnivision OV5640 sensor mounted on the Kinova Gen3 has a pixel size of 1.4 μm² and a resolution that ranges from 320 × 240 to 2592 × 1944. However, the DAVIS has a significantly higher dynamic range (120 dB vs 68 dB). In addition, the OV5640 consumes ∼700 mW while the DAVIS consumes only 10-30 mW to transmit event data and an additional 140 mW if the APS is active, for a 5 V DC supply¹³. The DAVIS was attached to the top of the end-effector for these tests.
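The power figures quoted above imply the following ratios. The calculation uses only the values stated in the text; the OV5640 figure is approximate, so the results should be read as rough estimates.

```python
OV5640_POWER_MW = 700.0                  # approximate value quoted above
DAVIS_EVENTS_POWER_MW = (10.0, 30.0)     # range for transmitting event data
DAVIS_APS_EXTRA_MW = 140.0               # additional draw when the APS is active

# Ratio of OV5640 to DAVIS power, as (lowest, highest) for events only.
events_only_ratio = tuple(
    OV5640_POWER_MW / p for p in reversed(DAVIS_EVENTS_POWER_MW)
)

# Total DAVIS power and the resulting ratio when the APS is also active.
with_aps_total_mw = tuple(p + DAVIS_APS_EXTRA_MW for p in DAVIS_EVENTS_POWER_MW)
with_aps_ratio = tuple(OV5640_POWER_MW / p for p in reversed(with_aps_total_mw))
```

With events only, the DAVIS draws roughly 23 to 70 times less power than the OV5640. With the APS active, the total rises to 150-170 mW and the advantage shrinks to a factor of roughly 4 to 5.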

Figure 21 depicts the events generated for four frames of a simple hand motion by the emulator and the DAVIS. Evidently, the events are spatio-temporally similar, but the DAVIS is more sensitive to minute motions and produces significantly more salient events. On the other hand, its output is notably noisier, although this noise may be alleviated with careful tuning of the camera’s parameters.

Figure 21.

A comparison of events generated by our emulator and the DAVIS346. Note: RGB images captured by the DAVIS346 were used to generate the emulated event images.

For evaluating the DAVIS’s performance, we repeated executions of scenario R3 trials. The camera parameters, particularly the emission thresholds, were moderately adjusted to induce SNN responses of similar magnitudes to those induced by the emulator. However, the DAVIS produced denser and noisier event data on average, which necessitated diminishing sensitivity to the event data elsewhere in the pipeline. To that end, we modified the values of (i) the limit on instantaneous avoidance accelerations, ϕ_max and (ii) the binary erosion filter size, s_BE. N_trials = 30 trials were executed and compared to the previous trials.
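The two parameters adjusted here can be illustrated with a short sketch. It assumes that s_BE is the side length of a square structuring element applied to a binary event image, and that ϕ_max bounds the magnitude of a two-dimensional acceleration ϕ; whether the limit is applied to the magnitude or per component is not stated above, so the former is an assumption, as are the function names.

```python
import math
from typing import List, Tuple


def binary_erosion(image: List[List[int]], size: int) -> List[List[int]]:
    """Keep a pixel only if every pixel in the size x size window around it
    is set. Pixels outside the image count as unset. Assumes an odd size."""
    height, width = len(image), len(image[0])
    half = size // 2
    return [
        [
            int(all(
                0 <= y + dy < height and 0 <= x + dx < width
                and image[y + dy][x + dx] == 1
                for dy in range(-half, half + 1)
                for dx in range(-half, half + 1)
            ))
            for x in range(width)
        ]
        for y in range(height)
    ]


def limit_acceleration(phi: Tuple[float, float], phi_max: float) -> Tuple[float, float]:
    """Scale phi so that its magnitude does not exceed phi_max."""
    norm = math.hypot(phi[0], phi[1])
    if norm <= phi_max:
        return phi
    scale = phi_max / norm
    return (phi[0] * scale, phi[1] * scale)


# A 3 x 3 block of events plus one isolated (noisy) event in the corner.
events = [[0, 0, 0, 0, 1],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 0, 0, 0, 0]]
eroded = binary_erosion(events, size=3)
limited = limit_acceleration((3.0, 4.0), phi_max=2.5)
```

A larger erosion size removes more isolated events at the cost of shrinking genuine event regions, and a lower ϕ_max bounds the reaction to whatever events remain. Both reduce sensitivity to denser, noisier input.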

Figure 22 illustrates the quantitative results with event emulation and with the DAVIS346. Most significantly, we observed a similarly high success rate with the EC, despite three observed failures, two of which were light touches of the obstacles. Ultimately, the minimal tuning of the DAVIS’s parameters led to similar results and trajectory shapes.

Figure 22.

Quantitative metric results for scenario R3 with results when using the event emulation (obtained during the experiments discussed in Section 6.2) and the real EC.

These tests show the seamless substitution of the emulator with an EC. The similarity in task performance indicates that our experiment conclusions could be extended to a system incorporating a real EC, and provides some validation for using emulation as a substitute in research. Nevertheless, testing in more scenarios may yield more insights, especially since the relatively large set of camera parameters has not been fully explored. The small decrease in task success may be due to a sub-optimal tuning of the DAVIS parameters, in contrast to the emulator’s parameters, which underwent a more rigorous refinement process during our tuning phase.

8. Conclusion

We presented and evaluated a neuromorphic approach to obstacle avoidance on a manipulator with a single onboard camera. Our pipeline transforms visual inputs into corrective obstacle avoidance maneuvers, combining high-level trajectory planning and low-level reactive adjustments using DMPs. We utilized event-based vision and spiking neural networks for neuromorphic sensing and processing capabilities. Simulated and real experiments have demonstrated its success in achieving real-time, online obstacle avoidance across various task scenarios, proving its utility over a non-adaptive baseline. The adaptive trajectories minimally deviated from the baseline and were often predictable, safe, and reasonably smooth.

Further analyses highlighted useful properties of the SNN, such as robustness to variations/noise in event data. The proportionality of computations to input magnitudes, as evidenced by the time analysis, validates two suppositions: the SNN’s tendency to exclusively process relevant information, which can save energy and time, and the homogeneity of EC data and SNNs. Since the relevant information for the problem we address is relative input change, events efficiently distill this information and the SNN selectively performs computations only on those portions of the visual inputs. Additionally, the SNN weight analysis motivated future applications of learning-based optimization. Finally, we verified the compatibility of our implementation with a real EC.

8.1. Limitations

We noted a limited capability of avoiding high-speed obstacle collisions, which could be addressed by increasing sensitivity but at the cost of compromising safety. Therefore, we presently acknowledge a fundamental limitation in reliably reacting to fast obstacles due to physical constraints. Additionally, two factors that seemed to occasionally cause failures were dimmer lighting, which led to lower contrast between foreground and background, and clutter, which could contribute to more background events that saturate the SNN response, possibly overwhelming localized responses at obstacle positions.

We presented a set of well-formulated qualitative evaluation criteria, for which we selected specific quantitative measures to evaluate each. However, since concepts like predictability can be interpreted in different ways, a better approach could aggregate multiple measures, including from user data, to evaluate each criterion.

The method discussed in this paper does not provide a mechanism for distinguishing obstacles from objects that the robot must interact with, since it is intended to solely enable fast, reactive obstacle avoidance (which could be considered a low-level behaviour). The ability to recognize goal-relevant objects could be delegated to a higher-level planning component that adapts the behaviour of the avoidance module according to task-specific knowledge, such as an object to be grasped. In this case, a simple adaptation could be to dampen or cancel the effects of the avoidance module when the input data corresponds to the location of the task object. Therefore, we currently assume that this method is augmented by such planning components.
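One possible form of such an adaptation is sketched below, purely as an illustration of the idea: obstacle points that fall within a given radius of the known task object location are discarded before decoding. The representation of points as (row, column) pairs in the output feature space, the radius, and the function name are all hypothetical and are not part of the presented method.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def suppress_task_object(points: List[Point], task_object: Point,
                         radius: float) -> List[Point]:
    """Drop obstacle points lying within radius of the task object location,
    so that they no longer contribute to avoidance accelerations."""
    return [p for p in points if math.dist(p, task_object) > radius]


# Hypothetical obstacle points; the task object is assumed to be at (5, 5).
points = [(1.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
remaining = suppress_task_object(points, task_object=(5.0, 5.0), radius=1.5)
```

A softer variant could scale the contribution of each point by its distance to the task object instead of removing it outright, which would dampen rather than cancel the avoidance response.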

As mentioned in Section 4.5.1, our obstacle avoidance is currently restricted to motions in two dimensions that align with the camera’s image plane. While this is sufficient for the targeted type of tasks, other situations may require more general motions. Although it is outside the scope of this work, it is worth noting that this could be addressed through a novel method for recovering depth information from distributions and magnitudes of local events in event images or using stereo vision (such as in Zhou et al., 2021b, and Risi et al., 2020), which is often applied in conventional camera systems to achieve some depth perception and is biologically plausible in principle.

While this work focuses on the design, implementation, and systematic evaluation of the novel neuromorphic pipeline itself, with obstacle avoidance being a representative application, we do not experimentally compare to conventional obstacle avoidance methods, which could be useful in validating our method. Such a comparison to existing, non-neuromorphic methods can be challenging due to the drastically different sensor configurations (refer to Section 3.4 for discussions of examples) and approaches to collision avoidance that they apply. In particular, most reviewed methods employ multiple sensors on and around the robot, most commonly RGB-D cameras, compared to our single, biologically-inspired, onboard event camera, which presents a novel paradigm.

8.2. Future work

An interesting extension is to incorporate learning for adjusting the SNN weights and optimizing the hyperparameters of the pipeline. Given the results of our random weights analysis, we expect synaptic weight tuning to have a positive impact on avoidance behaviour. The best method to train an SNN remains an open question, but we could consider STDP or surrogate gradient methods. Other approaches include liquid-state machines (Ponghiran et al., 2019) and applying a simple rule such as linear regression on a representation of output spiking activity (Michaelis et al., 2020).

Although our manual hyperparameter tuning procedure leads to interpretable results, it could limit applicability to different tasks and situations. This could be substituted by a learning or optimization algorithm, such as RL or evolutionary optimization, with a formalized objective function, thereby avoiding the tedium of manual tuning and potentially yielding better parameter values. In addition, these optimization algorithms could be deployed online to adapt and adjust parameters to different tasks and situations.

Given preliminary results with the DAVIS346, a next step is to further explore and utilize the capabilities of the event camera.

We presented additional analyses of particular components in Section 7, which could be extended further with ablation studies. These studies could involve conducting experiments that enable studying and evaluating the neuromorphic components more intensively and potentially drawing more interesting connections to their biological counterparts.

Our qualitative evaluation can be improved by conducting user studies in which executions are evaluated by naïve subjects through questionnaires designed to elucidate the general perception of the robot’s behaviour.

We noted some uncertainty in trajectory outcomes in one of the simulation tasks (Task 3). The slight variations in results are due to the complex interactions and data flow within the pipeline stages. In order to better understand these interactions, we could formulate experiments designed to quantify the effects of input variations, the results of which may explain the observed differences in trajectories.

We have only considered collisions of the end-effector throughout this work, but we could augment our trajectory adaptation approach to incorporate avoidance of collisions with the rest of the robot’s body. For example, one could incorporate information on the robot morphology in the replanning of motion trajectories, which could be achieved more easily in the case of DMPs by representing trajectories in joint space (not end-effector space) or using other motion planners and feeding back obstacle information to ultimately ensure that all parts of the robot body maintain distance from perceived obstacles. In addition, higher-level planning behaviours can be implemented to maintain a memory of perceived objects around the workspace and to execute motions whose purpose is to perceive and “remember” object locations even when they are no longer within the field of view (similar to mapping the environment), particularly since EC data relies on relative motion.

The full extent to which the SNN provides processing speed, power consumption, and other improvements can be studied only when the networks are run on neuromorphic hardware. Naturally, a potential extension is thus to explore the integration of a neuromorphic processor as a step towards a more fully neuromorphic processing approach.

Finally, we have demonstrated our approach in the domain of robot manipulation. In future work, we can apply our implementation to a navigation scenario, for example. The results of experiments in a different domain can provide further validation of the neuromorphic concept we have developed and evaluated in this paper.

Footnotes

Acknowledgements

We gratefully acknowledge the support by the b-it International Center for Information Technology.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Ahmed Abdelrahman

Matias Valdenegro-Toro

Maren Bennewitz

Notes

Appendix A. Experimental scenario variables

This appendix contains visualizations of the tasks and variables (backgrounds, obstacle types, and obstacle colors) that define the scenarios of the simulation experiments.

Appendix B. List of Experiment scenarios

This appendix contains the lists of task scenarios that were used in the simulated and real experiments in Tables 7 and 8.

Table 7.

List of tuning, validation, and testing scenarios.

Scenario ID	Set	Scenario specification
0	Pre-Tuning	Task 1, Empty, Cracker box
1	Tuning	Task 1, Store, Red Box
2	Tuning	Task 4, Store, Brick, Rock, Medium Speed
3	Tuning	Task 3, Empty, Brick
4	Validation	Task 1, Office, Red, Buckyball
5	Validation	Task 1, Office, Yellow-Black, Rock
6	Validation	Task 1, Empty, Brick, Spiky Sphere
7	Validation	Task 4, Empty, Yellow-Black, Box, Medium Speed
8	Validation	Task 4, Office, Yellow-Black, Buckyball, High Speed
9	Validation	Task 4, Store, White, Spiky Sphere, Low Speed
10	Validation	Task 3, Office, White
11	Validation	Task 3, Store, Yellow-Black
12	Testing	Task 1, Empty, Yellow-Black, Spiky Sphere
13	Testing	Task 1, Store, Brick, Buckyball
14	Testing	Task 1, Kitchen, White, Buckyball
15	Testing	Task 1, Empty, Yellow-Black, Box
16	Testing	Task 1, Store, Yellow-Black, Rock
17	Testing	Task 2, Office, Red, Buckyball, High Speed
18	Testing	Task 2, Office, Red, Rock, Medium Speed
19	Testing	Task 2, Store, Brick, Box, Medium Speed
20	Testing	Task 2, Kitchen, Yellow-black, Box, High Speed
21	Testing	Task 2, Empty, Red, Spiky Sphere, Medium Speed
22	Testing	Task 3, Empty, Red
23	Testing	Task 3, Empty, Yellow-Black
24	Testing	Task 3, Kitchen, White
25	Testing	Task 3, Kitchen, Red
26	Testing	Task 3, Store, Red
27	Testing	Task 4, Store, Yellow-Black, Buckyball, Low Speed
28	Testing	Task 4, Store, Red, Buckyball, High Speed
29	Testing	Task 4, Empty, Yellow-Black, Spiky Sphere, Medium Speed
30	Testing	Task 4, Empty, Yellow-Black, Box, Low Speed
31	Testing	Task 4, Empty, Brick, Buckyball, Low Speed

Table 8.

List of real robot experiment task scenarios.

Scenario ID	Set	Scenario specification
R1	Testing (Real)	Task 1, Lab Background 1, Wooden Block
R2	Testing (Real)	Task 2, Lab Background 2, Hand
R3	Testing (Real)	Task 2, Lab Background 2, Metal Bar
R4	Testing (Real)	Task 2, Lab Background 2, Wooden Block

Appendix C. Evaluation metrics and criteria details

This appendix contains more detailed descriptions of the quantitative metrics and qualitative evaluation criteria presented in Section 5.1.3.

Appendix D. Implementation details

The components described in Section 4 were implemented as ROS nodes, using the ROS Noetic Ninjemys distribution and Python 3.8 for development. This section summarizes details of our ROS stack and the augmented Gazebo simulation of the Kinova Gen3.

Appendix E. Pipeline component Parameters

Table 10 lists and describes the main parameters of our neuromorphic pipeline, divided by component.

Table 10.

The parameters of each pipeline component. The (*) indicates a parameter which was not tuned in our experiments.

Component	Symbol	Description
Event Camera Emulator	θ	Event emission threshold
Event Camera Emulator	s_BE	Binary erosion filter structure size
Convolutional SNN	w_c	SNN weight initialization constant
Convolutional SNN	T_sim	SNN simulation time period
Convolutional SNN	v_thresh	SNN potential spiking threshold
Convolutional SNN	T_refrac	SNN refractory spiking period
Convolutional SNN	τ_v	SNN potential decay constant
Convolutional SNN	v_rest (*)	SNN resting potential
Convolutional SNN	v_reset (*)	SNN reset potential
Obstacle Avoidance Component	ϕ_max	Upper limit on ϕ
Obstacle Avoidance Component	n_ϕ	ϕ history horizon
Obstacle Avoidance Component	η	Constant gain (PF; Park)
Obstacle Avoidance Component	C_δ	Gradient constant factor (PF; Park)
Obstacle Avoidance Component	p₀ (*)	Obstacle radius of influence (PF; Park)
Obstacle Avoidance Component	t_act (*)	FST activation threshold factor
Motion Planning and Control	δ_y	Position reaching tolerance
Motion Planning and Control	δ_obs	Obstacle avoidance distance tolerance
Motion Planning and Control	δ_safety,x	Safety strategy distance tolerance¹⁴
Motion Planning and Control	γ_v,x	Safety strategy velocity reduction factor
Motion Planning and Control	γ_a,x	Safety strategy acceleration reduction factor
Motion Planning and Control	$δ_{pos,i}^{+}$	Upper positional limit (dimension i)
Motion Planning and Control	$δ_{pos,i}^{-}$	Lower positional limit (dimension i)
Motion Planning and Control	K_j (*)	Controller gain (j ∈ {P, I, D})
Motion Planning and Control	δ_g (*)	Goal reaching tolerance

Appendix F. Event emulation comparison visualizations

This appendix contains figures that supplement the event emulation strategy comparison presented in Section 7.1. The event images produced by all five methods for the three sample RGB inputs and the resulting SNN responses are shown in Figures 30 and 31, respectively. In the latter, each sub-plot depicts the recorded spike trains (left) and the total count of spikes (right) across T_sim at each output neuron. Note that the shared y-axes are indexed by a row-major re-ordering of the neuron indices, which are originally arranged in a two-dimensional grid, where the first and last indices are in the top-left and bottom-right positions, respectively.

Figure 30.

Event images produced by each emulation strategy described in Section 7.1. Although the event distributions are fairly similar, a few notable differences include M2 (with blurring) producing less noisy events than M1, the log-based methods M3 and M4 producing more events in darker regions, and M1, M2, and M4 reacting more strongly to reflective surfaces.

Figure 31.

SNN responses to input events from each emulation strategy described in Section 7.1 for each sample. The plots show the spike trains (left) and spike counts (right) for each output neuron. The responses largely match the input events shown in Figure 30 (e.g. more spikes in the first few neurons for M1-M4, which correspond to the top regions of the event images). Despite some differences in the input event data, the SNN responses are similar (except for M5), indicating an inherent robustness to variations in events.

Appendix G. Qualitative evaluation visualizations

This appendix contains supplementary figures for the predictability qualitative evaluations.

Appendix H. Results metrics and trajectory plots

This appendix contains quantitative metric results and trajectory plots that are referred to in discussions of the tuning, validation, and testing phases of the simulation experiments in Sections 6.1.2, 6.1.3, and 6.1.4:

• Tuning phase trajectories executed in scenario 1 (parameter sets 1-12): Figure 34

• Real experiment trajectories executed in scenarios R1-R4: Figure 35

• Validation phase metrics in scenarios 4-11 (selected parameter sets): Figure 36

• Testing phase metrics in scenarios 12-31 (selected parameter set): Figure 37

Figure 34.

Trajectories executed in tuning scenario 1: parameter sets 1-12 (discussed in Section 6.1.2). (a) Parameter Set 1. (b) Parameter Set 2. (c) Parameter Set 3. (d) Parameter Set 4. (e) Parameter Set 5. (f) Parameter Set 6. (g) Parameter Set 7. (h) Parameter Set 8. (i) Parameter Set 9. (j) Parameter Set 10. (k) Parameter Set 11. (l) Parameter Set 12.

Figure 35.

Trajectories executed in real robot experiments with SNN feedback in scenarios R1-R4 (discussed in Section 6.2). (a) Scenario R1. (b) Scenario R2. (c) Scenario R3. (d) Scenario R4.

Figure 36.

Quantitative metric results for validation scenarios 4-11: without SNN feedback and with selected parameter sets. (a) Scenario 4. (b) Scenario 5. (c) Scenario 6. (d) Scenario 7. (e) Scenario 8. (f) Scenario 9. (g) Scenario 10. (h) Scenario 11.

Figure 37.

Quantitative metric results without versus with SNN feedback (best parameter set): testing scenarios 12-31. (a) Scenario 12: {T1, Empty, Yellow, Spiky Sphere}. (b) Scenario 13: {T1, Store, Brick, Buckyball}. (c) Scenario 14: {T1, Kitchen, White, Buckyball}. (d) Scenario 15: {T1, Empty, Yellow-Black, Box}. (e) Scenario 16: {T1, Store, Y-B, Rock}. (f) Scenario 17: {T2, Office, Red, Buckyball, High}. (g) Scenario 18: {T2, Office, Red, Rock, Med.}. (h) Scenario 19: {T2, Store, Brick, Box, Med.}. (i) Scenario 20: {T2, Kitchen, Y-B, Box, High}. (j) Scenario 21: {T2, Empty, Red, Spiky Sphere, Med.}. (k) Scenario 22: {T3, Empty, Red}. (l) Scenario 23: {T3, Empty, Y-B}. (m) Scenario 24: {T3, Kitchen, White}. (n) Scenario 25: {T3, Kitchen, Red}. (o) Scenario 26: {T3, Store, Red}. (p) Scenario 27: {T4, Store, Y-B, Buckyball, Low}. (q) Scenario 28: {T4, Store, Red, Buckyball, High}. (r) Scenario 29: {T4, Empty, Y-B, Spiky Sphere, Med.}. (s) Scenario 30: {T4, Empty, Y-B, Box, Low}. (t) Scenario 31: {T4, Empty, Brick, Buckyball, Low}.

References

Aguilar, Casaliglla, Pólit (2017) Obstacle avoidance based-visual navigation for micro aerial vehicles. Electronics 6(1): 10.

Arakawa, Shiba (2020) Exploration of reinforcement learning for event camera using car-like robots. arXiv preprint arXiv:2004.00801.

Bečanović, Bredenfeld, Plöger (2002a) Reactive robot control using optical analog VLSI sensors. Proceedings 2002 IEEE International Conference on Robotics and Automation (Cat. No. 02CH37292) 2: 1223–1228.

Bečanović, Indiveri, Kobialka, et al. (2002b) Silicon retina sensing guided by Omni-directional vision. In: Proc. Ninth IEEE Conf. on Mechatronics and Machine Vision in Practice (M2VIP), pp. 10–12. IEEE.

Beckert, Pereira, Althoff (2017) Online verification of multiple safety criteria for a robot trajectory. In: 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp. 6454–6461. IEEE.

Bing, Meschede, Huang, et al. (2018a) End to end learning of spiking neural network based on r-stdp for a lane keeping vehicle. In: 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 4725–4732. IEEE.

Bing, Meschede, Röhrbein, et al. (2018b) A survey of robotics control based on learning-inspired spiking neural networks. Frontiers in Neurorobotics 12: 35.

Blouw, Eliasmith (2020) Event-driven signal processing with neuromorphic computing systems. In: ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8534–8538. IEEE.

Borenstein, Koren (1991) The vector field histogram-fast obstacle avoidance for mobile robots. IEEE Transactions on Robotics and Automation 7(3): 278–288.

10.

Bouvier

Valentian

Mesquida

, et al. (2019) Spiking neural networks hardware implementations and challenges: a survey. ACM Journal on Emerging Technologies in Computing Systems 15(2): 1–35.

11.

Brandli

Berner

Yang

, et al. (2014) A 240 × 180 130 db 3us latency global shutter spatiotemporal vision sensor. IEEE Journal of Solid-State Circuits 49(10): 2333–2341.

12.

Brock

Khatib

(2002) Elastic strips: a framework for motion generation in human environments. The International Journal of Robotics Research 21(12): 1031–1052.

13.

Ceolini

Frenkel

Shrestha

, et al. (2020) Hand-gesture recognition based on emg and event-based camera sensor fusion: a benchmark in neuromorphic computing. Frontiers in Neuroscience 14: 637.

14.

Chen

Liu

Goel

, et al. (2019) Fast retinomorphic event-driven representations for video gameplay and action recognition. IEEE Transactions on Computational Imaging 6: 276–290.

15.

Chen

Cao

Conradt

, et al. (2020a) Event-based neuromorphic vision for autonomous driving: a paradigm shift for bio-inspired visual sensing and perception. IEEE Signal Processing Magazine 37(4): 34–49.

Chen, Hwu, Kashyap, et al. (2020b) Neurorobots as a means toward neuroethology and explainable AI. Frontiers in Neurorobotics 14: 570308.

Chiriatti, Palmieri, Scoccia, et al. (2021) Adaptive obstacle avoidance for a class of collaborative robots. Machines 9(6): 113.

Davies, Wild, Orchard, et al. (2021) Advancing neuromorphic computing with Loihi: a survey of results and outlook. Proceedings of the IEEE 109(5): 911–934.

Diehl, Cook (2015) Unsupervised learning of digit recognition using spike-timing-dependent plasticity. Frontiers in Computational Neuroscience 9: 99.

Dietsche, Cioffi, Hidalgo-Carrió, et al. (2021) Powerline tracking with event cameras. In: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 6990–6997. IEEE.

Drubach (2000) The Brain Explained. Upper Saddle River, NJ: Pearson.

Dubeau, Garon, Debaque, et al. (2020) RGB-DE: event camera calibration for fast 6-DOF object tracking. In: 2020 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), pp. 127–135. IEEE.

Dumesnil, Beaulieu, Boukadoum (2016) Robotic implementation of classical and operant conditioning as a single STDP learning process. In: 2016 International Joint Conference on Neural Networks (IJCNN), pp. 5241–5247. IEEE.

Dupeyroux, Hagenaars, Paredes-Vallés, et al. (2021) Neuromorphic control for optic-flow-based landing of MAVs using the Loihi processor. In: 2021 IEEE International Conference on Robotics and Automation (ICRA), pp. 96–102. IEEE.

D’Silva, Miikkulainen (2009) Learning dynamic obstacle avoidance for a robot arm using neuroevolution. Neural Processing Letters 30(1): 59–69.

Escobedo, Strong, West, et al. (2021) Contact anticipation for physical human–robot interaction with robotic manipulators using onboard proximity sensors. In: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 7255–7262. IEEE.

Fairhall, Lewen, Bialek, et al. (2001) Efficiency and ambiguity in an adaptive neural code. Nature 412(6849): 787–792.

Falanga, Kleber, Scaramuzza (2020) Dynamic obstacle avoidance for quadrotors with event cameras. Science Robotics 5(40): eaaz9712.

Falotico, Vannucci, Ambrosano, et al. (2017) Connecting artificial brains to robots in a comprehensive simulation framework: the neurorobotics platform. Frontiers in Neurorobotics 11: 2.

Feng, Zou, et al. (2020) An obstacle avoidance method for autonomous vehicle in straight road based on expanded circle. In: 2020 Asia-Pacific Conference on Image Processing, Electronics and Computers (IPEC), pp. 43–46. IEEE.

Fox, Burgard, Thrun (1997) The dynamic window approach to collision avoidance. IEEE Robotics and Automation Magazine 4(1): 23–33.

Furber (2016) Large-scale neuromorphic computing systems. Journal of Neural Engineering 13(5): 051001.

Gallego, Delbruck, Orchard, et al. (2022) Event-based vision: a survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 44(1): 154–180. DOI: 10.1109/TPAMI.2020.3008413.

Garain, Basu, Giampaolo, et al. (2021) Detection of COVID-19 from CT scan images: a spiking neural network-based approach. Neural Computing & Applications 33(19): 12591–12604.

García, Camilleri, Liu, et al. (2016) pyDVS: an extensible, real-time dynamic vision sensor emulator using off-the-shelf hardware. In: 2016 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1–7. IEEE.

Gerstner (1995) Time structure of the activity in neural network models. Physical Review E 51(1): 738–758.

Göltz, Kriener, Baumbach, et al. (2021) Fast and energy-efficient neuromorphic deep learning with first-spike times. Nature Machine Intelligence 3(9): 823–835.

Goodfellow, Bengio, Courville (2016) Deep Learning. MIT Press.

Hazan, Saunders, Khan, et al. (2018) BindsNET: a machine learning-oriented spiking neural networks library in Python. Frontiers in Neuroinformatics 12.

Heeger (2000) Poisson model of spike generation. Handout, Stanford University 5(1-13): 76.

Hodgkin, Huxley (1952) A quantitative description of membrane current and its application to conduction and excitation in nerve. The Journal of Physiology 117(4): 500–544.

Hu, Liu, Delbruck (2021) v2e: From video frames to realistic DVS events. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1312–1321. IEEE.

Hua, Nan, Lian (2019) Small obstacle avoidance based on RGB-D semantic segmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops. IEEE.

Ijspeert (2008) Central pattern generators for locomotion control in animals and robots: a review. Neural Networks 21(4): 642–653.

Ijspeert, Nakanishi, Hoffmann, et al. (2013) Dynamical movement primitives: learning attractor models for motor behaviors. Neural Computation 25(2): 328–373.

Indiveri (2021) Introducing ‘neuromorphic computing and engineering’. Neuromorphic Computing and Engineering 1(1): 010401.

ISO 10218-1:2011 (2011) Robots and Robotic Devices - Safety Requirements for Industrial Robots - Part 1: Robots. International Organization for Standardization.

Izhikevich (2003) Simple model of spiking neurons. IEEE Transactions on Neural Networks 14(6): 1569–1572.

Jang, Simeone, Gardner, et al. (2019) An introduction to probabilistic spiking neural networks: probabilistic models, learning rules, and applications. IEEE Signal Processing Magazine 36(6): 64–77.

Joubert, Marcireau, Ralph, et al. (2021) Event camera simulator improvements via characterized parameters. Frontiers in Neuroscience 15.

Khatib (1986) Real-time obstacle avoidance for manipulators and mobile robots. In: Autonomous Robot Vehicles, pp. 396–404. Springer.

Kim, Park, et al. (2020) Spiking-YOLO: spiking neural network for energy-efficient object detection. Proceedings of the AAAI Conference on Artificial Intelligence 34: 11270–11277.

Korteling, van de Boer-Visschedijk, Blankendaal, et al. (2021) Human- versus artificial intelligence. Frontiers in Artificial Intelligence 4: 622364.

Lee, Sarwar, Panda, et al. (2020) Enabling spike-based backpropagation for training deep neural network architectures. Frontiers in Neuroscience 14.

Lee, Zhou (2021) Deep learning-based monocular obstacle avoidance for unmanned aerial vehicle navigation in tree plantations. Journal of Intelligent and Robotic Systems 101(1): 5–18.

Lichtsteiner, Posch, Delbruck (2008) A 128 × 128 120 dB 15 µs latency asynchronous temporal contrast vision sensor. IEEE Journal of Solid-State Circuits 43(2): 566–576.

Liu, Zhao, Chen, et al. (2021) SSTDP: supervised spike timing dependent plasticity for efficient spiking neural network training. Frontiers in Neuroscience 15: 756876.

Lobov, Mikhaylov, Shamshin, et al. (2020) Spatial properties of STDP in a self-learning spiking neural network enable controlling a mobile robot. Frontiers in Neuroscience 14: 88.

Maass (1997) Networks of spiking neurons: the third generation of neural network models. Neural Networks 10(9): 1659–1671.

Mahowald (1994) The silicon retina. In: An Analog VLSI System for Stereoscopic Vision, pp. 4–65. Springer.

Maro, Ieng, Benosman (2020) Event-based gesture recognition with dynamic background suppression using smartphone computational capabilities. Frontiers in Neuroscience 14: 275.

Martins, Braga, Ramos, et al. (2018) A computer vision based algorithm for obstacle avoidance. In: Information Technology-New Generations, pp. 569–575. Springer.

Mead (1990) Neuromorphic electronic systems. Proceedings of the IEEE 78(10): 1629–1636.

Michaelis, Lehr, Tetzlaff (2020) Robust trajectory generation for robotic control on the neuromorphic research chip Loihi. Frontiers in Neurorobotics 14: 589532.

Milde, Bertrand, Benosman, et al. (2015) Bioinspired event-driven collision avoidance algorithm based on optic flow. In: 2015 International Conference on Event-Based Control, Communication, and Signal Processing (EBCCSP), pp. 1–7. IEEE.

Milde, Blum, Dietmüller, et al. (2017) Obstacle avoidance and target acquisition for robot navigation using a mixed signal analog/digital neuromorphic processing system. Frontiers in Neurorobotics 11: 28.

Minguez, Lamiraux, Laumond (2016) Motion planning and obstacle avoidance. In: Springer Handbook of Robotics, pp. 1177–1202. Springer.

Mirsadeghi, Shalchian, Kheradpisheh, et al. (2021) STiDi-BP: spike time displacement based error backpropagation in multilayer spiking neural networks. Neurocomputing 427: 131–140.

Mronga, Knobloch, de Gea Fernández, et al. (2020) A constraint-based approach for human–robot collision avoidance. Advanced Robotics 34(5): 265–281.

Neil, Pfeiffer, Liu (2016) Learning to be efficient: algorithms for training low-latency, low-compute deep spiking neural networks. In: Proceedings of the 31st Annual ACM Symposium on Applied Computing, pp. 293–298.

Park, Hoffmann, Pastor, et al. (2008) Movement reproduction and obstacle avoidance with dynamic movement primitives and potential fields. In: Humanoids 2008-8th IEEE-RAS International Conference on Humanoid Robots, pp. 91–98. IEEE.

Pfeiffer, Pfeil (2018) Deep learning with spiking neurons: opportunities and challenges. Frontiers in Neuroscience 12.

Ponghiran, Srinivasan, Roy (2019) Reinforcement learning with low-complexity liquid state machines. Frontiers in Neuroscience 13: 883.

Posch, Matolin, Wohlgenannt (2010) A QVGA 143 dB dynamic range frame-free PWM image sensor with lossless pixel-level video compression and time-domain CDS. IEEE Journal of Solid-State Circuits 46(1): 259–275.

Posch, Serrano-Gotarredona, Linares-Barranco, et al. (2014) Retinomorphic event-based vision sensors: bioinspired cameras with spiking output. Proceedings of the IEEE 102(10): 1470–1484.

Rajendran, Sebastian, Schmuker, et al. (2019) Low-power neuromorphic hardware for signal processing applications: a review of architectural and system-level design approaches. IEEE Signal Processing Magazine 36(6): 97–110.

Rebecq, Gehrig, Scaramuzza (2018) ESIM: an open event camera simulator. In: Conference on Robot Learning, pp. 969–982. PMLR.

Rebecq, Ranftl, Koltun, et al. (2019) High speed and high dynamic range video with an event camera. IEEE Transactions on Pattern Analysis and Machine Intelligence 43(6): 1964–1980.

Risi, Aimar, Donati, et al. (2020) A spike-based neuromorphic architecture of stereo vision. Frontiers in Neurorobotics 14.

Roy, Jaiswal, Panda (2019) Towards spike-based machine intelligence with neuromorphic computing. Nature 575(7784): 607–617.

Safeea, Neto, Bearee (2019) On-line collision avoidance for collaborative robot manipulators by adjusting off-line generated paths: an industrial use case. Robotics and Autonomous Systems 119: 278–288.

Salarpour, Khotanlou (2019) Direction-based similarity measure to trajectory clustering. IET Signal Processing 13(1): 70–76.

Salvatore, Mian, Abidi, et al. (2020) A neuro-inspired approach to intelligent collision avoidance and navigation. In: 2020 AIAA/IEEE 39th Digital Avionics Systems Conference (DASC), pp. 1–9. IEEE.

Sanket, Parameshwara, Singh, et al. (2020) EVDodgeNet: deep dynamic obstacle dodging with event cameras. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 10651–10657. IEEE.

Schaub, Baumgartner, Burschka (2016) Reactive obstacle avoidance for highly maneuverable vehicles based on a two-stage optical flow clustering. IEEE Transactions on Intelligent Transportation Systems 18(8): 2137–2152.

Scoccia, Palmieri, Palpacelli, et al. (2021) A collision avoidance strategy for redundant manipulators in dynamically variable environments: on-line perturbations of off-line generated trajectories. Machines 9(2): 30.

Serre (2019) Deep learning: the good, the bad, and the ugly. Annual Review of Vision Science 5(1): 399–426.

Song, Chang, Chen (2019) 3D vision for object grasp and obstacle avoidance of a collaborative robot. In: 2019 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM), pp. 254–258. IEEE.

Sun, Cioffi, De Visser, et al. (2021) Autonomous quadrotor flight despite rotor failure with onboard vision sensors: frames vs. events. IEEE Robotics and Automation Letters 6(2): 580–587.

Sutton, Barto, et al. (1998) Introduction to Reinforcement Learning, Vol. 135. Cambridge: MIT Press.

Taunyazov, Sng, See, et al. (2020) Event-driven visual-tactile sensing and learning for robots. arXiv preprint arXiv:2009.07083.

Tavanaei, Ghodrati, Kheradpisheh, et al. (2019) Deep learning in spiking neural networks. Neural Networks 111: 47–63.

Thakur, Molin, Cauwenberghs, et al. (2018) Large-scale neuromorphic spiking array processors: a quest to mimic the brain. Frontiers in Neuroscience 12.

Tuckwell, Wan (2005) Time to first spike in stochastic Hodgkin–Huxley systems. Physica A: Statistical Mechanics and Its Applications 351(2-4): 427–438.

Tulbure, Khatib (2020) Closing the loop: real-time perception and control for robust collision avoidance with occluded obstacles. In: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5700–5707. IEEE.

Van Der Smagt, Arbib, Metta (2016) Neurorobotics: from vision to action. In: Springer Handbook of Robotics, pp. 2069–2094. Cham: Springer.

Vitale, Renner, Nauer, et al. (2021) Event-driven vision and control for UAVs on a neuromorphic chip. In: 2021 IEEE International Conference on Robotics and Automation (ICRA), pp. 103–109. IEEE.

Wicaksono (2008) Learning from Nature: Biologically-Inspired Sensors. TU Delft, Delft University of Technology.

Wunderlich, Kungl, Müller, et al. (2019) Demonstrating advantages of neuromorphic computation: a pilot study. Frontiers in Neuroscience 13: 260.

Yasin, Mohamed, Haghbayan, et al. (2020) Night vision obstacle detection and avoidance based on bio-inspired vision sensors. In: 2020 IEEE Sensors, pp. 1–4. IEEE.

Zahra, Tolu, Navarro-Alarcon (2021) Differential mapping spiking neural network for sensor-based robot control. Bioinspiration & Biomimetics 16(3): 036008.

Zheng, Deng, et al. (2020) Going deeper with directly-trained larger spiking neural networks. arXiv preprint arXiv:2011.05280.

Zhou, Wang, et al. (2021a) A spike learning system for event-driven object recognition. arXiv preprint arXiv:2101.08850.

Zhou, Gallego, Shen (2021b) Event-based stereo visual odometry. IEEE Transactions on Robotics 37(5): 1433–1450.