Deep Deterministic Policy Gradient for Navigation of Mobile Robots

Abstract

This article describes the use of Deep Deterministic Policy Gradient (DDPG), a deep reinforcement learning algorithm, for mobile robot navigation. The neural network takes as inputs laser range readings, the angular and linear velocities of the robot, and the position and orientation of the robot with respect to a goal position. The outputs of the network are the angular and linear velocities used as control signals for the robot. The experiments demonstrate that deep reinforcement learning techniques that use continuous actions are efficient for decision-making in a mobile robot. Nevertheless, the design of the reward function is an important factor in the performance of deep reinforcement learning algorithms. To show the performance of the algorithm, we successfully applied the proposed architecture in simulated environments and in experiments with a real robot.
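The input and output structure described in the abstract can be illustrated with a minimal sketch of a deterministic policy (the "actor" in DDPG). This is not the authors' implementation. The number of laser readings, the hidden-layer size, the velocity limits, the encoding of the goal as a distance and an angle, and all function names are assumptions made for illustration only; the abstract does not specify them.

```python
import math
import random

N_LASER = 10  # assumed number of laser range readings (not given in the abstract)


def build_state(laser, lin_vel, ang_vel, goal_dist, goal_angle):
    """Concatenate the inputs named in the abstract into one flat state vector.

    Encoding the goal as (distance, angle) is an assumption of this sketch.
    """
    if len(laser) != N_LASER:
        raise ValueError("expected %d laser readings" % N_LASER)
    return list(laser) + [lin_vel, ang_vel, goal_dist, goal_angle]


def init_layer(n_in, n_out, rng):
    """Create a dense layer with small random weights and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Compute the affine map W x + b for one layer."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def actor(state, hidden, out, max_lin=0.25, max_ang=1.0):
    """Map a state to bounded (linear, angular) velocity commands.

    max_lin and max_ang are hypothetical velocity limits.
    """
    h = [max(0.0, v) for v in dense(state, hidden)]  # ReLU hidden layer
    raw_lin, raw_ang = dense(h, out)
    # Sigmoid (written via tanh for numerical stability) keeps the
    # linear velocity in [0, max_lin]; tanh keeps the angular velocity
    # in [-max_ang, max_ang].
    lin = max_lin * 0.5 * (1.0 + math.tanh(0.5 * raw_lin))
    ang = max_ang * math.tanh(raw_ang)
    return lin, ang


rng = random.Random(0)
state = build_state([1.0] * N_LASER, 0.1, 0.0, 2.0, 0.5)
hidden = init_layer(len(state), 16, rng)
out = init_layer(16, 2, rng)
lin, ang = actor(state, hidden, out)
```

The two output activations differ because, in this sketch, the robot is assumed to move only forward while turning in either direction. In a full DDPG setup, a separate critic network and exploration noise would be added during training; they are omitted here.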

Keywords

Deep Deterministic Policy Gradient, Deep Reinforcement Learning, Navigation for Mobile Robots

