Abstract
Legged robots are expected to exhibit the natural and adaptive maneuvers that animals display on harsh terrain. Reinforcement learning is a compelling route, but current approaches train gait generation and adaptation to varying environmental disturbances concurrently, which complicates reward shaping and damages learning efficiency. To alleviate this problem, we present the Dc-Gait pipeline, which separates rough-terrain locomotion learning into two sequential stages, gait generation and imitative adaptation, interconnected through state-based gait constraints built on a generated gait dataset. First, gait-generation training is guided by well-designed rewards without external interference, producing a gait-specific dataset composed of state transition pairs. Inspired by adversarial imitation learning, these pairs are then generalized through a discriminator network, whose output provides a state-based imitation reward that constrains gaits during adaptation training. This state-based constraint effectively drives the robot to converge rapidly from a disturbed state back to the original gait, significantly improving training efficiency. Extensive experiments on different tasks across various robots demonstrate that the proposed pipeline enables robots to master adaptive, gait-constrained movements on challenging terrains (see Supplemental video).
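The discriminator-based imitation reward described above can be illustrated with a minimal sketch. This is not the paper's implementation; it assumes an AMP-style formulation in which a discriminator is trained to distinguish state transition pairs from the gait dataset ("expert") against pairs produced by the adapting policy, and the imitation reward grows with the discriminator's belief that a transition came from the dataset. The logistic discriminator, dimensions, and toy data below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TransitionDiscriminator:
    """Logistic discriminator over concatenated (s, s') transition pairs.

    Hypothetical stand-in for the pipeline's discriminator network,
    in the spirit of adversarial imitation learning."""

    def __init__(self, state_dim, lr=0.1):
        self.w = np.zeros(2 * state_dim)  # weights for [s, s']
        self.b = 0.0
        self.lr = lr

    def logits(self, pairs):
        return pairs @ self.w + self.b

    def train_step(self, expert_pairs, policy_pairs):
        # Expert pairs labeled 1, policy pairs labeled 0;
        # one gradient step on binary cross-entropy.
        x = np.vstack([expert_pairs, policy_pairs])
        y = np.concatenate([np.ones(len(expert_pairs)),
                            np.zeros(len(policy_pairs))])
        p = sigmoid(self.logits(x))
        self.w -= self.lr * (x.T @ (p - y)) / len(x)
        self.b -= self.lr * np.mean(p - y)

    def imitation_reward(self, pairs, eps=1e-6):
        # AMP-style reward: large when D(s, s') is close to 1,
        # i.e. the transition looks like it came from the gait dataset.
        d = sigmoid(self.logits(pairs))
        return -np.log(np.clip(1.0 - d, eps, 1.0))

# Toy data: "gait" transitions cluster near +1, off-gait ones near -1.
state_dim = 4
expert_pairs = rng.normal(1.0, 0.1, size=(256, 2 * state_dim))
policy_pairs = rng.normal(-1.0, 0.1, size=(256, 2 * state_dim))

disc = TransitionDiscriminator(state_dim)
for _ in range(500):
    disc.train_step(expert_pairs, policy_pairs)

r_expert = disc.imitation_reward(expert_pairs).mean()
r_policy = disc.imitation_reward(policy_pairs).mean()
```

After training, transitions resembling the gait dataset earn a higher imitation reward than off-gait transitions, which is the signal that pulls a disturbed robot back toward its learned gait.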
Supplementary Material
Please find the following supplemental material available below.
