Abstract
In a dynamic and complex career-development environment, conventional career path planning approaches rely on static models and expert knowledge, and therefore struggle to handle environmental uncertainty in multi-stage decision-making. This research proposes a reinforcement learning (RL)-based career path planning model that formulates multi-stage decision optimisation as a Markov decision process (MDP) and applies a deep Q-network (DQN) to adaptively and iteratively optimise strategies. A normalised cumulative-reward comparison experiment and a learning-rate sensitivity analysis show that the model performs best at a medium learning rate (0.01) and reaches its highest cumulative reward between 25,000 and 35,000 training steps. The RL algorithm proposed in this research thus provides both theoretical and technical support for building an effective career planning system.
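To make the MDP-plus-value-learning formulation concrete, the sketch below models career planning as a toy MDP with a handful of career stages and two actions (deepen current skills vs. attempt to advance), trained with tabular Q-learning as a simplified stand-in for the paper's DQN (which replaces the Q-table with a neural network). All stage counts, transition probabilities, and rewards here are illustrative assumptions, not values from the paper; only the medium learning rate (0.01) echoes the abstract.

```python
import numpy as np

# Hypothetical toy career-path MDP: 4 career stages and 2 actions
# (0 = deepen current skills, 1 = attempt to advance a stage).
# Transition probabilities and rewards are illustrative assumptions.
N_STAGES, N_ACTIONS = 4, 2

def step(state, action, rng):
    """One career decision: advancing succeeds with probability 0.7."""
    if action == 1 and rng.random() < 0.7:
        next_state = min(state + 1, N_STAGES - 1)
        reward = 1.0 if next_state > state else 0.0  # reward only real progress
    else:
        next_state = state
        reward = 0.1 if action == 0 else -0.2  # small gain vs. failed switch
    return next_state, reward

def train(episodes=2000, alpha=0.01, gamma=0.95, eps=0.1, seed=0):
    """Tabular Q-learning; a DQN would approximate Q with a network."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((N_STAGES, N_ACTIONS))
    for _ in range(episodes):
        s = 0  # each episode starts at the entry career stage
        for _ in range(20):  # finite planning horizon
            # epsilon-greedy action selection
            a = int(rng.integers(N_ACTIONS)) if rng.random() < eps else int(np.argmax(Q[s]))
            s2, r = step(s, a, rng)
            # temporal-difference update toward the bootstrapped target
            Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
            s = s2
    return Q

Q = train()  # learning rate 0.01, matching the abstract's "medium" setting
```

Under these toy dynamics, the learned policy prefers advancing at lower stages and deepening skills at the top stage, where advancement is no longer possible; the same cumulative-reward logic carries over when the Q-table is replaced by a DQN.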
