Abstract
In a dynamic and complex career-development environment, conventional career path planning approaches rely on static models and expert knowledge, and therefore struggle to handle environmental uncertainty in multi-stage decision-making. This research proposes a reinforcement learning (RL)-based career path planning model that formulates multi-stage decision optimisation as a Markov decision process (MDP) and applies a deep Q-network (DQN) to adaptively and iteratively optimise strategies. A normalised cumulative-reward comparison experiment and a learning-rate sensitivity analysis show that the model performs best at a medium learning rate (0.01) and reaches its highest cumulative reward between 25,000 and 35,000 training steps. The RL algorithm proposed in this research thus provides both theoretical and technical support for building an effective career planning system.
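To make the MDP-plus-value-learning formulation concrete, the sketch below models career planning as a toy MDP with a handful of career stages and two actions (deepen current skills vs. attempt to advance), trained with tabular Q-learning as a simplified stand-in for the paper's DQN (which replaces the Q-table with a neural network). All stage counts, transition probabilities, and rewards here are illustrative assumptions, not values from the paper; only the medium learning rate (0.01) echoes the abstract.

```python
import numpy as np

# Hypothetical toy career-path MDP: 4 career stages and 2 actions
# (0 = deepen current skills, 1 = attempt to advance a stage).
# Transition probabilities and rewards are illustrative assumptions.
N_STAGES, N_ACTIONS = 4, 2

def step(state, action, rng):
    """One career decision: advancing succeeds with probability 0.7."""
    if action == 1 and rng.random() < 0.7:
        next_state = min(state + 1, N_STAGES - 1)
        reward = 1.0 if next_state > state else 0.0  # reward only real progress
    else:
        next_state = state
        reward = 0.1 if action == 0 else -0.2  # small gain vs. failed switch
    return next_state, reward

def train(episodes=2000, alpha=0.01, gamma=0.95, eps=0.1, seed=0):
    """Tabular Q-learning; a DQN would approximate Q with a network."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((N_STAGES, N_ACTIONS))
    for _ in range(episodes):
        s = 0  # each episode starts at the entry career stage
        for _ in range(20):  # finite planning horizon
            # epsilon-greedy action selection
            a = int(rng.integers(N_ACTIONS)) if rng.random() < eps else int(np.argmax(Q[s]))
            s2, r = step(s, a, rng)
            # temporal-difference update toward the bootstrapped target
            Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
            s = s2
    return Q

Q = train()  # learning rate 0.01, matching the abstract's "medium" setting
```

Under these toy dynamics, the learned policy prefers advancing at lower stages and deepening skills at the top stage, where advancement is no longer possible; the same cumulative-reward logic carries over when the Q-table is replaced by a DQN.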
