Abstract
This paper presents an initial attempt to use the Unity ML-Agents toolkit to model the behavior of people evacuating from indoor fires. The virtual environment was created in the Unity game engine and populated with humanoid agents capable of moving autonomously within the scene. Each agent perceives information from the rendered environment, such as surfaces, directions, and line-of-sight depth, and uses it to navigate toward the nearest exit. Agents were trained through reinforcement learning, using the Proximal Policy Optimization (PPO) algorithm to balance rewards and penalties for their actions. We tested five different reward schemes in single-agent simulations to observe how each affects navigation behavior. Among them, the version referred to as
Keywords
