Abstract
With the rapid development of autonomous driving technology, pedestrian trajectory prediction from a first-person perspective still suffers from insufficient long-term prediction accuracy and incomplete interaction modeling. To address these problems, this paper proposes the Fusion-Hierarchical Graph Trajectory Network (FHGT-Net). The network first uses a Multi-Layer Perceptron (MLP) to generate prior future trajectories, which are concatenated with the historical trajectories to form fused features. These fused features are then processed by an encoder-decoder structure. The encoder, composed of local sub-graphs and a global graph, extracts features from the fused data: the local sub-graphs process each pedestrian's trajectory through stacked MLP layers with layer normalization and Rectified Linear Unit (ReLU) activations, while the global graph models interactions among pedestrians via a multi-head attention mechanism. Finally, a decoder based on a Long Short-Term Memory (LSTM) network iteratively decodes the encoder output into the most likely future trajectory for each pedestrian. Experimental results show that, on the PIE (Pedestrian Intention Estimation) dataset, the model reduces the Average Displacement Error (ADE) by 9.9% and the Final Displacement Error (FDE) by 7.5% relative to the state of the art, significantly improving accuracy. The model also performs well in real-vehicle experiments. These results provide a reliable solution for first-person pedestrian trajectory prediction, and are of particular significance for prediction in multi-pedestrian interaction scenarios.
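The pipeline described above can be illustrated with a minimal NumPy sketch of the data flow. All dimensions, weight initializations, and variable names here are assumptions for illustration only; a single attention head stands in for the paper's multi-head attention, and a plain tanh recurrence stands in for the LSTM decoder.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w1, b1, w2, b2):
    # Two-layer MLP with ReLU, loosely mirroring the sub-graph blocks.
    h = np.maximum(x @ w1 + b1, 0.0)
    return h @ w2 + b2

# Assumed shapes: N pedestrians, T_obs observed steps, T_pred future steps,
# 2-D (x, y) positions per step.
N, T_obs, T_pred, D = 4, 15, 45, 2
history = rng.normal(size=(N, T_obs * D))          # flattened past trajectories

# 1) MLP generates a prior future trajectory per pedestrian.
w1 = rng.normal(size=(T_obs * D, 64)); b1 = np.zeros(64)
w2 = rng.normal(size=(64, T_pred * D)); b2 = np.zeros(T_pred * D)
prior = mlp(history, w1, b1, w2, b2)               # (N, T_pred * D)

# 2) Concatenate history and prior into fused features.
fused = np.concatenate([history, prior], axis=1)   # (N, (T_obs + T_pred) * D)

# 3) Local sub-graph: per-pedestrian MLP followed by layer normalization.
w3 = rng.normal(size=(fused.shape[1], 64)); b3 = np.zeros(64)
w4 = rng.normal(size=(64, 64)); b4 = np.zeros(64)
local = mlp(fused, w3, b3, w4, b4)
local = (local - local.mean(-1, keepdims=True)) / (local.std(-1, keepdims=True) + 1e-6)

# 4) Global graph: attention across pedestrians (one head here for brevity;
#    the paper uses multi-head attention).
wq, wk, wv = (rng.normal(size=(64, 64)) for _ in range(3))
q, k, v = local @ wq, local @ wk, local @ wv
scores = q @ k.T / np.sqrt(64)
attn = np.exp(scores - scores.max(-1, keepdims=True))
attn /= attn.sum(-1, keepdims=True)
global_feat = attn @ v                             # (N, 64) interaction-aware features

# 5) Decoder: iteratively roll out future positions (a simple tanh recurrence
#    stands in for the paper's LSTM decoder).
w_h = rng.normal(size=(64, 64)) * 0.1
w_out = rng.normal(size=(64, D)) * 0.1
h = global_feat
future = []
for _ in range(T_pred):
    h = np.tanh(h @ w_h)
    future.append(h @ w_out)
future = np.stack(future, axis=1)                  # (N, T_pred, 2) predicted trajectories
print(future.shape)
```

The sketch shows only shapes and data flow: random weights replace trained parameters, so the outputs are not meaningful predictions.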
Understanding how people move on the road is critical for self-driving cars to make safe decisions. This study introduces a new artificial intelligence model called FHGT-Net, designed to predict the walking paths of pedestrians as seen from a car's front camera. Traditional models often struggle to make accurate long-term predictions and to correctly capture how multiple pedestrians influence each other's movements. FHGT-Net addresses these problems through a new fusion-hierarchical design. It first combines the car's observations of past pedestrian movements with early guesses of where they might go next. It then uses a graph network to model both individual movement patterns and group interactions among pedestrians. Finally, a recurrent neural network predicts where each pedestrian is most likely to move in the next few seconds. When tested on a public dataset called PIE (Pedestrian Intention Estimation), FHGT-Net outperformed existing methods, reducing prediction errors by about 10%. It also showed strong performance in real driving tests. In simple terms, this model helps self-driving systems “read” pedestrian behavior more accurately and respond safely in busy environments. The findings bring autonomous driving one step closer to understanding human motion in real-world traffic.
