Abstract
Pedestrian trajectory prediction plays a pivotal role in real-world applications such as autonomous driving, unmanned delivery, and intelligent surveillance. However, existing deep learning approaches still face critical challenges, including mode collapse and the generation of unrealistic trajectories in complex environments. To address these limitations, we propose the Phase Fusion Network (PFNet), a novel trajectory prediction framework designed to improve prediction accuracy in intricate digital media scenarios. PFNet introduces a Graph Encoder (GE) that incorporates a probabilistic modeling strategy to better capture spatial features and pedestrian interactions. To mitigate mode collapse, a common limitation of GAN-based methods, PFNet employs a dual-discriminator mechanism that improves both the realism and the diversity of predicted trajectories. Additionally, PFNet adopts a two-phase architecture in which the generation phase strengthens spatial representation and the prediction phase refines temporal consistency. Extensive experiments on standard benchmarks, including ETH, UCY, and the Stanford Drone Dataset, demonstrate that PFNet consistently achieves lower Average Displacement Error (ADE) and Final Displacement Error (FDE) than state-of-the-art methods.
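For readers unfamiliar with the two metrics, the following is a minimal sketch of how ADE and FDE are commonly computed for a single pedestrian. It is not taken from the PFNet implementation: the function name `displacement_errors` and the representation of a trajectory as a list of (x, y) points are assumptions made for illustration, and the abstract does not state which evaluation protocol (for example, how multiple sampled trajectories are scored) the authors use.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def displacement_errors(pred: Sequence[Point],
                        truth: Sequence[Point]) -> Tuple[float, float]:
    """Return (ADE, FDE) for one predicted path against its ground truth.

    Hypothetical helper for illustration; not part of PFNet.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between prediction and ground truth at each time step
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # mean error over all predicted steps
    fde = dists[-1]                # error at the final predicted step only
    return ade, fde


# Example: the prediction stays on the x-axis while the true path drifts upward.
predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
observed = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(predicted, observed)
print(ade, fde)
```

Lower values are better for both metrics. ADE summarizes how closely the whole predicted path tracks the ground truth, while FDE isolates the error at the prediction horizon.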
