Abstract
Group Activity Recognition (GAR) is the task of recognizing the overall activity in a scene containing multiple individuals. Most existing methods have made significant progress by modeling the attributes of individuals and the relations between them. However, these methods remain limited in their ability to automatically detect, recognize, and infer latent connections in group behavior. To address this issue, and inspired by the latent spatial position information present in video frames, we propose a novel method that learns graph structures by incorporating the distances between individuals. Specifically, we design a graph reasoning module based on Graph Convolutional Networks (GCNs) to learn the hierarchical relationship between individual behaviors and group intentions. We evaluate the feasibility and effectiveness of the proposed model on publicly available datasets. The experimental results validate the approach and demonstrate its ability to accurately analyze and interpret group behavior.
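
The idea of building a graph from inter-individual distances and reasoning over it with a graph convolution can be illustrated with a minimal sketch. This is not the model proposed here: the Gaussian distance kernel, the row normalization, the single layer, the max-pooling readout, and all names (`distance_adjacency`, `gcn_layer`, `group_representation`, `sigma`) are assumptions chosen only for illustration, and the weights are fixed by hand rather than learned.

```python
import math


def distance_adjacency(centers, sigma=1.0):
    """Row-normalized adjacency built from pairwise Euclidean distances.

    Assumption: a Gaussian kernel maps distance to edge weight, so nearby
    individuals are connected more strongly than distant ones.
    """
    adjacency = []
    for a in centers:
        row = [math.exp(-math.dist(a, b) ** 2 / (2.0 * sigma ** 2)) for b in centers]
        total = sum(row)  # at least 1.0, because the self-distance is 0
        adjacency.append([w / total for w in row])
    return adjacency


def matmul(left, right):
    """Plain matrix product on lists of lists."""
    return [[sum(l * r for l, r in zip(row, col)) for col in zip(*right)]
            for row in left]


def gcn_layer(adjacency, features, weight):
    """One graph convolution step: aggregate neighbors, project, apply ReLU."""
    aggregated = matmul(adjacency, features)
    projected = matmul(aggregated, weight)
    return [[max(v, 0.0) for v in row] for row in projected]


def group_representation(centers, features, weight, sigma=1.0):
    """Pool refined individual features into one group-level vector.

    Assumption: element-wise max over individuals serves as the readout.
    """
    adjacency = distance_adjacency(centers, sigma)
    refined = gcn_layer(adjacency, features, weight)
    return [max(column) for column in zip(*refined)]


if __name__ == "__main__":
    # Hypothetical toy scene: two people close together, one far away.
    centers = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # per-individual features
    weight = [[1.0, -1.0, 0.5], [0.0, 1.0, 0.5]]      # hand-picked projection
    print(group_representation(centers, features, weight))
```

In this sketch the two nearby individuals exchange most of their features while the distant one is left almost unchanged, which is the intuition behind using spatial distance to shape the graph. A learned model would train `weight` and could learn the graph structure itself rather than fixing the kernel.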
