Abstract
Patterns of gaze play an important role in video analysis and in understanding human behavior. This paper proposes a novel approach to activity recognition from first-person gaze movements. N-gram statistical features are extracted from these movements via a multi-scale temporal representation scheme. Joint classification and segmentation of activities, together with a spatio-temporal contextual learning approach built on confidence values from long-range neighborhoods, further strengthen the proposed method. Experimental results show an 18% improvement over the current baseline.
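To make the n-gram feature idea concrete, the following is a minimal sketch (not the authors' implementation): gaze displacements are quantized into direction symbols, and n-grams of those symbols are counted at several temporal subsampling scales. The function names, the number of direction bins, and the scale set are all illustrative assumptions.

```python
import math
from collections import Counter

def quantize(gaze, bins=8):
    """Map successive gaze displacements to one of `bins` direction symbols.
    `gaze` is a sequence of (x, y) fixation points; bins=8 is an assumption."""
    symbols = []
    for (x0, y0), (x1, y1) in zip(gaze, gaze[1:]):
        ang = math.atan2(y1 - y0, x1 - x0) % (2 * math.pi)
        symbols.append(int(ang / (2 * math.pi) * bins) % bins)
    return symbols

def ngram_features(symbols, n=2, scales=(1, 2, 4)):
    """Count n-grams of direction symbols at several temporal scales.
    Subsampling by each scale gives a coarse-to-fine multi-scale view."""
    feats = Counter()
    for s in scales:
        sub = symbols[::s]          # coarser temporal resolution for larger s
        for i in range(len(sub) - n + 1):
            feats[(s,) + tuple(sub[i:i + n])] += 1  # key: (scale, symbol n-gram)
    return feats
```

The resulting sparse count vector could then be fed to any standard classifier; the paper's joint segmentation and contextual learning stages operate on top of such per-window features.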