Abstract
Patterns of gaze play an important role in video analysis and in understanding human behavior. This paper proposes a novel approach to activity recognition from first-person gaze movements. N-gram statistical features are extracted from these movements via a multi-scale temporal representation scheme. Joint classification and segmentation of activities, together with a spatio-temporal contextual learning approach built on confidence values from long-range neighborhoods, further strengthen the proposed method. Experimental results show an 18% improvement over the current baseline.
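To make the n-gram feature idea concrete, the following is a minimal sketch (not the authors' implementation): gaze displacements are quantized into direction symbols, and n-grams of those symbols are counted at several temporal subsampling scales. The function names, the number of direction bins, and the scale set are all illustrative assumptions.

```python
import math
from collections import Counter

def quantize(gaze, bins=8):
    """Map successive gaze displacements to one of `bins` direction symbols.
    `gaze` is a sequence of (x, y) fixation points; bins=8 is an assumption."""
    symbols = []
    for (x0, y0), (x1, y1) in zip(gaze, gaze[1:]):
        ang = math.atan2(y1 - y0, x1 - x0) % (2 * math.pi)
        symbols.append(int(ang / (2 * math.pi) * bins) % bins)
    return symbols

def ngram_features(symbols, n=2, scales=(1, 2, 4)):
    """Count n-grams of direction symbols at several temporal scales.
    Subsampling by each scale gives a coarse-to-fine multi-scale view."""
    feats = Counter()
    for s in scales:
        sub = symbols[::s]          # coarser temporal resolution for larger s
        for i in range(len(sub) - n + 1):
            feats[(s,) + tuple(sub[i:i + n])] += 1  # key: (scale, symbol n-gram)
    return feats
```

The resulting sparse count vector could then be fed to any standard classifier; the paper's joint segmentation and contextual learning stages operate on top of such per-window features.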