Abstract
Although deep learning networks have a proven ability to perform video analytics in complex environments, increasing attention is being paid to the development of compact networks that facilitate edge processing. This effort has yielded high-performance compressed deep learning networks such as MobileNet, PWCNet and BindsNet. In the work proposed herein, a dual-network configuration is used for human action recognition: MobileNet captures the spatial appearance of the action sequences, and PWCNet extracts the motion vectors. A novel Spiking Neural Network (SNN) based configuration is used as the classifier, with the SNN implementation based on BindsNet. The proposed configuration is experimentally validated on two challenging datasets, HMDB51 and UCF101. The experimental results demonstrate that the proposed work outperforms state-of-the-art techniques, and is comparable to them in a few cases.
