Abstract
In choral performances, recognizing the conductor's gestures is crucial, as the conductor's gestures directly affect the overall performance of the choir. Existing posture recognition methods for choral conductors focus on static states. Such methods struggle to capture spatial and temporal information fully and respond slowly, which limits recognition accuracy and real-time performance. Based on OpenPose, this article developed an Inception-LSTM-TSN hybrid CNN model for posture recognition of choral conductors. First, the pose extraction network VGG19 in the OpenPose algorithm was replaced with MobileNet-V3, and a two-stream CNN for pose recognition was constructed. The Inception network was then introduced to extract spatial information, the LSTM network was used to extract temporal information, and the EfficientNet-B3-optimized TSN was used for segmentation to obtain image information for choral conductor pose recognition. Finally, the results of the two branches of the constructed hybrid two-stream convolutional neural network (H2SCNN) model were fused using the average fusion method to output the choral conductor pose recognition results. The experiments used the publicly available dataset ConductorMotion100 and the self-built dataset CCD for choral conductor pose recognition. Recognition performance on the public dataset ConductorMotion100 was better than on the self-built dataset CCD. The accuracy of the H2SCNN model reached 97.8%, which was 7.1% higher than that of VGG19-based static image processing, and the parameter size was only 32.7 M. The experimental results show that the H2SCNN model, by combining spatial and temporal information, significantly improves the accuracy of pose recognition, achieves good real-time performance, and helps ensure the smooth performance of the choir.
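The average fusion step mentioned above can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: that each branch outputs one score per pose class, that fusion is applied to softmax probabilities rather than raw scores, and that both branches are weighted equally. The function names and the example inputs are hypothetical and are not taken from the H2SCNN implementation.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def average_fusion(branch_a_logits, branch_b_logits):
    """Fuse two branches by averaging their class probabilities.

    Assumption: both branches score the same ordered set of pose classes.
    Returns the fused probabilities and the index of the predicted class.
    """
    if len(branch_a_logits) != len(branch_b_logits):
        raise ValueError("both branches must score the same number of classes")
    probs_a = softmax(branch_a_logits)
    probs_b = softmax(branch_b_logits)
    fused = [(a + b) / 2.0 for a, b in zip(probs_a, probs_b)]
    predicted = max(range(len(fused)), key=fused.__getitem__)
    return fused, predicted


if __name__ == "__main__":
    # Hypothetical scores for three pose classes from the two branches.
    fused, predicted = average_fusion([3.0, 0.0, 0.0], [2.0, 0.0, 0.0])
    print(fused, predicted)
```

Averaging probabilities rather than raw scores keeps the fused output a valid distribution, so a single branch with unusually large scores cannot dominate the decision purely through scale.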
Keywords
