Sage Journals: Discover world-class research

Abstract

In martial arts action recognition, complex poses, occlusions, and dynamic changes can degrade pose estimation accuracy, and traditional methods suffer from inaccurate joint localization and poor recognition of continuous pose transitions. To address this, the study proposes the Dilated Convolution-Attention-Stacked Hourglass Network (MS-DConv-Att-SHN) pose estimation method for martial arts action recognition. First, multiple dilated convolutions and mixed attention are introduced to reduce the computational complexity and the loss of joint information in stacked hourglass networks, enhancing the ability to capture subtle joint displacements. This is crucial for accurately estimating complex poses such as movement, flicker, and virtual-real transformations in martial arts. Second, because martial arts movements are strongly coherent, local feature refinement and channel fusion are used to strengthen the correlation analysis between consecutive action frames, addressing the fragmented recognition of action chains such as “exertion contraction” by traditional methods. A martial arts action recognition system based on the improved MS-DConv-Att-SHN method has been developed to better identify individual movements and capture the intrinsic relationships between movements in routines. This provides key technical support for the digital inheritance, intelligent evaluation, and standardized promotion of martial arts, and aligns the method more closely with the combination of form and spirit that characterizes martial arts movement. The results indicate that the structural improvements to the stacked hourglass network effectively increase its percentage of correct keypoints (PCK) and mean average precision (mAP) on both datasets, with PCK and mAP values exceeding 92% and 85%, respectively.
The average recognition accuracy of pose keypoints in the MS-DConv-Att-SHN model is higher than that of the comparison models, by a margin of over 1.2%. The improved MS-DConv-Att-SHN model achieves recognition accuracy of over 90% for different martial arts movements and shows smaller parameter counts and PCK values than the other comparison models. The method can provide technical support for the automated analysis of martial arts movements, sports training assistance systems, and intelligent martial arts teaching.
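
For readers unfamiliar with the PCK metric reported above, the following is a minimal sketch of how a percentage of correct keypoints is commonly computed: a predicted keypoint counts as correct when its distance to the ground truth is within a fraction of a reference length. The abstract does not state which normalization or threshold the study uses, so the function name `pck`, the `reference_length` and `alpha` parameters, and the sample coordinates are all illustrative assumptions rather than the authors' evaluation code.

```python
import math


def pck(predicted, ground_truth, reference_length, alpha=0.5):
    """Fraction of keypoints whose error is within alpha * reference_length.

    reference_length and alpha are assumptions: common variants normalize by
    head or torso size, and the abstract does not say which one is used.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have the same length")
    if not predicted:
        raise ValueError("at least one keypoint is required")
    threshold = alpha * reference_length
    correct = sum(
        1
        for p, g in zip(predicted, ground_truth)
        if math.dist(p, g) <= threshold
    )
    return correct / len(predicted)


# Hypothetical 2D keypoints (in pixels) for one frame; not data from the study.
truth = [(100.0, 50.0), (120.0, 80.0), (90.0, 140.0), (130.0, 200.0)]
pred = [(102.0, 51.0), (121.0, 79.0), (110.0, 150.0), (131.0, 203.0)]

# With reference_length=20 and alpha=0.5 the threshold is 10 pixels,
# so three of the four predictions count as correct.
score = pck(pred, truth, reference_length=20.0, alpha=0.5)
print(f"PCK = {score:.2f}")
```

In practice the score would be averaged over all frames and people in a dataset, and reported per joint as well as overall.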

Keywords

stacked hourglass network, human posture, martial art, attention module, PCK, mAP, accuracy rate, local features

Martial arts action recognition based on improved method of human pose estimation using stacked hourglass network
