Abstract
Focused on the digital preservation and inheritance of dialects, this study describes the construction of a digitised multimodal dialect corpus and the application of a dialect interactive learning model based on digital twin technology, taking the Hangzhou dialect as a representative case. First, multimodal resources of the Hangzhou dialect were collected and, with the aid of digital techniques, annotated, segmented, transcribed, and synchronised, culminating in the creation of the multimodal dialect corpus. Next, features were extracted using deep-learning-based Natural Language Processing (NLP) methods, facilitating the construction of a Hangzhou dialect lexicon. Building on the annotated corpus, acoustic and language models for the Hangzhou dialect were developed by combining Feedforward Sequential Memory Networks (FSMN) and Long Short-Term Memory (LSTM) networks, laying the groundwork for a Hangzhou dialect speech recognition system. Finally, by integrating digital twin technology, an autonomous dialect inheritance learning model was developed; this model establishes a twin learning space and a learning twin entity founded on auditory, visual, and tactile multimodal information. Using virtual reality technology, a dialect learning ecological model was designed to enhance learner agency, offering diverse learning modalities and personalised content, with the overarching goal of supporting the preservation and inheritance of dialects.
