Abstract
Virtual Reality (VR) transforms second language acquisition by immersing learners in interactive, context-rich environments that foster cognitive engagement and enhance phonemic competence. This research examines the neurocognitive enhancements induced by VR-based English phoneme learning through multimodal bio-signal analysis and advanced machine learning (ML) techniques. A total of 450 English learners participated in immersive VR scenarios targeting challenging English phonemes within authentic conversational tasks. Two datasets were collected: a CSV file containing EEG signals and eye-tracking data, and a corpus of audio recordings. The EEG and eye-tracking data were preprocessed with Z-score normalization to ensure consistency, while the audio data were denoised with a Savitzky–Golay filter, which preserves phonetic information while suppressing environmental noise. The cleaned data were then passed to feature extraction: Independent Component Analysis (ICA) was applied to the EEG and eye-tracking data, and Mel-frequency cepstral coefficients (MFCCs) were extracted from the audio to capture the detailed phonetic features essential for phoneme classification. A feature-level fusion technique integrated the normalized EEG and eye-tracking features with the audio-based MFCCs into a unified, high-dimensional feature space, enabling comprehensive multimodal analysis. A Manta Ray Foraging Optimized Light Gradient-Boosting Machine (MRFO-LGBM) was then introduced to tune the LGBM model, enabling accurate classification of phonemic performance and prediction of neurocognitive load. The proposed method was implemented in Python 3.10.1. Experiments demonstrate that the proposed VR-enhanced cognitive phoneme recognition framework significantly outperforms competing models, achieving accuracy, F1-score, precision, and recall between 95% and 96% in predicting neurocognitive states during immersive language acquisition. This research introduces a novel, scalable VR-based system that integrates bio-signal fusion and intelligent modeling to deliver personalized, measurable improvements in phonemic competence.
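
To make the described pipeline concrete, the sketch below outlines the preprocessing, feature-extraction, and fusion steps in Python, the paper's stated implementation language. The specific libraries (scipy, scikit-learn, librosa, lightgbm), function names, window sizes, component counts, and LGBM hyperparameters are illustrative assumptions not given in the abstract, and the MRFO hyperparameter search is replaced here with fixed values.

```python
# Illustrative sketch of the multimodal pipeline; parameter choices are
# assumptions, and the MRFO search over LGBM hyperparameters is omitted.
import numpy as np
from scipy.signal import savgol_filter
from sklearn.decomposition import FastICA
import librosa
import lightgbm as lgb

def zscore(x, axis=0):
    # Z-score normalization of EEG / eye-tracking channels for consistency.
    return (x - x.mean(axis=axis)) / (x.std(axis=axis) + 1e-8)

def denoise_audio(y, window=101, polyorder=3):
    # Savitzky-Golay smoothing to suppress environmental noise while
    # preserving the phonetic envelope (window/order are assumed values).
    return savgol_filter(y, window, polyorder)

def eeg_eye_features(x, n_components=16, seed=0):
    # ICA on the normalized bio-signals; the estimated source activations
    # are summarized per trial (the summary statistic is an assumption).
    ica = FastICA(n_components=n_components, random_state=seed)
    sources = ica.fit_transform(zscore(x))  # (n_samples, n_components)
    return sources.mean(axis=0)

def audio_features(y, sr, n_mfcc=13):
    # Mean MFCC vector per utterance captures coarse phonetic content.
    mfcc = librosa.feature.mfcc(y=denoise_audio(y), sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

def fuse(bio_vec, mfcc_vec):
    # Feature-level fusion: concatenate both modalities into one
    # high-dimensional vector for the classifier.
    return np.concatenate([bio_vec, mfcc_vec])

# LGBM classifier; in the paper these hyperparameters are tuned by MRFO,
# whereas fixed defaults are used here for illustration.
clf = lgb.LGBMClassifier(n_estimators=300, learning_rate=0.05, num_leaves=31)
# clf.fit(X_fused, y_labels)  # X_fused: stacked fused vectors per trial
```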
