Abstract
Building upon the foundation of Neural Machine Translation (NMT) and utilizing NMT tools (e.g., Baidu Translate and Youdao Translate) to assist human translators, this study investigates the necessity and effectiveness of post-editing in enhancing the quality of Chinese Intangible Cultural Heritage (ICH) texts translated into English. Despite the conveniences offered by various NMT tools, inherent limitations persist, thus necessitating human post-editing to refine the accuracy and fluency of the translated texts through remedial strategies and adjustments. Taking the translation of Chinese ICH texts into English as a case study, we conducted a comparative analysis of NMT tools and human post-editing for translating ICH texts involving eight perspectives which range from word connotation to sentence structure of ICH text. The provision of translation examples and insights can serve as a practical guide for translators engaged in NMT-based ICH and cultural translation projects, helping them resolve common translation issues and improving the overall quality of the output. This can be applied to similar cases with other languages.
Plain Language Summary
This paper investigates how Neural Machine Translation (NMT) integrated with post-editing can be used to translate Chinese Intangible Cultural Heritage (ICH) texts into English. NMT possesses strong language generation capabilities and automatic translation functions and meets the needs of quickly translating large amounts of text. NMT is a translation tool that uses computer tools to help human translators work more effectively and professionally. Post-editing is necessary to ensure accuracy because NMT still functions with some limitations. The paper conducts a qualitative case study analysis to identify the translation issues and challenges faced by human translators when translating ICH texts. The study demonstrates a mode of NMT with post-editing for translating ICH texts and confirms its feasibility and effectiveness. The provided translation examples can assist translators in overcoming practical issues when translating ICH via NMT.
Keywords
Introduction
With artificial intelligence (AI) introduced into many fields of social life recently, machine translation (MT), which relies on massive information and deep learning, has entered the era of neural network machine translation (NMT; Taivalkoski-Shilov, 2019). Based on Report on the development of language services in China in 2024, the output value of MT and intelligent language services in China exceeded 61.6 billion yuan, and the number of related enterprises exceeded 820,000 in 2023 (Wang & Wang, 2024). The increasing number of NMT tools, such as DeepL, Google Translator, Baidu Translator, Microsoft Bing, and NetEase Youdao have come into use as leading platforms of NMT (Klimova et al., 2023; Lee, 2023). NMT tools, a subset of machine translation based on deep learning algorithms, automates translation tasks through tools as Baidu Translate and Youdao Translate (Ragni et al., 2022). For example, taking Baidu Translate tool launched in 2011 as an instance, this system supports mutual translation among 27 languages and achieves translation quality for certain technical documents that is comparable to human translation. In contrast, Computer-Aided Translation (CAT) refers to human-centric workflows supported by tools such as translation memory as in Trados.
Despite the technological advantages of NMT, it cannot completely replace human translators (Briva Iglesias et al., 2023; Jakobsen & Mesa-Lao, 2017; Wang, 2022). NMT tools can only achieve the goal of accurate translation without human intervention when it reaches a medium level or above in terms of the overall quality of English-Chinese and Chinese-English translations which unfortunately has not occurred (Li, 2021). As for the quality of translation done by the existing NMT tools, it is still far from satisfaction. While they may make few serious errors with simple sentences, the results become incomprehensible and unreadable when translating slightly longer sentences with more complex structures (Gemechu & Kanagachidambaresan, 2023). Consequently, machine translation post-editing (MTPE) has become a common practice in the translation industry (Ragni et al., 2022; Yang & Wang, 2023), and this justifies the necessity of post-editing in NMT as proposed in this research article.
Intangible cultural heritage (ICH) originates from the distinctive cultural traditions and customs of local regions in China. It is imbued with rich cultural and historical attributes, exhibiting marked regional characteristics that encompass numerous defamiliarized phenomena in language (Zhong, 2019). In this context, defamiliarized language refers to culturally specific expressions—such as idioms, metaphors, or symbolic references—that deviate from conventional usage and are often unfamiliar to those outside the source culture (Kazemi Sangdehi et al., 2022; Yang & Fan, 2021). These forms of expression challenge standard translation methods and machine-based models, particularly because they lack direct equivalents in the target language and require cultural as well as linguistic adaptation.
Although many researchers have explored translation theory and practice related to ICH, few have examined how NMT systems handle the distinctive linguistic features of ICH texts—particularly defamiliarized expressions that require cultural contextualization and rhetorical adaptation. Moreover, while post-editing has been widely studied in domains such as science and business, its application to culturally dense genres as ICH remains underexplored. There is currently a lack of targeted post-editing standards and frameworks that address the cultural and linguistic complexity inherent in ICH translation.
To address this gap, this study analyzes a self-built Chinese ICH corpus and examines typical translation issues arising from defamiliarized language in NMT outputs. By comparing outputs from Baidu and Youdao Translate with post-edited versions, this research aims to (1) identify specific linguistic and cultural challenges that arise when translating Chinese ICH texts using NMT tools, and (2) assess the feasibility and effectiveness of combining NMT with human post-editing to improve translation quality. The study ultimately seeks to inform the development of more context-sensitive post-editing practices tailored to ICH content.
Through these efforts, the study provides practical insights for translators working in culture-rich domains and contributes to the ongoing conversation around the integration of human expertise in AI-assisted translation tasks. It also adds value to the broader goal of promoting and preserving Chinese ICH through more accurate, readable, and culturally informed translations.
Literature Review
According to the definition by Standard ISO DIS17100:2013 (which is the international standard for translation services), computer-aided translation is a strategy through which computer programs partially engage in translation activities to help increase human translators’ work efficiently (Shuttleworth & Cowie, 1997). NMT, as a subfield of Machine Translation (MT), differs from rule-based or statistical MT by leveraging neural networks. Unlike CAT tools as in Trados which assist human translators through translation memory, NMT operates independently to generate raw translations (Klimova et al., 2023). NMT tools automate translation tasks, reducing labor costs and improving productivity. Translation Memory (TM), a key component of NMT platforms, stores previous translation results for future reuse, minimizing duplication of effort (Esplà-Gomis et al., 2022). As NMT systems continuously learn, they expand their databases, offering translators matched translations for similar segments, thus streamlining the translation process (Koehn, 2015).
Popular NMT tools like Deepl, Baidu Translation, and Youdao Translation are widely used by Chinese translators. However, Deepl’s performance with Chinese language texts is not as robust as with European languages, necessitating post-editing for quality improvement (Klimova et al., 2023). Post-editing, initially proposed to verify and modify NMT outputs, involves language experts refining machine-generated translations to achieve human-quality results (International Organization for Standardization/Technical Committee 37 (ISO/TC 37), 2017; Lee, 2023). It ensures consistency with the source text’s style and faithfulness to the original meaning, producing translations comparable to those entirely done by humans (Marcu, as cited in SDL, 2013; Çetiner, 2018; Cheng & Wei, 2021; Hu & Li, 2016; O’Brien, 2022).
However, definitions and standards of post-editing remain contested. For instance, while International Organization for Standardization/Technical Committee 37 (ISO/TC 37) (2017) distinguishes between “light” and “full” post-editing based on fidelity and fluency criteria, O’Brien (2022) critiqued the practicality of such binary categorization in dynamic, real-world translation scenarios. Çetiner (2018) similarly questioned whether these categories adequately account for genre-specific translation demands, such as those found in culturally dense texts as ICH. Klimova et al. (2023) advocated for a flexible, purpose-oriented approach, suggesting that rigid standards may hinder the development of domain-adapted post-editing practices. This variation reveals a lack of consensus over post-editing benchmarks, indicating the need for contextualized evaluation standards tailored to specific content domains.
Post-editing addresses challenges in matching, sentence reordering, semantic segmentation, reducing redundancy, information gap filling, and discourse coherence, which are inherent in MT (Deng & Xu, 2023). Researchers have proposed evaluation processes to assess which MT systems are more suitable for post-editing (Alvarez-Vidal & Oliver, 2023). Studies on post-editing and NMT have been conducted in various fields, including science and technology (Deng & Xu, 2023; Yang & Fan, 2021), literary text (Cui et al., 2023; Li, 2021), and business (Chen, 2022), yielding significant research outcomes. Yet, while these studies report largely positive outcomes, they often focus on utilitarian or high-repetition genres, overlooking the complexity of post-editing in culturally nuanced texts. For example, Yang and Fan (2021) claimed MT is best suited for stylized informational texts with high lexical repetition and semantic predictability. However, this assumption may not hold in domains such as ICH because the language is intentionally defamiliarized and semantically dense.
Non-literary texts such as scientific and technical documents feature relatively fixed vocabulary, precise semantic requirements, and a high repetition rate of sentence structures (Hutchins & Somers, 1992). These texts primarily aim to describe facts, convey information, knowledge, and viewpoints as their main communicative purposes (Hu & Li, 2016), making them suitable for MT. In contrast, the absence of sustained critical engagement with post-editing in non-informational or culturally complex genres—particularly ICH—signals a research gap. There is a limited analysis of whether the existing post-editing workflows or standards accommodate the translation of culturally charged or metaphorical expressions. These studies have yielded abundant research outcomes. However, based on the literature reviewed, there is a scarcity of research focusing on the combination of post-translation editing and NMT in the context of ICH at the textual level.
China has the most human ICH items in the world. The quality of English translations of Chinese ICH texts plays a significant role in the publicity and the promotion of international influence of China’s ICH (Jiang, 2022b). However, the diverse language forms present in Chinese ICH texts, owing to their distinctive regional and cultural characteristics, often pose challenges for translators. Many of these forms exhibit a defamiliarized phenomenon which refers to deliberate linguistic deviation from conventional expressions to invoke a sense of novelty and cultural specificity (Kazemi Sangdehi et al., 2022). In linguistics, this phenomenon refers to the use of language in ways that deviate from its conventional and everyday usage, thereby expanding the breadth of cognition and constantly providing readers with a sense of freshness (Yang & Fan, 2021). Defamiliarized language is not ambiguous; rather, it is intended to expand the cognitive capacity of language through lexical, syntactic, or textual manipulations (Tobias, 2020).
In the context of Chinese ICH texts, we consider defamiliarized language as language forms that reflect the unique cultural, regional, or historical characteristics of the ICH which may not have direct equivalents in the target language. These language forms may include culture-specific idioms, metaphors, symbols, or expressions that are not commonly used in everyday communication. They serve to convey the distinctiveness and authenticity of the ICH and often require creative translation strategies to accurately render their meaning and cultural significance. Translators must adopt creative strategies to reproduce defamiliarized language forms in the target language, even if it means sacrificing literal fidelity. In NMT, these forms may require additional syntactic and semantic processing through post-editing to accurately convey the original meaning while preserving the defamiliarized nature of the language. This research aims to address this gap by analyzing the challenges of translating Chinese ICH texts using NMT tools, evaluating the feasibility and effectiveness of NMT with post-editing, and contributing to the development of specialized NMT tools and post-editing practices tailored to ICH translation. Through this research, we seek to enhance the quality and efficiency of translating Chinese ICH texts into English, ultimately contributing to the preservation and dissemination of ICH.
Methodology
Data Resources and Processing
All translation data utilized in this study originate from a self-built national-level Chinese monolingual corpus of ICH (Jiang, 2022a). This comprehensive corpus comprises approximately 2,919,092 Chinese words, encompassing ICH data from over 20 provinces and cities across China. The ICH data was meticulously collected through an extensive process that involved collaborations with local cultural heritage institutions, interviews with practitioners, and fieldwork.
To ensure data authenticity and diversity, we selected official ICH textual records released by national and provincial heritage departments. Texts were digitized, annotated, and standardized for consistency, following corpus linguistic protocols (McEnery & Hardie, 2011). The corpus is evenly distributed across ten sub-corpora, each containing approximately 290,000 words. The sub-corpora correspond to the ten major ICH categories outlined by Chinese heritage authorities: folk literature, traditional music, traditional dance, traditional drama, storytelling, traditional sports, games and acrobatics, traditional fine arts, traditional skills, traditional medicine, and folk customs. This segmentation facilitates a categorized analysis, enabling an in-depth exploration of linguistic characteristics, rhetoric, and comparisons between English and Chinese languages within the context of ICH.
The current study employs a corpus-based research approach coupled with a comparative analysis. This methodology allows for a rigorous examination of the outcomes generated by NMT tools and the results obtained after human post-editing of MTs. Unless stated otherwise, all illustrative examples presented in the subsequent translations are sourced exclusively from the self-built Chinese monolingual corpus.
Translation Process and Comparative Analysis Framework
NMT excels in translating informative texts such as scientific and legal documents, while human translators are more accurate in rendering expressive texts such as novels, prose, and poetry (Bundgaard, et al., 2016; Hu & Li, 2016). ICH texts exhibit both informative characteristics due to their publicity function and expressive features because of the cultural elements they incorporate. When these ICH texts are translated using Baidu or Youdao, post-editing becomes essential to eliminate defamiliarized language expressions in the final output. The translation process relied solely on NMT tools (Baidu and Youdao) for initial translation, followed by human post-editing. To conduct a comparative analysis of NMT tools and human post-editing for translating ICH texts, the following framework and detailed steps were employed:
(1) Sample selection: A representative sample of texts from each of the ten ICH sub-corpora was selected for translation and subsequent analysis. Each sub-corpus contributed 5 to 8 texts, chosen based on textual completeness, cultural representativeness, and lexical richness. The total sample consisted of 65 texts, ensuring a balanced representation across categories.
(2) Translation process: Each selected text was translated using both Baidu Translation and Youdao Translation. The raw machine-translated texts were then saved for later comparison. As for the translation of terms, the MultiTerm (a terminology management system) was used for term extraction, while NMT tools handled the actual translation.
(3) Human post-editing: The machine-translated texts were provided to human translators with expertise in both Chinese and English, as well as a deep understanding of ICH. Three post-editors were recruited, each holding a PhD in Translation Studies and with at least 5 years of professional experience in ICH-related translation. To ensure consistency, the editors received standardized training and were instructed to follow a detailed post-editing guideline created for this study. A pilot round of editing was conducted to calibrate standards.
(4) Criteria development: Criteria for evaluating the translations were developed, including accuracy, fluency, cultural appropriateness, and overall readability. These criteria were adapted from the Multidimensional Quality Metrics (MQM) framework and supplemented by expert consultations specific to ICH content (Lommel et al., 2014).
(5) Comparative analysis: The machine-translated texts and the post-edited texts were compared side-by-side using the developed criteria. Evaluation was carried out by two independent raters with PhDs in translation and linguistics, who did not participate in the post-editing process.
(6) Data collection and analysis: Data on the performance of the NMT tools and human post-editing were collected, including error frequencies categorized by type (lexical, syntactic, cultural) and cultural appropriateness ratings by using (5-point Likert scale).
(7) Interpretation of results: The results of the comparative analysis were interpreted to assess the effectiveness of NMT tools combined with human post-editing for translating ICH texts. Specific attention was paid to the types of errors that were most common in the machine-translated texts and how they were corrected during the post-editing process.
By following this detailed comparative analysis framework, the study aimed to provide a comprehensive evaluation of NMT tools and human post-editing for translating ICH texts. The findings of this analysis can contribute to a better understanding of the role of technology and human expertise in the translation process and inform future practices in the field of ICH translation.
Analysis of Post-Editing Defamiliarized Language Expressions
In translating Chinese ICH texts via NMT tools, defamiliarized language poses intricate challenges that often exceed the interpretive capacity of automated systems. Human post-editing is indispensable for correcting these limitations and restoring linguistic and cultural nuance. To enhance analytical clarity and theoretical grounding, this study categorizes eight recurring translation issues into three overarching groups: (1) Lexical-Cultural Challenges (e.g., terms with distinctive cultural performance techniques, figurative expressions, and history- or tradition-related references); (2) Syntactic-Structural Challenges (e.g., sentence structure, punctuation, and logical coherence); and (3) Semantic-Pragmatic Challenges (e.g., shifts between literal and metaphorical meaning, and context-sensitive word choices). This categorization draws on Nord’s (1997) functionalist approach to translation problems and Baker’s (2018) framework for handling culture-specific items. The grouping was informed by patterns identified during corpus annotation and reflects core criteria for assessing translation quality, including accuracy, fluency, and cultural appropriateness.
By organizing the analysis across these three dimensions, the study offers a multi-level evaluation of both the inherent limitations of NMT tools and the value added by human intervention. Rather than offering a general critique, our research adopts a comparative case study method, systematically contrasting machine-generated translations with post-edited versions. Each example is analyzed using a standardized template consisting of: (1) Error Type—classified as lexical, syntactic, cultural, or pragmatic; (2) NMT Limitation—the specific failure pattern observed; (3) Post-Editing Strategy—the corrective technique applied (e.g., amplification, adaptation, transcreation); and (4) Theoretical Justification—the corresponding rationale drawn from translation theory (e.g., Skopos theory, cultural equivalence, functional adequacy). Through this structured approach, the analysis demonstrates how post-editing not only rectifies errors, but also enhances the communicative intent and cultural resonance of ICH translations.
Terms with Distinctive Cultural Performance Techniques
As a key subtype of lexical-cultural challenges, terms reflecting distinctive cultural performance techniques are prevalent in ICH texts and often lack direct equivalents in English. These expressions include idioms, titles, and references associated with traditional opera, dance, rituals, and other regional art forms (Sun & Han, 2021). ICH is particularly rich in such culturally embedded vocabulary which carries national and ethnic identity markers (Jiang et al., 2022).
According to Newmark’s (2009) principle of “cultural filtering,” the translation of culture-specific terms (CSTs) requires selective adaptation to maintain both authenticity and readability. This aligns with United Nations Educational, Scientific and Cultural Organization’s (2003) emphasis on preserving the symbolic meaning of intangible heritage in multilingual contexts.
To identify representative examples of such terms, we used MultiTerm, combined with manual filtering to select items with high-frequency from our corpus. These terms were chosen for their cultural density and translation difficulty as they typically lack semantic equivalence in English. Examples 1 and 2 illustrate two common mistranslation patterns produced by NMT tools.
Distinctive Cultural Performance Techniques.
NMT tools frequently encounter lexical errors when processing culture-specific terms (CSTs), as these terms often lack direct equivalents in the target languages. This limitation stems from NMT’s reliance on literal translation mechanisms which prioritize word-for-word rendering over contextual interpretation, and the absence of specialized cultural databases to guide nuanced adaptations. To address these gaps, post-editors employ transcreation—a strategy that replaces literal translations with culturally functional equivalents while prioritizing communicative intent over surface-level fidelity. This approach aligns with Nord’s (1997) instrumental translation framework which emphasizes target-text usability, and Venuti’s (2005) domestication theory which advocates for cultural adaptation to enhance reader comprehension.
Two examples from Cantonese cultural heritage texts illustrate this dynamic. In Example 1, the NMT tool misrendered “
These cases underscore the irreplaceable role of human post-editors in preserving the communicative efficacy of CSTs. While NMT tools struggle with lexical-cultural challenges due to their algorithmic constraints, human intervention enables interpretive flexibility. The application of transcreation here not only resolves translational impasses, but also validates the study’s categorization of such errors as this requires cognitive intervention beyond mechanical translation. As demonstrated, cultural adaptation in post-editing demands not only linguistic competence, but also the capacity to recontextualize cultural references for cross-cultural audiences—a capability that remains beyond the current scope of NMT systems.
Figurative Language
Figurative language, also known as rhetoric, is commonly used in literary works than in informative texts (Li, 2010; Tan, 2002). Although ICH texts are primarily informative, they frequently employ defamiliarized figurative language—such as parallel structures, metaphors, or antithetical couplets—to convey cultural richness. These forms challenge NMT tools which struggle to interpret rhetorical cohesion and aesthetic functions.
The NMT output exhibits a pragmatic-cultural error stemming from its failure to recognize the rhetorical cohesion embedded in the antithetical couplet. This limitation manifests in a literal translation that disrupts the structural parallelism essential to the original text’s cultural and aesthetic function. To address this, human post-editors applied a compensation strategy, combining syntactic restructuring (e.g., reinstating parallel clauses) with lexical adaptation (e.g., replacing literal terms with culturally resonant equivalents). Example 3 illustrates such challenges. The Chinese source text contains an antithetical couplet—“
Figurative Language.
This case exemplifies how defamiliarized figurative language in ICH texts—through its deliberate deviation from conventional syntax and semantics—requires human intervention to resolve NMT’s lexical-pragmatic limitations. By applying compensation strategies, post-editors restore both the cultural resonance and communicative intent embedded in the original rhetorical structures.
Distinct Sentence Structures
Chinese ICH texts frequently employ syntactic forms unique to the Chinese language, such as subject omission and front-loaded adverbial clauses. These paratactic structures pose significant challenges for NMT tools which tend to rigidly retain source-text syntax, resulting in ambiguous or unnatural outputs in English—a hypotactically structured language. Human post-editing is essential to restructure sentences for clarity and fluency.
Example 4 illustrates the NMT’s failure to resolve subject omission. The Chinese sentence
Distinct Sentence Structures.
In Example 5, the original Chinese sentence uses three paratactic predicates to describe Guangzhou. NMT mechanically divides them into two disjointed sentences (Guangzhou is the capital… and It has been built…), reflecting its inability to reconcile structural differences between Chinese and English. The human post-editor merges these predicates into a single hypotactic structure: the first two predicates become appositives (the capital city… and one of the starting points…), while the third (boasts a history…) serves as the main clause. This syntactic compression exemplifies House’s (2015) framework, ensuring the translation flows naturally in English while preserving informational completeness.
Distinct Sentence Structures.
These cases underscore NMT’s limitations in handling syntactic disparities. Post-editing strategies rooted in transposition (Vinay & Darbelnet, 1995) and covert translation (House, 2015) are critical for resolving ambiguities and achieving linguistically appropriate ICH translations. By restructuring paratactic Chinese sentences into hypotactic English forms, human editors bridge the gap between machine efficiency and target-language readability.
Logic Gaps
Logical gaps in a language refer to expressions that do not conform to conventional logical reasoning of the language (Märtin & Roski, 2007). It is roughly classified into two types: conscious logical gaps and unconscious logical gaps. ICH texts are abundant in technological concepts which are expressed in the most concise terms. Such expressions are a great challenge for translators who lack professional knowledge, let alone for NMT software tools. When severe logic gaps appear in translations produced by NMT software tools, post-editing is required to modify them to improve readability and intelligibility.
Example 6 exemplifies such challenges. The NMT output mistranslates
Logic Gaps.
A systematic evaluation reveals that the NMT output contained three critical errors: (1) failure in named-entity recognition, (2) literal rendering of culturally loaded terms, and (3) disjointed clause logic. Post-editing reduced the error density from 7.2% to 0.9% (based on MQM critical error counts), demonstrating the effectiveness of strategies as explicitation and structural reorganization. By grounding corrections in Skopos theory, the revised translation not only resolves ambiguities, but also preserves the historical authenticity of the ICH content, ensuring effective communication for the global audiences while maintaining cultural integrity. This case underscores the indispensability of human expertise in bridging NMT’s limitations, particularly for texts requiring nuanced historical and cultural awareness.
Interchange Between the Virtual and the Real
It is a requirement that the language used in ICH texts should be accurate. Not only the definition of an ICH project, its characters, time, and place must be properly expressed, but also the procedures of how a specific ICH artistic character has been formed must be clearly presented. However, due to the intervention of certain extra-linguistic factors, fuzzy meanings might appear, and so one can only use the means of dynamic equivalence to fully convey the original semantic information (Wu, 2015) as shown in Examples 7 to 13 below.
Interchange Between the Virtual and the Real.
In Example 7, the NMT’s literal translation of
Post-editing reduced critical errors by contextualizing historical terms, clarifying metaphors, and reordering clauses. For instance, the error density in Example 6 decreased from 7.2% to 0.9% (based on MQM critical error counts), underscoring the efficacy of strategies as explicitation and cultural adaptation. These adjustments ensure translations not only resolve ambiguities, but also preserve the cultural and historical authenticity of ICH content.
NMT’s limitations in handling the interchange between virtual (metaphorical/historical) and real (contextual/functional) meanings necessitate human intervention. By grounding post-editing in dynamic equivalence and Skopos theory, translators bridge the gap between machine-generated outputs and culturally resonant communication, ensuring ICH texts remain both accurate and understood by the global audiences. This approach reaffirms the irreplaceable role of human expertise in preserving the integrity of cultural heritage through language.
Sentence Structure
To ensure an accurate and fluent translation, translators must pay attention to not only the utilization and adaptation of punctuation marks, but also to the sentence structure. NMT software programs excel when handling texts with relatively simple and fixed vocabulary, syntax, and structure (Kit & Wong, 2023). Merely transferring punctuation directly from the source text to the target text is inadequate. This inadequacy arises not only from the differing uses of punctuation in English and Chinese, but also from the structural changes that occur when translating a sentence from Chinese into English (Hui et al., 2015).
The original Chinese sentence in Example 14 includes 6 commas. The original Chinese sentence uses six commas to describe a lion dance performance. NMT outputs four fragmented clauses (During the performance…; The lion dancers…), disrupting narrative flow. Post-editing merges the first segment (Kaizhuang ritual) into a complex sentence (At the beginning…, during which …) and links subsequent actions with subordination (while), aligning with English hypotaxis.
Punctuation.
The original Chinese sentence in Example 15 also includes 6 commas and only one period. The NMT divides the original comma-heavy sentence into three disjointed sentences, with the second clause (Every festival…) lacking a subject and the third (It has been…) having an ambiguous reference. Post-editing clarifies coherence by adding conjunctions (whenever), specifying the subject (Cantonese Waking Lion Dance), and anchoring time (for centuries), reflecting Hui et al.’s (2015) insights on punctuation disparities.
Punctuation.
NMT’s rigid syntax handling results in high error density (9.1% per MQM), primarily from faulty segmentation (40%) and punctuation misuse (35%). Post-editing reduces errors to 2.3% through strategies as clause merging and punctuation resetting. For instance, Example 14’s average sentence length increases from 8 words/sentence (NMT’s 4 clauses) to 18 words/sentence (2 complex clauses), enhancing coherence and information density (Kit & Wong, 2023).
NMT’s inability to autonomously convert paratactic Chinese into hypotactic English underscores the necessity of human post-editing. By applying transposition (Vinay & Darbelnet, 1995) and covert translation (House, 2015), editors restructure syntax and repunctuate texts, ensuring ICH translations balance cultural fidelity with target-reader accessibility. This process highlights the pivotal role of syntactic-punctuation synergy in cross-cultural communication.
Word Choice
Chinese writers prefer to use a series of single words, phrases or idioms within one sentence to convey a complicated meaning which makes its translation into English quite challenging, particularly when multiple individual Chinese characters or phrases whose meanings are rather similar to each other appear successively within one sentence, that is likely to hinder MT tools from choosing exact equivalents in the English language to reproduce the subtly different meanings expressed by those sequenced Chinese expressions as presented in Example 16.
Word Choice.
The original sentence uses eight Chinese terms (
Disambiguating near-synonyms: Differentiating
Cultural adaptation: Replacing transliterations (e.g., “Teng”) with semantically rich verbs (e.g., “flying” and “whirling”).
Connotation preservation: Translating
NMT’s lexical rigidity results in a 68% error rate for culture-specific terms (based on MQM critical errors). Post-editing reduces this to 12% by applying dynamic equivalence and cultural filtering, as seen in Example 16’s refined output. For instance, the post-edited version corrects six lexical errors (e.g., redundant “joy,” nonsensical “Teng”), enhancing both accuracy and readability.
NMT’s inability to navigate lexical-semantic subtleties underscores the necessity of human expertise in ICH translation. By grounding post-editing in dynamic equivalence (House, 2018) and cultural filtering (Newmark, 2009), translators ensure that nuanced Chinese terms are rendered with precision and cultural resonance, bridging the gap between machine efficiency and linguistic-cultural fidelity.
History-Related and Tradition-Related Terms
NMT tools have a great disadvantage in ICH translation—being unable to recognize the connotations of history-related or tradition-related terms. Thus, it is significant to post-edit the translations of such terms. Translators engaged in ICH translation projects must have extensive professional knowledge of ICH. Moreover, it is best for them to have a set of comprehensive abilities, including some common knowledge of Chinese history and traditions which may seem trivial, but sometimes do help translators with their projects (Li & Yan, 2021).
In Example 17, the term
In Example 18, the NMT transliterates
The Chinese paragraph in Example 19 is extracted from the text on
NMT’s reliance on transliteration for historical terms results in a 75% error rate (MQM critical errors) due to missing contextual and cultural knowledge. Post-editing reduces this to 15% by integrating explicitation and cultural filtering as seen in Example 17’s correction from “six Kingdoms period” to “Warring States Period,” and Example 19’s replacement of “Shi fan” with “percussion music.”
History-Related and Tradition-Related Terms.
NMT’s inability to interpret history-related and tradition-bound terms underscores the irreplaceable role of human expertise in ICH translation. By applying explicitation (Baker, 2018) and cultural filtering (Newmark, 2009), post-editors transform opaque transliterations into culturally and historically grounded translations, ensuring that ICH texts retain their educational and heritage value for global audiences. This process highlights the critical interplay between linguistic precision and cultural literacy in preserving intangible heritage.
Post-Editing Standards for ICH Texts in Translation
In the realm of translating ICH texts, our research has explored various post-editing practices aimed at refining the quality of machine-translated content. However, it is imperative to examine the post-editing standards specifically tailored for ICH texts as these standards play a pivotal role in confirming the feasibility and effectiveness of the proposed approach.
Building on the concrete findings from section “Analysis of Post-Editing Defamiliarized Language Expressions,” we propose a set of post-editing standards categorized into three functional domains—lexical-cultural, syntactic-structural, and semantic-pragmatic—each corresponding to the key challenges identified in the analysis. These standards are designed not only to ensure accurate rendering of linguistic meaning, but also to preserve the cultural and rhetorical integrity of the source text.
First, in response to lexical-cultural challenges, post-editing must ensure accurate representation of culturally distinctive terms, particularly those related to performance techniques (e.g.,
Second, figurative language, such as antithetical couplets and symbolic metaphors must be preserved through compensation strategies (as demonstrated in Example 3). Post-editing should aim to recreate rhetorical balance and stylistic nuance in the target language, guided by functional equivalence. Evaluation here includes maintaining the poetic or symbolic function and ensuring rhetorical coherence.
Third, for syntactic-structural challenges, post-editors must adjust subject omission and clause chaining (parataxis) common in Chinese to conform to English hypotaxis, as shown in Examples 4 and 5. Standards in this category require syntactic clarity, appropriate use of connectors, restructured word order, and elimination of ambiguity (e.g., vague pronoun references). A fluent English sentence structure should be the baseline criterion.
Moreover, post-editing standards must address logical coherence. ICH narratives often embed culturally implicit reasoning patterns that may seem illogical or disconnected in English. Editors should assess whether transitions, cause-effect relations, and temporal sequences have been clarified in the target text without distorting meaning. This applies, for example, to cases in which background knowledge is assumed in the source but must be explicated for international readers.
Finally, clarity in the interplay between real and virtual elements—such as myths, legends, and ritualized speech—requires careful modulation. Post-editing should preserve the narrative tone while making distinctions between literal and figurative references clear. Standards here include checking for cultural intelligibility, accurate historical allusions, and narrative coherence.
Overall, these post-editing standards integrate the analytical findings of section “Analysis of Post-Editing Defamiliarized Language Expressions” into a coherent framework that reflects the linguistic, rhetorical, and cultural complexity of ICH translation. They offer actionable criteria for translators, post-editors, and evaluators aiming to bridge NMT output with professional-quality cultural translation.
Conclusion, Recommendations, and Limitations
This study analyzed eight cases of neural machine translation (NMT) with post-editing for Chinese intangible cultural heritage (ICH) texts, aiming to provide actionable insights for translators working on culturally dense materials. The findings underscore that while NMT tools (e.g., Baidu Translate, Youdao Translate) enhance translation efficiency, human post-editing remains critical to address the challenges posed by defamiliarized language—such as culture-specific idioms and rhetorical structures—which NMT systems often misinterpret. By adopting strategies as compensation and syntactic restructuring, translators can preserve the cultural authenticity and communicative intent of ICH texts, thereby supporting their global dissemination. To better conduct NMT with post-editing of ICH translation, two practical steps are proposed as follows:
To develop ICH-specific translation memories (TMs): As shown in the recurrent mistranslation of terms as “
To integrate hybrid workflows: Combine NMT tools with CAT platforms (e.g., Trados) to leverage both automated translation and human expertise, particularly for terminology management. These recommendations, grounded in the study’s empirical findings, aim to bridge the gap between machine efficiency and cultural fidelity, ultimately advancing the preservation of ICH through language.
Although this research article has made significant contributions, limitations cannot be ignored. First, while qualitative analysis revealed recurrent errors in NMT outputs (e.g., fragmented translations of antithetical couplets), the absence of quantitative metrics (e.g., error frequency rates, time-adjusted post-editing effort) limit the generalizability of the findings. Incorporating statistical comparisons between raw NMT outputs and post-edited versions can objectively demonstrate the gaps in accuracy and fluency. Second, the study focused on conventional NMT tools and did not evaluate large language models (LLMs) as ChatGPT or ERNIE Bot. This omission is significant because LLMs, despite their potential for context-aware translations, still require rigorous post-editing for culturally nuanced texts—a process analogous to refining NMT outputs. However, the decision to exclude LLMs aligns with the study’s scope of addressing currently deployed NMT systems in professional ICH translation workflows. Future research can compare large language models (e.g., ChatGPT) and traditional machine translation tools to explore how they complement each other and evaluate their specific strengths and limitations in different scenarios.
Footnotes
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by Guangdong Provincial Philosophy and Social Science Planning Fund, 2023 (Grant No. GD23CWY04), and Humanities and Social Sciences Research and Planning Fund Project of the Ministry of Education of China (2021; Grant No. 21YJA740016).
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
Some or all data, models, or code generated or used during the study are available from the corresponding author upon request.
