Translator Style in Technical Classics: A Machine-Learning approach on English Translations of Tian Gong Kai Wu

Abstract

A translator’s style serves as a vital lens for examining the intricacies of the translation process. This is especially notable in translating technical classics, where a tension arises between the source text’s objective nature and the translator’s subjective stylistic choices. To address this tension, objective methods are required to complement subjective interpretation. A machine-learning approach is uniquely suited to this task, as it can impartially quantify stylistic features to reveal strategic decisions that might be obscured by subjective bias. Based on the self-constructed corpora and machine learning approach, this study investigates stylistic patterns in three complete English translations of the Chinese technical classic Tian Gong Kai Wu by applying the CVMI-RRMFT feature selection algorithm to identify the 15 most discriminative linguistic features across lexical, syntactic, and textual levels. These features delineate each translator’s unique style, revealing consistent preferences in linguistic expression. Specifically, Ren’s translation exhibits a scholarly “thick translation” style, marked by lexically rich and modified language, the longest average sentence length with nested syntactic patterns, and prolific use of square brackets for explanatory annotations. Li’s translation reflects a commitment to technical precision and cultural encoding, evident in numerically exact renderings and specialized terminology, a preference for complex phrasal construction, and the unique insertion of traditional Chinese characters within parentheses to assert cultural identity. By contrast, Wang’s translation pursues diplomatic clarity and reader-oriented accessibility through simplified vocabulary, shorter sentences, reduced syntactic complexity, and an overall streamlined use of punctuation to ensure fluent comprehension. Further analysis indicates that these stylistic features reflect not merely unconscious traces, but strategic choices influenced by the dynamic interplay between the translator’s individual agency and external factors. This study demonstrates the efficacy of computational methods in revealing stylistic variation, thereby advancing the empirical study of translation, and enriching the cross-cultural transmission of Chinese technical classics.

Plain Language Summary

How Machine Learning Reveal the Unique Styles of Translators: A Study of Three English Versions of a Chinese Technical Classic

This study uses machine learning and computational analysis to objectively measure how different translators develop distinct styles, even when translating the same technical work. The researchers examined three complete English translations of Tian Gong Kai Wu, a 17th-century Chinese classic on technology and craftsmanship. By applying a specialized algorithm, they systematically identified the 15 most distinguishing linguistic features across thousands of vocabulary, sentence structure, and textual patterns. The analysis revealed clear stylistic differences: one translation used complex sentences and extensive scholarly annotations, targeting academic readers; another emphasized technical precision and incorporated traditional Chinese characters, reflecting the translator's scientific expertise and cultural identity; the third employed simpler vocabulary and shorter sentences to make the ancient text accessible to a general audience. These patterns show that a translator’s style is not accidental but results from a combination of personal expertise, intended readership, and historical context. By transforming subjective impressions into quantifiable data, this machine-learning approach uncovers what can be called the “stylistic fingerprints” of each translator, offering a new, evidence-based way to understand how technical classics are adapted across languages and cultures.

Keywords

translator style computational stylistics machine learning Chinese technical classics

Introduction

Translator style forms a major subfield within translation studies. Its foundational framework was established by Baker (2000), who pioneered the corpus-based analysis of translator style by characterizing it as a unique “fingerprint”—a concept that echoes Hermans’s (1996) notion of the “translator’s voice” embedded in translated texts. Baker defines translator’s style as the recurring linguistic choices a translator makes across their works, focusing on how a translator maintains stylistic consistency across different translations. In contrast, Malmkjær (2003) highlights the translator’s perception of stylistic features in the source text, as well as their preservation, modification, and re-constitution in translation, thereby underscoring the translator’s conscious engagement with the original. Based on this distinction, research on translator style is generally divided into two standards: target-text-oriented studies, which concentrate exclusively on the translated text, and source-text-oriented studies, which consider both source and target texts (Saldanha, 2011). Aligning with Malmkjær’s source-sensitive perspective, this study adopts a source-text-oriented approach. Such research typically employs parallel corpora to compare how different translators reproduce the stylistic features of the original, focusing on the reproduction of stylistic features across translations (D. F. Li et al., 2018). These comparisons reveal that translational distinctiveness manifests across multiple dimensions, including source text selection, translation strategies and etc. (Hu et al., 2017), collectively shaping a style that operates on linguistic, cultural, and strategic levels.

Corpus-based study of translator style investigates the nature, processes, and phenomena of translation by integrating statistical analysis with theoretical interpretation of authentic texts (Baker, 1993). Such a mixed-methods approach helps reduce the subjectivity inherent in traditional qualitative research, thereby enhancing the comparability of findings across studies (Zhu & Li, 2023). In recent years, scholarly interest has increasingly turned toward applying this methodology to literary translation. For example, Mastropierro (2018) and Wijitsopon (2022) have examined how style is transmitted from source to translated texts using various lenses, such as semantic prosody, lexical cohesion, and local textual functions. Nevertheless, conventional approaches to translator style remain limited in both scope and method. First, they tend to focus on formal linguistic features, offering little insight into the reconstruction of functional literary elements such as theme and narrative; methodologically, these studies rely on basic corpus statistics rather than advanced methods like natural language processing (Liu & Han, 2025). More fundamentally, Translation is an activity conducted within real social contexts, and as its primary agent, the translator is distinctly shaped by the social imprint. Therefore, it is necessary to systematically link linguistic patterns to the translator’s habitus, norms, agency or forms of capital (Bourdieu, 1986; Simeoni, 1998), thereby contextualizing professional translation within the wider discursive and social settings.

The integration of computer technology has established the study of translator style as a distinct and interdisciplinary area. A key driver of this development has been computational linguistics, which has introduced innovative methodological tools to the field. As an interdisciplinary domain connecting mathematics, linguistics, and computer science, computational linguistics builds mathematical models and employs computational programs to automatically analyze translated texts. This approach has thus laid a strong methodological foundation for quantitative research in translation, especially in the study of translator style. For instance, by integrating multiple classifiers, Lynch and Vogel (2018) showed that unigram and bigram features can reliably distinguish among different translators’ styles. Similarly, Zhan and Jiang (2016) successfully identified the translators of different Chinese versions of Pride and Prejudice with a support vector machine model.

Despite the great efforts made by researchers in the scope of literature, research on non-literary translation remains relatively scarce. Particularly, in the translation of technical texts, the translator must navigate a core tension between preserving the rigorous accuracy of the source text and ensuring accessibility for the target audience. An excessive focus on factual accuracy—often manifested through rigid adherence to the source text’s linguistic form—tends to overlooks readers’ comprehension habits, potentially resulting in translations that are awkward or obscure (Luo, 2011). This dilemma points to a central, yet underexplored question in translation studies: whether stylistic patterns are merely unconscious linguistic fingerprints or, sociologically speaking, tangible traces of the translator’s agency—a dynamic negotiation between individual habitus and multifaceted contextual constraints (Bourdieu, 1990). We further contend that such stylistic choices constitute strategic negotiations, shaped not only by the translator’s habitus, but also by institutional patronage. This is especially salient in the translation of a national technical classic, where style also functions as a tool for cultural diplomacy and nation branding, tailored to specific readerships. To examine this interplay, our study employs machine-learning approach to objectively analyze and compare stylistic features across three complete English translations of the Chinese technical classic Tian Gong Kai Wu. By integrating these data-driven findings with qualitative scrutiny, we aim to elucidate how individual translational choices interact with broader textual, cultural, and technological dimensions. The study is guided by the following research questions:

Question 1: Which linguistic features are the most discriminative among the three translations as identified by machine learning methods, and how are these features distributed?

Question 2: How are the distinctive styles of the three translators manifested, based on the quantitatively identified features, at both linguistic and extra-linguistic levels?

Question 3: What are the causes underlying the formation of these distinct translator styles?

This study contributes to the field in three main respects. Methodologically, it introduces the CVMI–RRMFT analytical framework to translator-style research, providing a robust approach to feature selection and comparison; and it shifts attention to non-literary texts, combining the examination of a Chinese technical classics with machine-learning techniques. Theoretically, the study bridges computational stylistics with sociological translation studies by using empirical data to hypothesize about the habitus and agency of translators working on technical classics. Practically, the findings are expected to yield concrete implications for the translation and global dissemination of Chinese technical classics, informing strategies that balance fidelity, clarity, and stylistic consistency.

Materials and Methods

Materials

The Chinese technical classic Tian Gong Kai Wu (《天工开物》), authored by Song Yingxing, was first published in 1637 during the Ming Dynasty. Comprising three volumes of 18 chapters and 123 illustrations, it serves as a comprehensive record of advanced techniques in agriculture, handicrafts, and manufacturing in China prior to the Ming Dynasty. Consequently, it holds a significant place in testament to China’s ancient scientific achievements. The term “Tian Gong” refers to natural forces, while “Kai Wu” signifies humanity ingenuity in developing and innovating in accordance with the laws of nature (Yang, 1991). Song Yingxing elucidated not only the intricacies of ancient Chinese technical skills but also integrated profound philosophical concepts regarding the harmony and coexistence between humanity and nature. Charles Darwin characterized Tian Gong Kai Wu as an “authoritative work,” while the British historian of science, Joseph Needham, lauded it as “an important work on engineering in the early 17th century” (Y. M. Wang et al., 2018). Such accolades have solidified its status as a pivotal scientific contribution on the international stage, making it a crucial resource for understanding the evolution of science and technology in ancient China.

The study of English translations of Chinese technical classics has garnered attention from scholars worldwide in the history of science and technology. To date, 12 English translations of Tian Gong Kai Wu have been identified, three of which are complete, while the remainder are excerpts, abridgments, interpretations, or retranslations (Lin & Wang, 2021). The first complete English translation appeared in 1966, nearly three centuries after the original’s publication, indicating a late start for related research. Additionally, a total of 11 international book reviews focusing on Ren’s translation have been published. Scholars (Bodde, 1967; Chan, 1968) have examined issues such as translation gains and losses, translation history (Zhao, 1997), terminology and sentence translation (Y. M. Wang & Xu, 2018), theory-based translation analysis (Lin & Wang, 2022), and multimodal translation approaches (Yu, 2023). These studies chiefly investigate the interplay among the translations, the translators’ role, and the historical contexts in which they were produced. However, existing methodologies often rely on subjective criteria and lack objective approaches, such as corpus-based studies that employ both quantitative and qualitative analyses.

As illustrated in Table 1, the three complete English versions span a period of 45 years. Influenced by factors such as historical context, publishing practices, identities and motivations of the translators, these versions collectively represent the landscape of Tian Gong Kai Wu English translations across time and space. They offer significant comparative value and research relevance.

Table 1.

The Three Translations of Tian Gong Kai Wu.

Item	Ren’s translation	Li’s translation	Wang’s translation
Title	Tien-kung Kai-wu: Chinese Technology in the Seventeenth Century	Tien-Kung-Kai-Wu: Exploitation of the Work of Nature, Chinese Agriculture and Technology in the XVII Century	Tian Gong Kai Wu
Word count	71,330	71,429 words and 1,538 traditional Chinese characters	65,112
Publisher	The Pennsylvania state university press	China Academy	Guang Dong Education Publishing House
Time (Year)	1966	1980	2011
Translator	E-tu Zen Sun & Shiou-Chuan Sun	Li Ch’iao-ping and other 14 translators	Wang Yijing, Wang Haiyan, Liu Yingchun
Nationality	Chinese American	Taiwan, Republic of China	Mainland China
Translator Identity	E-tu Zen Sun: translator of historical works Shiou-Chuan Sun: researcher of mining and metallurgy	Chinese chemical historian and chemical educator, represented by The Chemical Arts of Old China	scholars of scientific and technical classics

Being published works, they are utilized under fair use for stylistic analysis and limited quotation in this non-commercial, critical-comparative study.

Corpus Construction and Preprocessing

Text Digitization and Cleaning

The source materials were the scanned PDFs of the three published translations. Optical Character Recognition (OCR) was performed using ABBYY FineReader 15 to convert these scans into machine-readable text, followed by manual verification to correct errors.

To ensure the analysis focused on the translators’ core stylistic choices in rendering the main text, all paratextual elements were identified and removed. This included: translators’ prefaces, introductions, catalogue, footnotes, image captions, appendices, and chapter headings. The rationale for excluding paratexts is that they often serve distinct metadiscursive functions and follow different genre conventions. Including them would introduce extraneous variables and obscure the systematic linguistic patterns used in the translation of the technical narrative itself. Focusing solely on the body text ensures a controlled comparison of translational style across the three versions. Three cleaned texts were saved in UTF-8 encoded plain-text files.

To focus the analysis on stylistic choices within the core technical narrative, all paratexts (e.g., prefaces, catalogue, introductions, footnotes, image captions and appendices) were excluded. The cleaned body texts were saved as UTF-8 encoded plain-text files. This approach ensures a controlled and consistent basis required for subsequent feature selection. While paratexts offer valuable insights, they primarily reflect the translator’s or publisher’s external metacommentary and framing intentions, rather than a style linguistically embedded within the narrative itself. As such, a focus on the main text alone enables a deeper and more targeted analysis of the linguistic patterns that characterize each translator’s work.

Text Segmentation

Each cleaned text was segmented into 18 analyzable units based on their chapter structure. We used the natural chapter boundaries rather than shorter, arbitrary fixed-length windows for two primary reasons. First, chapters represent coherent thematic and narrative units within the technical text, preserving the integrity of the translator’s stylistic flow and rhetorical structure. Segmenting by fixed windows would artificially cut sentences and disrupt discourse cohesion, potentially obscuring stylistic patterns that operate over longer spans. Second, this approach aligns with the source text’s original organization, ensuring that comparisons across translations are meaningfully anchored in the same content segments. A total of 54 sampling units were saved in UTF-8 encoded plain-text files.

Final Corpus Composition and Validation

To validate the consistency and cleanliness of the processed corpora, a final check was conducted on all files, confirming the absence of residual OCR artifacts, misplaced paratext, or incorrect segmentation. The processed corpora are available upon request for research replication purposes. The final working corpus comprises 54 text files. Basic statistics are presented in Table 2.

Table 2.

Processed Corpus Statistics.

Corpus	Total tokens	Total types	Standardized type-token ratio (per 1000 words)	Number of samples (Chapters)	Avg. sample length (Tokens)	SD of sample length	Sample length range (Min–Max)
Ren’s translation	71,330	7,117	41.54%	18	3,963	2,169	1,597–10,015
Li’s translation	71,429	7,401	40.76%	18	3,968	2,433	1,737–9,884
Wang’s translation	65,112	6,248	38.92%	18	3,617	1,851	1,580–8,306

Note. Li’s translation is calculated based on the number of English words (excluding Chinese words).

Research Tools

The following corpus tools were used for information retrieval and statistical analysis: the CLAWS POS tagger (C7 tagset) for part-of-speech annotation; AntWordProfiler 2.2.1 and Vocab Level Checker for lexical profiling; and the L2 Syntactic Complexity Analyzer for syntactic measurement. Text exploration and analysis were carried out using AntConc 4.2.0 for qualitative, interactive examination and WordSmith 8.0 for systematic quantitative processing. Computational analysis was performed in Python (v3.8+) with pandas (v1.3+) for data handling, NumPy (v1.21+) for numerical computations, scikit-learn (v1.0+) for machine learning modeling and evaluation, and Matplotlib (v3.4+) for visualization. The custom CVMI-RRMFT feature selection algorithm was implemented within this Python environment.

Research Procedure

Feature Collection

Building upon the insight of translation stylistic studies by Leech and Short (1981) and Lynch and Vogel (2018), as well as features noted by scholars in studies of Tian Gong Kai Wu, this study compiles a set of 37 typical features across lexical, syntactic, and textual levels. The complete feature set is provided in Table 3.

Table 3.

Feature Set for Translation Styles.

Category	Subcategory	Feature	Indicator
Lexical	Macro view	Word Length	Average word length
		Lexical Richness	Standard type/token proportion (per 1,000 words)
		Common Word	Proportion of common words
		Lexical Complexity	Proportion of complex words
		Academic Word	Proportion of academic words
		Lexical Density	Proportion of content words
	Micro view	Noun	Proportion of nouns
		Verb	Proportion of verbs
		Adjective	Proportion of adjectives
		Adverb	Proportion of adverbs
		Number	Proportion of numbers
		Quantifier	Proportion of quantifiers
		Preposition	Proportion of prepositions
		Auxiliary Word	Proportion of auxiliary words
		Interjection	Proportion of interjections
		Abbreviation	Proportion of abbreviations
		Hyphenated Compound Word	Proportion of hyphenated compound words
Syntactic	Sentence length	Sentence Length	Average sentence length
	Sentence pattern	Declarative Sentence	Proportion of declarative sentences
		Exclamatory Sentence	Proportion of exclamatory sentences
		Question Sentence	Proportion of question sentences
	Tense	Past Tense	Proportion of past tense sentences
		Present Tense	Proportion of present tense sentences
		Future Tense	Proportion of future tense sentences
	Voice	Passive Voice	Proportion of passive voice sentences
Textual	Punctuation	Comma	Proportion of commas
		Period	Proportion of periods
		Semicolon	Proportion of semicolons
		Round Bracket	Proportion of round brackets
		Square Bracket	Proportion of square brackets
		Dash	Proportion of dashes
		Colon	Proportion of colons
	Cohesion	Connectives	Proportion of conjunctions
		Referents	Proportion of pronouns
			Proportion of first-person pronouns
			Proportion of second-person pronouns
			Proportion of third-person pronouns

Feature Selection

To objectively identify the most discriminative stylistic features among the three translations, this study employs the CVMI-RRMFT feature selection method (H. F. Xu et al., 2021). This method operates on two principles: first, an ideal discriminative feature should exhibit low variation within the same translator’s text (consistency) but high distinction across different translators’ texts (separability). Second, selected features should contain minimal redundant information. To clarify, CVMI quantifies two key aspects of a linguistic feature: its stability within a translator’s own writing (consistency), and its ability to distinguish one translator from another (separability).

Accordingly, the CVMI-RRMFT process consists of two main stages. In the first stage (CVMI scoring), each of the 37 candidate features is assigned a score. This score, termed the CVMI (Coefficient of Variation and Mutual Information) score, is calculated by a weighted combination of two metrics (see Formula 1 in Appendix A for the full equation):

Coefficient of Variation (CV):

Measures the internal consistency of a feature’s values within the samples of each translator. A lower CV indicates a more stable, habitual use of that linguistic feature.

Mutual Information (MI):

Measures the distinctiveness of a feature’s distribution between different translators. A lower MI indicates the feature’s values are more effective in telling the translators apart.

Features are then ranked by their CVMI scores, with lower scores indicating higher overall discriminative power.

In the second stage (RRMFT redundancy removal), the top-ranked features are further refined to eliminate redundancy. This is achieved by modeling the correlations between features as a “Maximum Feature Tree” and pruning it using a two-neighborhood strategy (detailed steps are provided in Algorithm 1, Appendix A). The final output is a compact, non-redundant set of the most style-indicative features.

Following this procedure, the CVMI-RRMFT algorithm selected the 15 most distinctive features from the initial set of 37. These features, listed in Table 4, form the basis for our subsequent quantitative and qualitative stylistic analysis (Figures 1 and 2).

Table 4.

The 15 Distinctive Features.

Category	Distinctive features
Lexical	Adjective
	Number
	Complex word
	Academic word
	Hyphenated compound word
Syntactic	Average sentence length
Syntactic	Question sentence
Textual	Colon
	Semicolon
	Round bracket
	Square bracket
	Pronoun
	First-person pronoun
	Third-person pronoun
	Future tense sentences

Figure 1.

Comparative performance of selected 15 features.

Figure 2.

CVMI scores of 15 distinctive features.

Experimental Validation

The ten-fold cross-validation was used to evaluate the predictive performance of the classification models. The results are presented in Table 5, Figures 3 and 4.

Table 5.

Results of Experimental Validation.

Classification algorithms	The 37 features			The 15 distinctive features
Classification algorithms	Accuracy	Recall rate	AUC	Accuracy	Recall rate	AUC
Support Vector Machine	94.44%	0.952	0.964	96.00%	0.961	0.962
K-nearest Neighbor	96.00%	0.961	0.962	98.15%	0.981	0.987
Random Forest	98.15%	0.981	0.999	98.15%	0.981	0.999
Naive Bayes	79.63%	0.889	0.825	92.60%	0.955	0.924

Figure 3.

Feature classification in random forest.

Figure 4.

Confusion matrix—53/54 samples correctly classified.

Several classification algorithms were used, including Support Vector Machine (SVM), K-nearest Neighbors, Random Forest, and Naive Bayes. As shown in Table 5, all classifiers using the full set of 37 features achieved satisfactory results, with accuracies exceeding 94%, except for Naive Bayes. Using the 15 selected features significantly improved performance across the three classification metrics. This indicates that classification based on distinctive features leads to highly effective model performance.

Figures 3 and 4 shows the results of the Random Forest classifier based on the 15 selected features. Out of 54 samples, 53 were correctly classified. The confusion matrix below shows that one sample from Li’s translation was misclassified as belonging to Wang’s group. No other misclassifications occurred, underscoring the effectiveness of the selected features. These findings suggest that the 15 selected features are instrumental in distinguishing stylistic nuances among the three translated versions.

Results and Discussion

As a linguistic and social act, translation can be understood through recurrent patterns that reveal a translator’s cultural-ideological positioning and cognitive practice (Baker, 2000). This study employed an integrated qualitative-quantitative approach, which effectively illuminates the nuanced choices and underlying motivations across the three translations. Specifically, the analysis focuses on the 15 most discriminative features selected by the CVMI-RRMFT algorithm, corresponding to three stylistic dimensions—lexical, syntactic, and textual. By tracing the consistent patterns of each translator across these levels, the study demonstrates how style emerges from the interplay among linguistic habitus, strategy, and socio-historical context.

Lexical Features

Adjectives

Adjectives play a crucial yet often overlooked role in technical texts. S. F. Li et al. (1994) emphasized that adjectives serve key functions, such as defining conceptual attributes, delineating usage precisely, making subtle distinctions, and visualizing features. Quantitatively, the proportion of adjectives is comparable in Ren’s (8.09%) and Li’s (7.94%) translations. This similarity suggests that both versions employ adjectives to achieve a nuanced and detailed descriptive style. In contrast, the proportion of adjectives in Wang’s translation is slightly lower (7.59%), reflecting a more concise and direct depiction of facts and phenomena. A concrete example illustrates this stylistic inclination:

Source Text: 凡稻，土脉焦枯，则穗实萧索。勤农粪田，多方以助之。

Ren: The industrious farmer fertilizes his fields and stimulates the [growth of] rice in many ways, for when the soil is lean and depleted the spears of rice become sparse and barren. Commonly used throughout the country as fertilizing agents are: ……

Li: When the soil fertility of a paddy field becomes exhausted, the rice crop, if allowed to grow under such soil conditions, will set seeds sparingly on its panicles. Consequently, diligent farmers take every measure to manure their fields with excrements, in order to improve their soil productivity.

Wang: If the paddy fields are barren, the rice spears and grains will not grow well. The industrious farmers fertilize their fields to stimulate the growth of rice in various ways.

In Ren’s translation, adjectives are employed with notable literary and rhetorical density. Descriptions such as “lean and depleted” and “sparse and barren” create vivid imagery that intensifies the desolate tone of the original. With free translation techiniques, this diction remains evocative, verging on the poetic in its descriptive effect.

Li utilizes adjectives with greater restraint and precision. Words like “exhausted” (for soil fertility) and “sparingly” (for seed setting) function as accurate technical descriptors. With literal translation strategy, the style is factual and expository, prioritizing clear information transfer over stylistic flourish and catering to technical readers.

Wang adopts less adjectival modification. Key concepts are rendered with broad, accessible terms such as “barren” and “not grow well,” effectively conveys the core message but reduces specificity and nuanced imagery found in the original. This reflects a functional translation strategy favoring readability and concision.

Numbers

In natural language, numerical expressions convey vagueness or serve as hyperbole to create an atmospheric effect in literary contexts (Jiang & Li, 1994). In technical translation, however, precision and rigor are paramount. Accurate descriptions supported by specific data are essential for conveying measurements of time, distance, quantity, and other critical parameters. The proportion of numerical expressions is 2.18% in Ren’s version, 2.55% in Li’s, and 1.99% in Wang’s. Notably, Li’s version, with the highest proportion, reflects a strong commitment to precision. Table 6 presents typical examples from the three versions.

Table 6.

Examples of Numerical Translation in Three Versions.

Source text	Ren	Li	Wang
二三尺	several feet	2 to 3 ch’ih	two or three chi
芽发三四个或六七个时	When three, four, or half a dozen or so Shoots have emerged	When 3 to 4 tillers, sometimes as many as 6 to 7,	When four or five buds appear on a piece of sugar cane
一日功劳轻者所获三分，重者倍之°。	……getting between 0.03 and 0.06 ounce [of silver] after a day spent at it,……	One day’s labor may produce 3/100 taels, as a mini mum, or double this, as a maximum.	……getting between 3 and 6 fen of silver after spending a day at it
缠丝分三停，隔七寸许则空一二分不缠,，……	To make a string, a silk thread is tightly wound around a core of more than twenty silk threads in three sections of more than seven ts’un each in length, leaving two gaps each about 0.1 to 0.2 ts’un long.	The winding area on the warp consists of three segments, of about nine ts’un each. Between each segment there is a space of one to two hundredths of a foot unwound, ……	To make a string, a silk thread is tightly wound around a core of more than twenty silk threads in three sections of more than 7 cun each in length, leaving two gaps each about 0.1 to 0.2 cun in length.

Based on the provided examples, the three translators employ distinct yet internally consistent strategies in rendering numerals and classifiers, reflecting varying degrees of cultural mediation and technical precision.

Ren’s translation adopts a domesticated approach that prioritizes cultural adaptation for the target reader. Numerals are frequently converted into Western units (“feet,”“ounce”) or rendered with English idiomatic approximations (“several feet,”“half a dozen or so”), which communicate complex data effectively. This strategy enhances accessibility for a Western audience but generalize values and replaces original units, thereby distancing the translation from the technical context of the original.

Li’s translation pursues documentary precision and fidelity to the source text. It meticulously preserves original numerical ranges through literal rendering (“2 to 3,”“3 to 4… 6 to 7”), and employs fractions like “3/100” where necessary to ensure exactness. Furthermore, it retains traditional Chinese measurement units through Wade–Giles transcription (“ch'ih,”“ts’un,”“taels”). This strategy maintains scholarly rigor and technical accuracy, making the translation especially suited for historical reference and reflecting a factual and historical authenticity. Such commitment to numerical precision not only preserves the textual integrity of the original but also facilitates the effective communication of technological details to the reader.

Wang’s translation maintains original numeral ranges (“two or three,”“3 and 6”), and renders Chinese units with modern Pinyin (“chi,”“cun,”“fen”). However, it occasionally introduces inaccuracies—for example, rendering the original “三四个或六七个” as “four or five.” The keeps the style of clarity and cultural retention of the technical information, but at the expanse of precision.

Academic words and complex words

The vocabulary used in a translation significantly influences its complexity. To assess lexical difficulty, this study employed AntWordProfiler 2.2.1, a tool designed to profile vocabulary levels. The reference word lists used in this analysis included the General Service List (West, 1953), the First and Second High-Frequency Word Lists, and the Academic Word List (Coxhead, 2000). Words not found in these lists are categorized as “off-list,” indicating that they constitute complex vocabulary requiring higher cognitive effort for comprehension.

In technical translation, precision and formality often take precedence, which encourages the use of less frequent but semantically unambiguous terms. This reflects the necessary balance between fidelity and accessibility in specialized texts. As shown in Figure 5, all three translations employ substantial technical terminology to maintain academic rigor—underscoring the role of specialized vocabulary in ensuring precision—yet their individual lexical choices differ markedly: Ren uses the most academic terms, Li the most complex vocabulary, and Wang the fewest of both. For instance, in rendering the original phrase “老足者，喉下两夹通明,” Ren translates “喉” as “throat,” Li as “pharynx,” and Wang as “cheek.” These divergent choices—from the general (“throat”) to the scientifically specific (“pharynx”) to the colloquial (“cheek”)—exemplify the fundamental tension in technical translation between terminological precision and interpretive flexibility, a balance each translator navigates differently based on their assessment of the text’s purpose and audience.

Figure 5.

The proportion of academic words and complex words.

To further assess lexical difficulty and translators’ word preferences, this study employed the Level Vocab Checker, developed by Japanese Professor Dale Brown in 2019, which utilizes the JACET 8000 Word List to grade vocabulary on a scale from 1 (easiest) to 20 (most difficult).

Vocabulary at levels 1 to 10 accounts a similar proportion across all three translations (Ren: 40.8%; Li: 40.4%; Wang: 40.6%), indicating little variation in the use of basic words. As shown in Figure 6, Words from levels 11 to 17 represent the largest share in Wang’s translation, whereas levels 18 to 20 constitute a significantly larger proportion in both Ren’s and Li’s versions. Overall, Ren and Li use more complex vocabulary, suggesting a translation aimed at a professional readership and presenting a greater cognitive challenge. In contrast, Wang’s version is lexically more accessible.

Figure 6.

The word difficulty levels in three translations.

Lexically, Ren’s version is notably complex, characterized by rich vocabulary and detailed descriptions. This approach align with a scholarly audience and is supported by the expertise of her spouse, a specialist in mining and metallurgy. The strategy manifests in her frequent use of adjectives and precise numerical descriptions, which articulate scientific concepts with accuracy—a tendency also reflected in her extensive use of annotations (not analyzed here). Li’s version exhibits comparable lexical complexity, marked by a high proportion of adjectives, exact numerical expressions, and academic terminology that clarifies scientific phenomena. His background as a chemist informs this preference for precision, ensuring conceptual clarity. In contrast, Wang’s translation employs more accessible vocabulary, consistent with its place in a state-sponsored publication series. This simplification functions as a deliberate instrument of cultural diplomacy to promote China’s heritage globally.

Syntactic Features

Sentence Length

Average sentence length, calculated as total words divided by total sentences often correlates with formality and structural complexity, reflecting proficiency in language use (Jiang, 2014). Following Butler’s (1985) classification—short (1–9 words), medium (10–25 words), and long sentences (over 25 words)—this study identified sentence boundaries using periods, exclamation points, and question marks, consistent with WordSmith 8.0 guidelines.

As shown in Table 7, all three translations contain more sentences than the source text, aligning with the tendency of translated texts to enhance clarity. Each version falls within the medium-length range (10–25 words). Notably, Ren’s translation has the fewest sentences but the longest average length, followed by Li’s version; Wang’s uses relatively shorter sentences. When including Li’s 1,538 traditional Chinese characters, his average sentence length increases further, underscoring his distinctive approach to textual expansion.

Table 7.

Sentence Statistics in Three Translations.

Sentence statistics	Ren	Li	Wang	Source text
Sentences count	3,180	3,930	3,547	2,675
Average sentence length	22.43	18.48	18.36	21.54

Sentence Structures

In addition to sentence length, the complexity of internal sentence structure is a key dimension for assessing a translator’s style. To further investigate the syntactic differences among the three translated versions—specifically, whether longer sentences correspond to more complex syntactic constructions—this study employed the L2 Syntactic Complexity Analyzer developed by Lu (2010) to extract 11 metrics across four dimensions, thereby systematically quantifying syntactic complexity. The results, presented in Table 8, reveal notable syntactic differences among the three translations, providing a quantitative basis for subsequent stylistic comparison.

Table 8.

Sentence Complexity Statistics in Three Translations.

Dimension	Indicator	Ren	Li	Wang	p Value
Dimension	Indicator	M (SD)	M (SD)	M (SD)	p Value
Overall sentence complexity	clauses per sentence (C/S)	1.98 (0.08)	1.54 (0.14)	1.42 (0.06)	.000***
Amount of subordination	clauses per T-unit (C/T)	1.52 (0.08)	1.37 (0.06)	1.40 (0.11)	.000***
	clauses per T-unit (CT/T)	0.39 (0.05)	0.30 (0.05)	0.35 (0.06)	.000***
	dependent clauses per clause (DC/C)	0.31 (0.03)	0.26 (0.04)	0.30 (0.04)	.001**
	dependent clauses per T-unit (DC/T)	0.47 (0.06)	0.35 (0.07)	0.42 (0.09)	.000***
Amount of coordination	coordinate phrases per clause (CP/C)	2.06 (0.11)	0.38 (0.07)	0.37 (0.06)	.703
	coordinate phrases per T-unit (CP/T)	0.58 (0.09)	0.52 (0.09)	0.51 (0.07)	.022
	T-units per sentence (T/S)	1.13 (0.09)	1.04 (0.05)	1.11 (0.06)	.011
Degree of phrasal sophistication	complex nominals per clause (CN/C)	1.36 (0.11)	1.29 (0.15)	1.19 (0.14)	.003**
	complex nominals per T-unit (CN/T)	2.06 (0.11)	1.76 (0.22)	1.66 (0.18)	.000***
	verb phrases per T-unit (VP/T)	1.98 (0.08)	1.89 (0.11)	1.83 (0.08)	.000***

Note. Asterisks indicate statistical significance levels as **p < 0.01 and ***p < 0.001.

A one-way analysis of variance (ANOVA) was conducted to identify significant differences in these syntactic complexity metrics. As shown in Table 8, significant differences were found in the dimension of “Overall Sentence Complexity,”“Amount of Subordination,” and “Degree of Phrasal Sophistication” (p < .01), confirming distinct syntactic profiles across the three versions. Specifically, mean values for “Overall Sentence Complexity” and “Degree of Phrasal Sophistication” decreased consistently across Ren’s, Li’s, and Wang’s translations. In Notably, “Amount of subordination,” Ren’s translation scored the highest, followed by Wang’s, while Li’s showed the lowest mean value. In contrast, differences in “Amount of Coordination” were not statistically significant, indicating similar use of coordinative structures across the versions.

These syntactic differences correlate with the perceived readability of the translations. Ortega (2003) observed that lower-level language learners tend to rely more on dependent clauses, whereas advanced learners employ more complex phrases. This suggests that texts rich in complex phrases generally pose greater reading challenges than those with frequent subordinate clauses. Accordingly, Li’s higher phrasal sophistication indicates a potentially higher reading difficulty compared to Wang’s version. It also illustrates that the proportion of subordinate clauses alone does not uniformly predict text difficulty. In the case of Tian Gong Kai Wu, syntactic complexity shows a positive relationship with average sentence length: complexity increases as sentences grow longer.

This correlation implies how translators’ syntactic choices shape stylistic effect and readability. Examining these features can therefore offer key insights into how translation strategies influence the effectiveness of technical communication and modulate readers’ cognitive load. The following excerpt illustrates how sentence structure is handled differently by each translator:

Source Text: 待夏秋之交，南风大起，则一宵结成，名曰颗盐，即古志所谓大盐也°。

Ren: (After a period the brine will turn red), until late summer or early autumn when the south wind blows hard across this area, so that the [brine evaporates entirely and] salt crystallizes overnight [Figure 5.5]. This salt is called “grain salt,” also termed “large salt” in ancient records, because while sea salt is finely granulated, this lake salt is formed in big crystals, hence the term “large.”

Li: At the turn of Summer into Fall, the south wind blows across the pond, the water evaporates, and salt crystallizes overnight. It is named “grain salt” (顆塩) and was called “Large Salt” (大塩), in the older literature. It derives its name, “Large Salt,” from the fact that its crystals are large, when compared with the fine crystals of salt obtained by evaporation of sea water.

Wang: Toward late summer or early autumn when the south wind blows hard across this area, the brine evaporates entirely and salt crystallizes overnight, which is called “grain salt,” also termed “large salt” in ancient books, because this lake salt is formed in big crystals, hence the term “large” while sea salt is finely granulated.

Ren’s translation reconstructs the source text into a single, densely layered sentence through nested clauses (“until… when… so that…”), which meticulously weave together temporal, conditional, and causal relationships. Characterized by intricate syntactic framework and a final explanatory “because” clause, Ren’s translation exhibits a high syntactic complexity, which effectively accommodates detailed academic information and explicit logic, making it tailored for readers seeking deeper contextual and technical understanding.

Li’s translation employs a measured use of coordination and a higher Degree of Phrasal Sophistication (CN/T). He breaks the whole sentence into short, declarative statements, adopting a paratactic flow rather than intricate subordination. The process is rendered as a clear sequence linked by “and.” Notably, he introduces a separate defining sentence (“It derives its name… from the fact that…”). This method avoids lengthy embedded clauses, conveying precise information through compact noun phrases and independent explanatory statements. Overall, it reflects a scientific orientation that prioritizes factual clarity, terminological precision, and documentary value.

Wang’s translation employs a balanced use of subordination to enhance accessibility. It follows a clear main-subordinate structure, opening with a “When” adverbial clause, extending through a “which” relative clause, and concluding with a “because” causal clause. This framework is more linear and predictable than Ren’s, with logical connectors used transparently to lower processing effort. Such syntactic streamlining with lower lexical and syntactic complexity facilitates comprehension for a broader readership, aligning with the translation’s aim of cultural dissemination and reader-friendly adaptation.

These findings reveal that Ren’s translation exhibits the highest syntactic complexity, a feature closely tied to her background as a Chinese-American scholar, which enables her to skillfully navigate cross-cultural nuances through sophisticated vocabulary and multi-layered sentences. In contrast, Li employs condensed phrasal structures to convey information, an approach rooted in his expertise in chemistry and prior experience translating The Chemical Arts of Old China, which afforded him a thorough understanding of Tian Gong Kai Wu (Y. M. Wang, 2022). His translation thus function as a key historical resource for studying China’s technological development. Wang’s version, produced under a state-sponsored project, prioritizes readability to facilitate broad international dissemination.

Textual Features

Referents For Textual Cohesion

English relies on pronouns to maintain referential clarity, which directly enhances text readability. In the translations, pronouns constitute 4.85% of Ren’s text, 5.32% of Li’s, and 5.20% of Wang’s. The higher proportions in Li’s and Wang’s versions suggest a greater focus on clarifying logical relations through pronominal reference.

As Halliday and Hasan (2014) note, third-person pronouns are particularly important for textual cohesion, serving anaphoric and cataphoric functions that aid continuity in technical discourse. As illustrated in Table 9, Ren’s frequent use of third-person pronouns reflects this cohesive emphasis, helping guide readers through complex technical descriptions. Furthermore, among personal pronouns, “it” occurs most frequently, representing 0.58% of Ren’s text, 0.63% of Li’s, and 0.81% of Wang’s. Generally, “it” signals a deliberate stylistic preference for objectivity (Biber, 1999). Thus, the high frequency of “it” particularly in Wang’s translation reflects a tone consistent with a more formal textual style. This pattern highlights how pronouns selection contributes to textual cohesion and clarity, and enhances the perceived credibility of the translator’s voice.

Table 9.

Statistics of Textual Features in Three Translations.

Features	Ren	Li	Wang
Personal pronouns
First-person	45	63	38
First-person	0.05%	0.08%	0.04%
Second-person	2	2	16
Second-person	<0.01%	<0.01%	0.02%
Third-person	2,788	1,381	1,209
Third-person	3.91%	1.90%	1.86%
Total	(<) 3.96%	(<) 1.98%	1.92%
Punctuation
Semicolon proportion	3.17%	1.03%	1.76%
Colon proportion	0.90%	0.68%	0.58%
Punctuation
Square brackets (pair)	1,226	0	3
Round brackets (pair)	235	1,135	214
Total	15.05%	9.84%	3.12%

The continuous evolution of technology reshapes the language and expressive modes of science. While conventional translation practice often emphasizes objectivity through third-person or impersonal structures, such an approach can reduce textual vividness and hinder the cross-cultural reception of classical Chinese works (M. W. Xu & Li, 2021). Conversely, a growing trend among English-speaking professionals favors the first-person perspective to enliven technical writing (Fan & Liu, 2025). In translating Tian Gong Kai Wu, however, the use of first-person pronouns is constrained by the source text, which contains only a limited number of the author’s subjective expressions—notably, the characters “吾” (wu) and “我” (wo) appear just ten times. Among three translators, Li employs first-person pronouns most frequently, suggesting a deliberate strategy to enhance reader engagement and textual immediacy. This illustrates that first-person pronouns can serve as vital communicative resources, helping to bridge cultural distance and connect with target-language readers—a stylistic choice that may also align with contemporary preferences in reading technical texts.

Punctuation For Textual Expression

Semicolons and Colons

Lyu and Zhu (2013) assert that “each punctuation mark has its unique function, and it is not an exaggeration to state that they represent another form of functional units.” The proper application of punctuation marks is a strategic translation method that has received little attention in previous studies.

The selection of punctuation in translation directly shapes sentence structure and flow. For example, a semicolon connects clauses or sentences more tightly than a period (Q. Wang, 2023), sustaining continuity across ideas, whereas a colon introduces elaborations, lists or elaborations that clarify what precedes. In technical contexts, where precision and logical progression are essential, the deliberate use of semicolons and colons ensures organize complex information, guiding readers clearly through the message. Thus, these punctuation choices not only strengthen textual cohesion but also aid comprehension in translated technical texts.

As shown in Table 9, all three translations use semicolons and colons more frequently than the source text. In Ren’s translation colons are consistently employed to introduce explanations, enumerations, or procedural steps—such as “Jade has only two colors: white and green” and “The method of planting sesame is as follows: make farm plots….” This strategy enhances textual clarity, aids in restructuring complex technical descriptions, and improves overall readability.

Semicolons are commonly used in English to coordinate ideas, indicate pauses, and distinguish layers of content (Kong, 2023). Notably, Ren’s version uses semicolons approximately two to three times more frequently than the other two versions, as illustrated in the following example:

Source Text: 珠玉、宝石受月华，不受土寸掩盖。宝石在井上透碧空，珠在重渊，玉在峻滩，但受空明、水色盖上°。

Ren: The gems in their pits exposed to the blue skies above; pearls, denizens of deep waters; and jade, existing beside precipitous streams; all need only the covering of atmosphere and water.

Li: Precious stones, in their hollows, reach out into the air to absorb the refined magnificence of celestial planets. Pearls, in deep seas, and jade, under lofty rapids, acquire their resplendence from the heavenly orb and the shimmering oceans.

Wang: Gem stones are produced in mines open to air, pearls in deep water, and jade in riversides along rapid rivers. All need only the covering of air and water.

In the source text, two separate sentences are linked by two periods, whereas Ren’s version connects them with two semicolons, creating a more fluid and cohesive relationship. For instance, one semicolon joins “pearls” (珠) and “jade” (玉), illustrating how this punctuation can link not only full clauses but also parallel phrases based on semantic ties—functioning similarly to a comma in such contexts. Such flexible use of punctuation enhances the translations’ fluency, coherence, and rhythm. By contrast, Li’s version closely follows the original punctuation structure, while Wang’s reorganized sentence segments according to meaning, relying primarily on commas and periods to reflect the style of the source.

Square Brackets and Round Brackets

As shown in Table 9, Ren’s translation makes extensive use of square brackets. Examples such as “The mast is the arc of the bow [for shooting arrows],”“When a mussel [or an oyster] conceives a pearl,”and “the bamboo sections [gunwale?] are the tiles [atop the wall?],” illustrate how brackets perform multiple functions: clarifying technical details, signaling cultural or referential ambiguities. With over a thousand instances, these brackets primarily supply explanatory notes on vague or culturally specific references, thereby reducing sentence complexity and minimizing ambiguity. Furthermore, this innovative and frequent use of punctuation exemplifies a “thick translation” approach (Appiah, 1993), in which the translator embeds the source text within a framework of cultural and technical commentary. As a historian addressing a Western academic readership, Ren employs brackets to mediate between the classical Chinese technical discourse and the expectations of scholarly readers seeking deeper understanding. Consequently, her translation has become an authoritative reference for researchers studying China’s history of science and technology (Ho, 1967).

Li’s extensive use of round brackets with traditional characters is a politically nuanced form of thick translation, reinforcing cultural identity through visual and lexical means for a sinological audience. For instance, “稻灾” is translated as “RICE CROP DAMAGE (稻災).” This distinctive approach visually preserves the source text’s lexicon, a unique style rarely seen in conventional translations, offering readers direct access to the original terminology within the technical description. Fundamentally, this distinctive stylistic choice was ideological rather than merely technical. It stemmed from Li Ch’iao-ping’s patriotic sentiment from the Second Sino-Japanese War, and his mission to advance Chinese scientific civilization for Western sinologists and learners through translation. His strategic inclusion of more than 1,500 traditional Chinese characters within round brackets represents a deliberate act of cultural preservation, a strategy surpasses conventional translation to become a pronounced statement of cultural identity. Endorsed by Taiwan’s official publishing house within its specific historical context, he imbues his work with an archaic quality evocative of classical Chinese texts.

Faithfully mirroring the source text’s 245 round brackets, Wang’s translation exhibits a clear preference for simplification through easier vocabulary, shorter sentences, and lighter punctuation, collectively making it the most readable version. This stylistic choice, facilitated by a team proficient in both languages and subject matter, functions as a form of cultural diplomacy. It is fundamentally shaped by the macro-agenda of nation branding, which frames the translator as a diplomatic envoy. The ultimate aim is to enhance the global accessibility of the ancient classic, thereby supporting the twenty-first-century strategic goal of promoting Chinese culture abroad.

Limitations and Future Work

This study proposes a framework for analyzing translator style in technical classics, yet limitations should be noted. First, its focus on a single classic and three translations limits the generalizability of the findings. Second, the stability of the selected features should be tested on larger and more diverse corpora. Third, both the quantitative features and qualitative interpretations require further validation—the former on independent datasets to confirm broader applicability, and the latter through methods such as coder triangulation to minimize interpretive bias. Future work should therefore expand to multi-genre corpora, validate features across independent datasets, and incorporate systematic checks to strengthen the reliability of qualitative stylistic analysis.

Conclusion

By employing a corpus-based, machine-learning approach, this study has elucidated how translators communicate the Chinese technical classic Tian Gong Kai Wu, successfully addressing the three research questions and thereby revealing the intricate nature of translator style.

First, through the application of the CVMI-RRMFT feature selection algorithm, 15 highly discriminative linguistic features were identified across the three English translations (Question 1). These features, distributed across lexical (e.g., adjectives, numbers), syntactic (e.g., mean sentence length), and textual levels (e.g., punctuation, pronouns), provide an objective and quantifiable basis for distinguishing the translators’ stylistic fingerprints.

Second, analysis based on these distinctive features reveals significant differences in the translators’ styles, manifesting both within the texts and reflecting external influences (Question 2). The styles are systematic manifestations of socio-cultural positioning. Ren’s translation is characterized by a scholarly “thick translation” approach, evident in its complex sentence structures, sophisticated vocabulary, and prolific use of square brackets for annotations, underscoring her role as a historian translating for an academic readership. Li’s precision and cultural coding reflect his dual role as chemist and cultural guardian. This version emphasizes technical precision and cultural identity, marked by a high frequency of numerical expressions, specialized terminology, and the unique incorporation of traditional Chinese characters within round brackets, reflecting his background as a chemist and his cultural motivations. In contrast, Wang’s translation style is fundamentally a diplomatic mode, shaped by state patronage for global nation branding. It prioritizes accessibility and readability, featuring simplified vocabulary, shorter sentences, and a streamlined use of punctuation, aligning with its goal of promoting Chinese culture to a global general audience in the 21st century.

Third, the causes are strategic. Style emerges from the interplay of habitus and socio-historical constraints (Question 3). Internal factors, such as the translators’ expertise, professional identities, and personal translation philosophies, directly shaped their linguistic choices. Simultaneously, external factors—including target readership expectations, publishing sponsorship, historical context, and broader cultural-political agendas—significantly influenced their strategic adaptations. Linguistic choices strategically embody their perceived role—scholarly mediator, cultural guardian, or diplomatic envoy—within specific networks, a conscious negotiation between individuality and context. Therefore, this interplay confirms that translator style is not merely an unconscious imprint but a strategic negotiation, a dialectic between the translator’s individuality and the contextual milieu.

This study establishes a robust framework that integrates computational and qualitative analysis to uncover subtle stylistic patterns in translation, patterns often missed by purely subjective interpretation. By quantifying stylistic features and contextualizing them within professional and historical milieus, it provides a nuanced understanding of how technical classics are mediated cross-culturally. Future research should incorporate paratexts—images, annotations, explanatory notes—to explore how multimodal factors shape translation. Consequently, the work not only deepens our insight into the transmission of a Chinese classic but also demonstrates the significant potential of data-driven methods in the humanities.

Footnotes

Appendix

Acknowledgements

The Key Program of the National Social Science Fund of China is gratefully acknowledged.

ORCID iDs

Wenjing Lv

Jinquan Wang

Ethical Considerations

This study, being a textual analysis of published translations, required no ethical approval as it involved no interaction with human or animal subjects. The research complies with all applicable institutional policies.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the Key Program of the National Social Science Fund of China (Grant No. 24AYY019).

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

Data will be made avaiable upon request from the corresponding author.

References

Appiah

K. A.

(1993). Thick translation. Caloo, 16(4), 801–819.

Baker

(1993). Corpus linguistics and translation studies: Implications and applications. In Baker

Francis

Tognini-Bonelli

(Eds.), Text and technology: In honour of John Sinclair (pp. 233–250). John Benjamins.

Baker

(2000). Towards a methodology for investigating the style of a literary translator. International Journal of Translation Studies, 12(2), 241–266.

Biber

Johansson

Leech

Conrad

(1999). Longman grammar of spoken and written English. Longman.

Bodde

(1967). T’ien-kung K’ai-wu: Chinese technology in the seventeenth century. Annals of the American Academy of Political and Social Science.

Bourdieu

(1986). The forms of capital. In Richardson

J. G.

(Ed.), Handbook of theory and research for the sociology of education (pp. 241–260). Greenwood Press.

Bourdieu

(1990). The logic of practice. Polity Press.

Butler

(1985). Studies in linguistics. Longman Group Limited.

Chan

(1968). T’ien-kung K’ai-wu: Chinese technology in the seventeenth century. Monumenta Serica.

10.

Coxhead

(2000). A new academic word list. TESOL Quarterly, 34(2), 213–238.

11.

Fan

W. Q.

Liu

Y. Z.

(2025). Research on scientific and technical translation in the information age: Challenges and strategies. Foreign Language Education, 46(02), 69–75.

12.

Halliday

M. A. K.

Hasan

(2014). Cohesion in English. Routledge.

13.

Hermans

(1996). The translator’s voice in translated narrative. International Journal of Translation Studies, 8(1), 23–48.

14.

P. Y.

(1967). T’ien-Kung K’ai-Wu: Chinese technology in the seventeenth century. Harvard Journal of Asiatic Studies.

15.

K. B.

Xie

L. X.

(2017). A corpus-based study of translator’s style: Connotation and methodology. Chinese Translators Journal, 38(2), 12–18, 128.

16.

Jiang

(2014). A quantitative linguistic comparison of human translations and automatic online translations in terms of stylistic features. Foreign Language Education, 35(5), 98–102.

17.

Jiang

Z. J.

(1994). Precision amid ambiguity: The ambiguity reflected by quantity words. Foreign Language Research (Journal of Heilongjiang University), 15(2), 46–49.

18.

Kong

D. L.

(2023). A machine-learning-based comparative study of translation styles in three English versions of Romance of the Three Kingdoms. Journal of Dalian University, 44(4), 38–47.

19.

Leech

Short

(1981). Style in fiction: A linguistic introduction to English fictional prose. Pearson.

20.

D. F.

W. Z

Hou

L. P.

(2018). A corpus-assisted study of Julia Lovell’s translation style. Foreign Language Education, 39(01): 70–76.

21.

S. F.

Zhang

Z. J.

(1994). On the translation of adjectives in the literary context of science and technology. Chinese Translators Journal, 15(1), 13–15+9.

22.

Lin

Z. H.

Wang

(2021). A study on English translation of T’ien-kung K’ai-wu: Progress and prospects. Journal of Nanjing Institute of Technology (Social Science Edition), 21(4), 1–7.

23.

Lin

Z. H.

Wang

(2022). A study of the translator’s behavior criticism on the “misinterpretation” in three English translations of Tian Gong Kai Wu. Shanghai Journal of Translators, 37(6), 73–79.

24.

Liu

Han

Z. M.

(2025). A study on translator fingerprinting in Hong Lou Meng based on K-means clustering. Journal of Xi’an International Studies University, 33(02): 94–101.

25.

X. F.

(2010). Automatic analysis of syntactic complexity in second language writing. International Journal of Corpus Linguistics, 15(4), 474–496.

26.

Luo

J. H.

(2011) The influence of mode of thinking on English-Chinese technical text translation from the perspective of functional translation theory. Foreign Languages and Literature, 27(S1): 54–57.

27.

Lynch

Vogel

(2018). The translator’s visibility: Detecting translatorial fingerprints in contemporaneous parallel translations. Computer Speech & Language, 52(6), 79–104.

28.

Lyu

S. X.

Zhu

D. X.

(2013). A talk on grammatical rhetoric. The Commercial Press.

29.

Malmkjær

(2003). What happened to God and the Angels:An exercise in translational stylistics. Target, 15(1), 37–58.

30.

Mastropierro

(2018). Corpus stylistics in heart of darkness and its Italian translations. Bloomsbury Publishing.

31.

Ortega

(2003). Syntactic complexity measures and their relationship to L2 proficiency: A research synthesis. Applied Linguistics, 24(4), 492–518.

32.

Saldanha

(2011). Translator style: Methodological considerations. The Translator, 17(1), 25–50.

33.

Simeoni

(1998). The pivotal status of the translator’s habitus. Target, 10(1): 1–39.

34.

Wang

(2023). The skillful transformation of punctuation in the English translation of information technology. Journal of Tonghua Normal University, 44(3), 86–91.

35.

Wang

Y. M.

M. W.

(2018). On translation strategies of “Tian” in Tian Gong Kai Wu in the Library of Chinese Classics. Journal of Xi’an International Studies University, 26(2), 94–98.

36.

Wang

Y. M.

(2022). Exploring the English translation methods of Ancient Chinese cultural scientific and technological terms based on “Tian Gong Kai Wu.” Chinese Translators Journal, 43(02), 156–163.

37.

West

(1953). A general service list of English words with semantic frequencies and a supplementary word-list for the writing of popular science and technology. Longmans, Green.

38.

Wijitsopon

(2022). Corpus stylistics and color symbolism in The Great Gatsby and its Thai translations. Language and Literature, 31(3), 267–295.

39.

H. F.

Zhang

Liu

Lyu

D. J.

(2021). Feature selection method based on coefficient of variation and maximum feature tree. Journal of Nanjing Normal University (Natural Science Edition), 44(1), 111–118.

40.

M. W.

(2021). The translation of Chinese classics of science and technology: Problems, methods, and techniques. Interdisciplinary Studies in Translation, 1(00), 193–209.

41.

Yang

W. Z.

(1991). On the original meaning of “Tian Gong Kai Wu” and its epistemological value. Journal of Sun Yat-sen University (Social Science Edition), 31(2), 47–53.

42.

W. G.

(2023). Meaning transmission with images and appropriate thick translation: Research on C-E translation strategies of sci-tech classic “Tian Gong Kai Wu.” Chinese Science & Technology Translators Journal, 36(2), 54–57.

43.

Zhan

J. H.

Jiang

(2016). The application of linguistic quantitative features in translator identification: A case study of two Chinese translations of pride and prejudice. Foreign Language Research, 37(03), 95–101.

44.

Zhao

Q. Z.

(1997). Li Ch’iao-ping’s translation process of “Tian Gong Kai Wu.” China Historical Materials of Science and Technology, 18(3), 90–94.

45.

Zhu

C. W.

R. F.

(2023). A corpus-based analysis of Ezra Pound’s translator style in English translations of Chinese classics. Foreign Language Education, 44(04), 75–82.