Sage Journals: Discover world-class research

Abstract

Despite the immense influence of machine translation (MT) on cross-cultural communication worldwide, little is known about end users’ predispositions toward MT. Our online experiment (N = 284) compares people’s perceptions of MT and human translation in an ethically charged situation, in which the translation serves an immigrant worker in an interaction defined by power imbalance. Using hierarchical linear regression, we found that an otherwise identical translation was evaluated differently when it was attributed to MT or human translation. Results reveal that translators and non-translators alike exhibit a negative bias toward the MT product when asked to assess its accuracy and reliability, its ability to convey cultural and emotional otherness, and its potential effectiveness in helping the disadvantaged immigrant in need of the translation. We also demonstrate how lower evaluations of the MT product lead to a stronger wish to intervene in the translation by introducing changes to the original message. Our results suggest that predispositions toward MT must be taken into account in any consideration of MT-mediated communication, as these predispositions may shape the communicative act itself.

Keywords

Human perceptions of MT lay use of MT machine translation (MT)MT and cross-cultural communication MT in ethically charged communication social implications of MT

Introduction

Boosted by globalization and the fast-growing developments of the digital era, communication across languages and cultures via translation has proliferated. The ubiquity of translation is not limited to a particular domain of social life: there has been a dramatic rise in commercial translation and localization as well as multilingual communication in individual, professional, and institutional settings (Cronin, 2012). While the increased volume and accessibility of translation has stemmed from various technological advances, perhaps no development has been as crucial as machine translation (MT); in particular, freely available online systems, such as Google Translate (Google, 2016). The technology’s high availability and synchronicity has extended the use and influence of MT to everyday situations typical to multilingual societies, which carry important social and ethical implications (Vieira et al., 2020). However, the common attitudes toward MT and the heuristics involved in evaluating its product remain largely unknown.

MT refers to fully automated software that can translate a source text in almost any language, into almost any target language of the user’s choice. The last few years have seen MT quality increase to a level that some scholars claim is on a par with human translation, in certain texts and between certain languages (Hassan et al., 2018). While there is no consensus about the validity of this claim, or more broadly about the criteria used to measure MT quality (Pym, 2020a; Toral et al., 2019), few doubt that MT technology has greatly improved in recent years. This improvement has made MT one of the more pervasive cases of Artificial Intelligence–Mediated Communication (AI-MC), conceptualized as “interpersonal communication that is not simply transmitted by technology, but modified, augmented, or even generated by a computational agent to achieve communication goals” (Hancock et al., 2020, 90). MT replaces in full a source text of whatever length with an equivalent text in the target language, and does so autonomously with respect to the myriad semantic, syntactic and morphological choices involved in the production of the translated text. While today’s data-driven algorithms perform a statistical analysis of large corpora of existing human translations, how MT actually arrives at its translation decisions remains largely uninterpretable even to its programmers, on account of its deep learning approach (Melby, 2020: 430–433). MT thus joins the automation of other aspects of communication which were deemed until not too long ago quintessentially and exclusively human activities, as generally opaque AI-MC algorithms are now able to communicate emotions and perform acts that have moral implications (Hohenstein and Jung, 2020; Yao et al., 2017).

Despite the immense influence of MT on communication across languages around the world, and the burgeoning research on automation, scholars have not yet examined how end users preconceive and evaluate MT, in comparison with human translation, through a sociological and psychological rather than technical-linguistic lens. While computer scientists have concentrated on evaluating the quality of the MT product in terms of its semantic and syntactic properties (Toral et al., 2019), sociological approaches tend to focus on translators’ attitudes to MT much more than on the attitudes of (non-translator) end users of MT. However, studies have shown that end users’ perspectives and practice are highly conditioned by the digital media of communication, including by AI-MC technologies and algorithms (Sundar et al., 2015). People’s preconceptions and trust of algorithms may influence, for better or worse, their evaluations of an algorithm’s effectiveness, authenticity, fairness, trustworthiness, and aesthetic value (Jago, 2019; Lee, 2018). Indeed, with respect to the use of algorithms, human perceptions may play just as an important role as the actual quality of the AI product (Dietvorst et al., 2015).

This article draws on the insights and methodology of recent automation studies to examine people’s preconceptions and evaluations of MT in comparison with human translation, in an ethically charged situation. In the last decades, translation scholars have shown how translation, as an act of communication across nations and cultures, is often governed by power relations and may have important ethical implications (Tymoczko and Gentzler, 2002). No longer understood merely in terms of linguistic equivalence, translation is now seen through an ethical lens as scholars emphasize how translation may reflect and influence individuals’ and groups’ social position and accessibility, cultural representation, cooperation, and trust (Pym, 2012). Translation in migration and immigration contexts is a particular case in point (Inghilleri, 2017). The rise in the quality of MT and the ubiquity of its use through freely available systems online have made ethical dimensions relevant to MT use as well. As with other instances of AI-MC, MT is already used in situations of power imbalance between communities and languages, and the use of MT in such situations is only likely to increase—making it all the more important to understand people’s preconceptions of MT, as these may influence how it is actually used and perceived, and what are its social implications.

This article aims to contribute to a better understanding of AI-mediated acts of communication, by asking how people perceive and evaluate a software-produced translation. We show that professional translators and non-translator users alike evaluate differently an otherwise identical translation when it is attributed to machine or to human translation. In particular, we wished to know how people evaluate the features of an MT product in an ethically charged situation that involves a disadvantaged immigrant worker in need of the translation. To this end, we presented participants with a scenario where an immigrant worker was wronged and needed to have his official complaint translated, and examined if people evaluated differently, in each condition (human translator/MT), the translation’s accuracy and reliability, its ability to convey cultural and emotional otherness, and its potential functional effectiveness in helping the plight of the immigrant in need of the translation. Our results reveal that people exhibit a negative bias to the MT product and assign it with lower evaluations when asked to assess these features of the translation. We also show the role played by people’s expectations of the quality of the translation they were to receive: how the degree of positive surprise from the MT product, resulting from prior expectations for lower quality, influences the eventual outcome of people’s evaluations. Finally, we demonstrate how the lower evaluations for the MT product lead to a stronger wish to introduce changes to the translated text, at the expense of faithfulness to the original message, to help the disadvantaged person in need of the translation. Our results suggest that the existing predispositions toward MT must be taken into account in any consideration of MT-mediated communication that has ethical and social implications, as these predispositions may shape the communicative act itself.

Literature Review

Machine Translation

MT exists since the 1950s, but it was the paradigmatic shift from rule-based MT to statistical MT in the last 20 years, and particularly the recent advent of deep learning, neural machine translation (NMT), that raised its quality level immensely (Melby, 2020). Unlike other developments, such as computer-assisted translation (CAT) technologies, NMT is a proverbial black box that lacks algorithmic transparency. It draws on very large parallel corpora of existing human translations paired with their source text segments (usually at the sentence level), using a deep learning approach to determine probabilities of translation outputs by means of complex recursive neural networks (Forcada, 2017). For MT’s end users, however, the inner workings of the algorithm are not only impenetrable, but also largely irrelevant. MT receives as an input a source text and provides as an output its translation in another language without any human decision-making in any part of the computational process. As such, MT does not specifically serve professional translators. Freely available MT systems are accessible online for all in developed and developing countries alike. With close to 150 billion words being processed by Google Translate every day (Davenport, 2018), it is clear that MT’s social implications are not limited to the translation community, but have an increasing effect on multilingual communication in a globalized world.

Studies on the perceptions and social implications of MT have thus far concentrated on professional translators’ perspectives on MT as a threat posed to their industry (Vieira, 2020a); the conditions under which translators tend to adopt or reject MT for their work (Cadwell et al., 2018); the transformation in translators’ practices, self-perception and professional status, wrought by the advances of MT (Läubli and Orrego-Carmona, 2017; Sakamoto, 2019); among others. Past research largely attests to translators’ apprehension and lower appreciation for the MT product in terms of quality and usefulness, though some of the resistance decreases when translator agency is retained at the stage of software development and implementation (Rossi and Chevrot, 2019). As MT becomes widely used by non-translators in various social contexts of communication, the strong emphasis on professional translators’ attitudes to MT limits our understanding of broader social effects of this technology. It therefore seems crucial to take both translator and non-translator populations into account when exploring people’s perceptions of MT—particularly in circumstances in which ethical aspects of MT use are salient.

The last decades have seen much scholarly focus on translation in situations of power imbalance, shifting the spotlight from translation’s linguistic and stylistic features to the ideological factors that determine its function and politics (Hermans, 2009). More recently, a growing number of studies have discussed the changing reality of translation in a new globalized, digital era (O’Hagan, 2020). Most relevant to this article, researchers have begun to explore the practical and moral implications of using MT in situations in which it had not been employed before. Studies have described the use, or problematics of use, of MT in close-captioning local television channels for immigrant communities (Hutchins, 2009: 18–19), in increasing accessibility of underserved groups in contexts of civic participation and public health (Bouillon et al., 2017; Nurminen and Koponen, 2020), in making institutions, such as the public library, accessible for immigrant newcomers (Bowker and Buitrago Ciro, 2015), or in pressing medical and legal circumstances (Vieira et al., 2020). As implied in these studies, using MT to address the communicative needs of immigrants and members of disadvantaged communities may have important social and moral implications.

However, as shown in recent automation studies, the actual effects of artificial intelligence on its human users are not an inevitable given, particularly in cases of social interaction. These effects do not depend only on an algorithm’s usefulness or ease of use, but are rather contingent upon human perceptions and attitudes toward algorithms, such as expectations and trust (Glikson and Woolley, 2020). The fact that MT is a major channel of communication between users who occupy a weaker position in society and are hampered by linguistic inaccessibility, on one hand, and stronger, institutionalized elements of society, on the other, provides particularly strong motivation for understanding preconceptions of MT. Attitudes toward MT technology may influence not only the basic features of the communicative act, which is already defined by unequal power relations, but also how the translated text itself—its informational, cultural, and emotional content, and its functional effectiveness—is being perceived. This could also result in people taking further steps to actively adjust or post-edit the algorithmic output according to their own beliefs. Along these lines, negative bias against MT technology might diminish the extent to which stronger elements of society believe they are having an effective, meaningful interaction with MT’s disadvantaged end users—users who may strongly rely on this technology for communication. Our experiment wishes to examine whether people’s different preconceptions of human translation and MT indeed lead them to evaluate differently the features of an otherwise identical translation when it is attributed to a human or a machine, in a situation of power imbalance.

Evaluating translation

Ever since the 1970s, translation studies scholars have debated, directly or by implication, the criteria by which a translation should be described and evaluated. The passing years saw no consensus emerge over the subject. Competing criteria for evaluating translation reflected, above all else, the questions researchers found to be most important (Pym, 2014). Moreover, as Andrew Chesterman notes, the “quality” of a translation is never an absolute value; it reflects subjective judgments, means different things to different actors and presupposes different perspectives, and approaches to its evaluation may take different courses (in: Pym, 2020a: 439–440). However, when considering people’s perceptions of the translation product, the diverse ways for evaluating a translation—whether the product of a human or a machine—can rather usefully complement one another, as they reflect the various dimensions that are important to the actors involved in the translation process.

One dimension of translation that has long interested translation scholars is the perceived equivalence between the translation and the source text, with reference to meaning and stylistic features (Nida, 2012). Rooted in linguistics, computer science, or literary criticism, researchers with this perspective often imply that a text has an intrinsic “meaning” which is, to a large degree, autonomous and independent of the cultural context in which it is situated (Pym, 2014: 6–23). Taking the equivalence of meaning and form between the source text and the translation as the key concept for evaluation, their studies stress the linguistic, semantic, and stylistic attributes of the two texts, and often make claims with regard to the accuracy and reliability of a translation by means of formal comparison with its source.

Born of translation studies “cultural turn” in the 1990s, other trends take as a point of departure the discursive and performative differences between cultures, and the inherently subjective, normative status of the abovementioned concept of “equivalence.” They stress the cultural adjustments that are seen as needed, or are actually involved, in the translation of a source text for the receiving culture’s audience. These research perspectives tend to characterize translation primarily by how its various representations are appropriated to the receiving culture’s ideologies and expectations (Lefevere, 1992). Choices in the translation may also pertain to the transfer of emotional content that is encoded in the source text, which is expressed in line with culture-specific conventions of discourse (Tabakowska, 2018; Walinski, 2018). The ability to convey emotional and cultural otherness may be particularly relevant in contexts where translation is meant to serve as a bridge between members of different cultures in an emotionally laden situation. In these contexts, the conveying of cultural otherness is seen by some scholars to embody the very moral essence of the communicative act (Venuti, 1998).

Another influential theoretical approach in translation studies implies as a leading criterion for evaluating translation its functional effectiveness in serving its assigned purpose, or skopos (Pym, 2014: 43–61). Here, the value or “quality” of the translation is contingent on its achieving a given function in particular settings, and translations are primarily defined and differentiated by their goals (Reiss and Vermeer, 2014). If the setting involves power relations and ethical considerations, the translation’s perceived effectiveness needs to be assessed with reference to these particular features.

The abovementioned approaches to describing and evaluating a translation are far from exhaustive. Moreover, while they are presented here separately, they should not be seen as truly exclusive of one another. When it comes to people’s subjective intuitions, the level of accuracy of textual equivalence, the ability to convey cultural and emotional content, and the functional effectiveness of achieving a given purpose, represent three related but different emphases. By incorporating corresponding ways of evaluation in our experiment, through particular questions that aim to address each of their basic features, we may therefore gain complementary perspectives and useful insight on how people perceive MT.

When a translation is attributed to an algorithm, measuring people’s evaluations of the translation’s features involves yet another important dimension. Scholars have demonstrated that the positive or negative disconfirmation of previous expectations of technology—that is, the extent to which the performance of that technology was seen to exceed or fall below expectations—may influence people’s evaluations of that technology (Lankton et al., 2014). As automation capabilities constantly, though not linearly, change and improve, it is therefore necessary to take into account users’ expectations of algorithmic performance, whenever perceptions and evaluations of the performance of that algorithm are studied. Focusing on perceptions of MT, then, we needed to know the extent to which people’s expectations of the quality of the MT product were positively or negatively disconfirmed, and to examine whether this indeed played a role in determining people’s evaluations of the translation.

Intervening in the translation

The issue of the translator’s “faithfulness” to the source text has long been a source of polemic in public and academic discourse on translation. Various scholars and thinkers have debated the extent to which the translator may depart from the linguistic, cultural or ideological traits of the source text, and reshape the original message as a means to different goals: achieving greater elegance or comprehensibility (Nida, 2012), adapting to target audience conventions for commercial reasons (Pym, 2014: 117–137), or undermining social, gender, and racial hierarchies ingrained in the source text (Venuti, 2008; Von Flotow, 1991), among others. In the field of community interpreting, which deals more directly with unequal power relations on the ground such as those examined in our experiment, practitioners and scholars have historically endorsed the notion of “impartiality,” demanding that no changes be introduced to the content of the original message under any circumstances (Hale, 2007; Kotzé, 2014: 127). Their approach has largely corresponded to the prevalent norms and accepted practices in textual translation. Some recent discourse, however, has aimed to redefine the role of translators and interpreters by calling into question the possibility, or necessity, of “impartiality” (Wolf, 2012). These voices set aside the commitment to the original text, and suggest that the interpreter’s first ethical priority, particularly in asymmetrical settings of legal, community, or medical interpreting, should be the values of the community and perceived needs of the interpreted person (Ciordia, 2017: 275–281). Such a view echoes the work of translation ethicist Anthony Pym, who sees ethics as “concerned primarily with what particular individuals do in the immediacy of concrete situations,” and proposes that “since translation is a cross-cultural transaction, the translator’s task is one of fostering cooperation between all concerned, with the aim of achieving mutual benefit and trust” (Hermans, 2009: 96).

Looked at from an ethics perspective, then, the impact of interventionist approaches on the translation differs across cases, depending on the motives for intervention and how the involved actors choose to reframe the original text. A plethora of studies have discussed the various implications of nationalistic, conservative interventions; progressive, anti-chauvinistic interventions; or more intricate, ideologically complex interventions in translation. Whatever its ideological orientation, it is clear that the possibility of intervention in the source text should not be ignored in any discussion of translation.

Nonetheless, while interventions in translation, and the issue of the translator’s own agency in ethically charged situations, have drawn increasing attention in the research, they have not yet been examined with regard to translations produced by a machine. Our experiment aimed to examine whether people felt the need to reshape the original message and introduce significant changes to an MT product to help a disadvantaged MT user; and, if they did indeed feel this need, to try to understand which factors led to these feelings, and whether the wish to intervene in the translation was different when the translation was attributed to a human or a machine.

Research goals

Our main research goal was to understand how people’s preconceptions of MT may influence how they evaluate the MT product, and how they may then suggest to act with regard to the translation, in an ethically charged situation. To that end, we compared how people assessed an otherwise identical translation when it was attributed to a human translator or MT, in a scenario where a disadvantaged immigrant worker was wronged and needed to have his official complaint translated. We wished to examine whether people evaluated differently, in each condition (human translator/MT), the translation’s accuracy and reliability, its ability to convey cultural and emotional otherness, and its potential functional effectiveness in helping the plight of the immigrant in need of the translation. We wanted to assess these evaluations of the translation while taking into consideration people’s disconfirmation of prior expectations of the quality of the translation, and their having (or lacking) working experience as a professional translator. Finally, we wished to examine, in both conditions, people’s belief in the need for an interventionist approach to the translation in question.

At the same time, we wanted to see whether differences between people’s evaluations of the translation were influenced by variables other than their predispositions toward algorithms, that is, by demographic or psychological factors that could have an impact on how one perceived the scenario and subsequently the translation itself. These control variables include gender, age, previous experience with MT, level of education, fluency in English (the language of the source text), and academic/professional background (humanities, exact sciences, social sciences). They also include subjective factors specific to the scenario in question, such as the moral–political orientation to immigrant workers and their treatment by the system, represented by participants’ perspectives on the level of unjustness suffered by the immigrant in the situation; judgment of the effectiveness of the original complaint written by the immigrant; and one’s general conceptions about the legitimacy of an interventionist ideology to translation in ethically charged situations.

Method

An online experiment exposed participants to a scenario in which a translated text was attributed either to MT or to a human translator. The survey was conducted from 7–30 July 2020, and addressed a Hebrew-speaking population. All analyses were done using IBM SPSS 26 and JMP Pro 15.

Sample

Two hundred ninety-one participants took part in an online survey distributed among students and staff from different faculties in a large state university, and among people with professional experience in translation. Participants were recruited from across the humanities, exact sciences (including engineering), and social sciences; and from the Israel Translator Association, as well as alumni and teachers from programs of translation training.

Participants’ age ranged from 19 to 81 (average = 41.21, SD = 16.81); 69% of them were female. Most of the participants were highly educated: 61% reported to hold or study toward a graduate degree, 37% were in the exact sciences, 37% in humanities and arts, and the rest in the social sciences. All the participants were fluent in the local language (Hebrew), while 79% reported to also be fluent in English. Of the participants (88), 30% had some professional experience as translators. Due to wrongly entered data in seven of the completed forms, seven participants had to be excluded from the final sample, which has 284 participants.

Procedure and stimuli

An online questionnaire with two conditions (human translator/MT) was designed using Qualtrics software. Participation was voluntary. The questionnaire started by presenting participants with a scenario where L., a Philippine immigrant worker, was assaulted by security guards due to misidentification, and needed to submit an official complaint to request compensation for the damage he had suffered. Lacking proficiency in the local language (Hebrew), L. wrote his complaint in basic English, which then needed to be translated into Hebrew for the village officials (see Appendix 1 for the wording of the scenario and Appendix 2 for that of the complaint).

Participants were asked to report on their perspectives on the level of unjustness suffered by L. in the described situation; their perspectives on the potential effectiveness of the original complaint in securing compensation for L.; and their general perspectives on the legitimacy of an interventionist ideology on the part of the translator in such a situation.

Then, participants were randomly assigned to one of the two conditions (human translator or MT). Participants assigned to the human translator condition learned that the immigrant worker’s text was indeed translated by a professional translator approached by the village officials, who had previous experience with immigrant workers and was a member of an immigrant rights organization. Participants assigned to the MT condition learned that the human translator was unavailable, so the text was translated with “the most advanced MT software available in the market today.” Participants in both conditions were then presented with the immigrant worker’s original message alongside its Hebrew translation and were asked to evaluate the translation. Their evaluations involved three main features of the translation: its accuracy and reliability, its ability to convey cultural and emotional otherness, and its assumed functional effectiveness in helping the plight of the immigrant in need of the translation. Participants also reported on their beliefs in the need for an interventionist approach to the translation itself, and evaluated the level of positive disconfirmation of their prior expectations of the translation in light of the quality of the translation actually presented to them. We also collected data on demographics, including gender, age, level of education, previous experience with MT, and professional experience in translation.

The immigrant’s official complaint (Appendix 2) was written following consultation with a cultural mediator who works with immigrant workers in Israel, and revised according to her comments. It was also formulated to be in line with basic features of communication in Filipino culture (Scroope, 2017). The translation of the complaint provided in both conditions was identical, and done by Google Translate. For the translation to plausibly pass as a human translation, the original English message had to be pre-edited, prioritizing short sentences, uncomplicated syntax, and common word choices, so as to avoid grammatical errors or lexical peculiarities in the translation. These linguistic features for the official complaint made sense in our scenario, as they corresponded to the level of English proficiency of most Philippine immigrant workers in Israel. The source text was further fine-tuned according to the MT output until the translation achieved a level of language idiomaticity that could pass as a human translation (confirmed by pretests among colleagues). This pre-editing was crucial as we needed the same exact translation to be plausibly attributed to a human and a machine. It was also important to include the original English message in the survey, as this transparency enabled participants, most of whom were highly fluent in both languages, to base their evaluation of the translation on comparing the two texts (rather than just appraising the translated text itself), thus enabling us to assess their subjective predispositions toward MT while controlling the effect of people’s proven distrust of non-transparent algorithms (Glikson and Woolley, 2020).

Measures

Dependent variables

All variables were measured by asking participants to rate the extent to which they agree with given statements regarding the described scenario and the translation, using a 7-point Likert-type scale (1—strongly disagree; 4—neither agree nor disagree; and 7—strongly agree). To explore the structure of our data, we conducted an exploratory factor analysis (EFA), using an eigenvalue equal to or greater than 1.00 to decide the number of extracted factors (Costello and Osborne, 2005). EFA revealed a seven factor model for seven (four dependent and three control) variables used in this study (λ₁ = 7.95, λ₂ = 2.53, λ₃ = 2.31, λ₄ = 2.09, λ₅ = 1.55, λ₆ = 1.31, and λ₇ = 1.00) accounting for 72% of explained variance (see Appendix 3 for a list of the items used and their factor loadings). The internal reliability of the variables was assessed using Cronbach’s alpha. All items of the dependent and control variables, summarized briefly below, are reported in Appendix 3 in detail.

Perspectives on the translation’s accuracy and reliability (M = 5.35, SD = 1.52) were measured using two items—one for each concept (Cronbach’s α = .89).

Perspectives on the translation’s ability to convey cultural and emotional otherness (M = 4.87, SD = 1.31) were measured using six items: four questions regarding the translation’s success in conveying the immigrant worker’s feelings and the gist and emotional content of the complaint, and two questions regarding the translation’s success in conveying the immigrant worker’s different cultural background, and in bridging the gap between the local culture and the immigrant’s (Cronbach’s α = .87).

Perspectives on the translation’s functional effectiveness (M = 4.59, SD = 1.34) in the situation at hand were measured using four items: three questions regarding the extent to which the translation is expected to help the plight of the immigrant by enabling him to receive better treatment from the village officials, and secure the compensation he requests; and one question regarding the extent to which the translation exhibits a moral commitment to the immigrant (Cronbach’s α = .81).

The wish to apply an interventionist approach (M = 3.31, SD = 1.51) to the translation in question was assessed by the extent to which participants felt that omissions from and additions to the original message should have been carried out in the translation to help L. achieve his goals. Five items were used (Cronbach’s α = .86).

Control variables

The disconfirmation of prior expectations of the translation (M = 4.58, SD = 1.96) was measured by asking participants to rate the extent to which they were positively surprised by the quality of the translation presented to them. It was crucial not to prime the participants, particularly in the MT condition, with a reference to expectations of quality in the beginning of the experiment, as this could have affected their subsequent grading of the translation’s features. Therefore, participants were asked to report on the disconfirmation (if any) of prior expectations of the quality of the translation only after they completed all of their evaluations.

Perspectives on the unjustness of the situation (M = 4.75, SD = 1.71) were assessed by participants rating the extent of unfairness and unjustness of the situation, and the degree to which they felt the responsibility for the incident was faulty behavior on the part of the village’s security guards. Three items were used (Cronbach’s α = .85).

Beliefs in an interventionist ideology (M = 2.64, SD = 1.44) regarding translation were assessed by participants rating whether the translator’s loyalty should, in principle, lie solely with the original message or if the translator should intervene in the text to assist the immigrant worker. We used two items that measured the belief that, even if the translator wishes to help, they must refrain from introducing any changes to the source text. The items were reversed (Cronbach’s α = .75).

Perspectives on the effectiveness of the original complaint (M = 5.07, SD = 1.28) written by the immigrant worker were assessed using three items. The items asked to what extent the complaint was effectively written, and to what extent did L. have a good chance to receive compensation for the medical treatment he required (Cronbach’s α = .79).

Results

Descriptive statistics and correlations between variables can be found in Table 1.

Table 1.

Descriptive statistics and correlations between experiment variables.

	M	SD	1	2	3	4	5	6	7	8	9	10
1. Translator identity^a	0.51	0.50
Dependent variables
2. Translation’s accuracy and reliability	5.35	1.52	–.18**
3. Translation’s ability to convey cultural and emotional otherness	4.87	1.31	–0.02	.53**
4. Translation’s functional effectiveness	4.59	1.34	–0.02	.25**	.61**
5. Interventionist approach to the translation	3.31	1.51	0.03	–.23**	–.52**	–.43**
Control variables
6. Positive disconfirmation	4.58	1.96	.46**	.32**	.44**	.32**	–.27**
7. Unjustness of the situation	4.75	1.71	0.03	–0.1	–0.07	–0.09	.14*	–0.01
8. Interventionist ideology	2.64	1.44	–0.08	–0.04	–0.09	–.14*	.15*	–0.05	0.07
9. Effectiveness of original complaint	5.07	1.28	0.05	.14*	.39**	.49**	–.30**	.27**	–0.11	–.14*
10.Age	40.90	16.70	0.07	–0.01	.14*	–0.01	–.15*	0.10	–0.04	0.10	–0.05
11.Professional experience in translation^b	0.30	0.46	0.03	–0.04	0.02	–0.06	–0.06	0.03	.14*	.13*	–0.04	.27**

N = 284.

Translation identity (MT = 1; human translator = 0).

Professional experience in translation (1 = had experience; 0 = had no experience).

p < .05; **p < .01.

To examine how assigning the translated text to an algorithm influences participants’ perspectives of the translation, we conducted hierarchical regression analysis (see Table 2). Our first step was to assess how preconceptions of MT influenced evaluations of the different features of the translated text, and the tendency for an interventionist approach to the translation, without taking any of the control variables into account. Our results demonstrated that predispositions toward MT had a significant effect on people’s evaluations of the translation’s accuracy and reliability: the translation attributed to a machine was perceived as less accurate and less reliable (M = 5.08, SD = 1.60) than that attributed to a human translator (M = 5.62, SD = 1.40; F(1, 282) = 9.09, p < .005). However, there was no significant effect on participants’ evaluations of other features of the translation.

Table 2.

Hierarchical linear modeling analysis of the effect of translator identity (human/machine) on evaluations of the translation and on an interventionist approach to it.

	Accuracy and reliability		Ability to convey cultural and emotional otherness		Functional effectiveness		Interventionist approach to the translation
	Beta	t	Beta	t	Beta	t	Beta	t
Step 1
Translator identity^a	–.18**	–3.02	–.02	–.39	–.02	–.25	.03	.51
Step 2
Translator identity^a	–.41**	–6.91	–.29**	–4.92	–.21**	–3.28	.19**	3.03
Positive disconfirmation	.50**	8.53	.57**	9.88	.42**	6.66	–.36**	–5.58
Step 3
Translator identity^a	–.40**	–6.78	–.28**	–4.88	–.17**	–3.07	.19**	3.04
Positive disconfirmation	.50**	8.05	.51**	8.59	.29**	4.90	–.27**	–4.27
Unjustness	–.07	–1.38	–.03	–.56	–.03	–.53	.10	1.79
Interventionist ideology	–.03	–.62	–.07	–1.38	–0.08	–1.56	.14*	2.50
Effectiveness of original complaint	.01	.23	.17**	3.22	.41**	7.66	–.22**	–3.79
Age	–.02	–.32	.13*	2.39	.01	.15	–.14*	–2.52
Professional experience in translation^b	–.02	–.30	–.01	–.09	–.04	–.69	–.06	–.98
R²: Step 1	.03*		.01		.01		.01
ΔR²: Steps 1–2	.20*		.25*		.14*		.10*
R²: Step 2	.23*		.25*		.14*		.10*
ΔR²: Steps 2–3	.01		.04*		.17*		.10*
R²: Full model	.24*		.29*		.31*		.20*

N = 284. Beta stands for standardized coefficient;

Translation identity (MT = 1; human translator = 0).

Professional experience in translation (1 = had experience; 0 = had no experience).

p < .05; ^**p < .01.

Our second step was to add to our model the effect of positive disconfirmation, and take it into account as a control variable. It is worth stressing that participants who were more positively surprised by the translation quality in both conditions assigned higher grades to the features of the translation they were asked to evaluate. However, as participants in the MT condition tended to have lower expectations of the translation quality than participants in the human translator condition, the former demonstrated more positive disconfirmation (disconfirmation in MT condition: M = 5.46; SD = 1.48; in human translator condition: M = 3.67; SD = 1.97), which in turn mitigated the negative effect that attributing the translation to a machine had on their evaluations. Therefore, when the variance in positive disconfirmation was taken into account using disconfirmation as a control variable, the implications of attributing the translated text to a machine became clearer. The effect of the translator’s identity was now evident in all evaluations of the translation, which were significantly lower in the MT condition than in the human translator condition: accuracy and reliability (for MT: M = 4.32, SE = 0.11; for human translator: M = 4.86, SE = 0.11; F(2, 281) = 42.08, p < .001); ability to convey cultural and emotional otherness (for MT: M = 4.50, SE = 0.10; for human translator: M = 5.25, SE = 0.10; F(2, 281) = 48.88, p < .001); and functional effectiveness (for MT: M = 4.32, SE = 0.11; for human translator: M = 4.86, SE = 0.11; F(2, 281) = 22.19, p < .001). In contrast, the wish to intervene in the translation was significantly higher in the MT condition (for MT: M = 3.60, SE = 0.13; for human translator: M = 3.13, SE = 0.13; F(2, 281) = 15.69, p < .001). For standardized beta coefficients, see Table 2.

Our third step was to take all control variables into account, including the participants’ age and professional experience in translation, as well as their perspectives on the level of unjustness of the situation and on the effectiveness of the immigrant worker’s original complaint. As evident from Table 2, the effect of the translator’s identity on participants’ evaluations of the translation was significant above and beyond the control variables. The level of positive disconfirmation, too, still significantly pertained to people’s evaluations of the translation. Most factors, such as gender, level of education, fluency in English, professional background, and, interestingly, previous experience with MT, and the perceived unjustness of the situation, did not have any significant effect on the evaluations of the translation.

In addition, we found that the perceived effectiveness of the immigrant worker’s original complaint influenced the evaluations of the two features of the translation that were directly tied to the immigrant’s situation in the described scenario: successfully conveying cultural otherness, and helping the plight of the immigrant (functional effectiveness). The less people felt the original complaint was written effectively, the lower they evaluated the translation’s success in conveying cultural otherness (b = .17, p < .01) and functional effectiveness (b = .41, p < .001), and the higher they rated their wish to apply an interventionist approach to the translation (b = –.22, p < .01). Understandably, the control variable representing people’s belief in an interventionist ideology regarding translation in general had an effect only on the participants’ wish to intervene in the translation in question (b = .14, p < .05). Perhaps most interestingly, we found no difference between participants with previous experience in translation and non-translators in all respects: both populations exhibited the same exact inclinations, and same exact biases, with regard to their evaluations of the translation, and to their wish to intervene in it.

Discussion and conclusion

A fact not often acknowledged in translation studies is the increasingly small proportion of translations done by professional translators worldwide. However, it has been recently estimated that 99% of the translations produced globally are not done by professional human translators (Pym, 2020b). Much of this trend is a consequence of the rapid rise of MT. A significant aspect of this trend has been the growing use of MT in multilingual societies: people who lack linguistic accessibility increasingly find themselves in situations of power imbalance, with MT as a main medium of communication. Our findings suggest that, apart from the sociocultural and political factors at play in these situations, the role of people’s preconceptions of and existing bias against MT must also be taken into account—particularly if the MT-mediated communicative act involves moral considerations. In this respect, MT may be seen as representative of the broader phenomenon of AI-MC tools being integrated into cross-linguistic and cross-cultural communication.

The findings presented in this article open a door to research on the social implications of this trend. Our findings echo the studies that demonstrate the largely negative attitudes of professional translators to MT, but show that a negative disposition toward MT is common also to end users who are not translators (which, interestingly, does not correspond to the largely superficial, positive framing of MT in the press, see Vieira, 2020b). Non-translator end users, despite being by far the main users of the technology, have been largely overlooked in sociological-oriented research of MT. Our study shows for the first time, as far as we are aware, that the negative predisposition toward MT leads to a bias in the evaluations of various features of the MT product, and that this bias is found among translators and non-translators alike. Moreover, the different criteria suggested for evaluating MT in an ethically charged situation were found to be statistically distinct from each other, and internally reliable as dependent variables. They were all influenced by the bias against the machine, but the predisposition against MT was found to influence each of them differently in the scenario in question. In other words, we identified and statistically established a set of factors that may be considered a point of departure for the study of attitudes to MT in other social situations of communication.

Our findings suggest new directions of empirical research on evaluations of MT in other ethically charged contexts. The translation’s accuracy and reliability, its ability to convey cultural otherness, and its functional effectiveness in helping the person in need of the translation, may be perceived differently in different situational contexts. People’s beliefs in an interventionist approach to MT also require further examination in this respect. In our scenario, people wished to introduce changes to an MT product more than to a human translation as a direct consequence of their (bias-induced) lower evaluations of MT. It would be interesting to see whether MT-mediated communication in other scenarios also “motivates” an interventionist approach to the original message, and to try to better understand the implications of these interventions in the translation in future research.

Another important point raised by our findings has to do with the shifting expectations from MT, and how these should be taken into account in the assessment of factors involved in people’s perceptions of AI-MC more generally. Our findings suggest that we should not consider separately how people’s bias toward AI-MC manifests in their evaluations, and how their positive disconfirmation of prior expectations influence their evaluations. Rather, in a world in which technology constantly changes and people’s expectations change with it, a sound understanding of people’s perceptions of AI-MC requires taking both of these factors into account. The positive disconfirmation of prior expectations of lower quality from the MT product, together with the negative bias against MT, were shown to conjointly determine the eventual outcome of people’s evaluations of the MT product.

The ever-increasing volume of communication mediated by MT puts the meaning of our findings into sharp relief. It is clear that apart from the “objective” linguistic quality of MT, people’s predispositions toward MT may influence basic features of the ethically charged communicative acts it mediates. Future research is still needed to determine if distrust in MT may lead to overly negative perceptions of the informational, cultural, and emotional content of the translation, and, by extension, even of the users of the MT product themselves—some of whom are socially disadvantaged and strongly depend on it. What is clear, however, is that the bias against MT, demonstrated by translators and non-translator users alike, may have broad implications for intercultural communication worldwide. We need to understand it better to grasp the dramatic shifts that are occurring globally as a result of MT-mediated communication across cultures and social positions.

Footnotes

Appendix 1

Appendix 2 Appendix 3. List of items used for each variable and their factor loading.

Items	Factor 1	Factor 2	Factor 3	Factor 4	Factor 5	Factor 6	Factor 7
Dependent variables
The translation’s accuracy and reliability (.89)
The translation
is accurate						.94
is reliable						.69
The translation’s ability to convey cultural and emotional otherness (.87)
The translation
succeeds in conveying L.’s feelings	.67
succeeds in retaining the gist of what L. tried to convey	.63
conveys L.’s emotions in an effective way	.82
succeeds in evoking feelings just like the source text	.66
succeeds in bridging the cross-cultural gap between L. and the local culture	.50
succeeds in conveying the otherness of L.’s cultural background to the village authorities	.69
The translation’s functional effectiveness (.81)
The translation
may help L. in terms of the treatment he receives from the village officials			.60
may help L. achieve what he wants			.77
may help L. deal with the situation he got involved in			.76
exhibits a moral commitment to L.			.55
An interventionist approach to the translation (.86)
In the current circumstances, it would have been preferable if the translation had introduced changes to the source text		.83
It would have been better to add certain elements that did not originally appear in L.’s message		.77
Items	Factor 1	Factor 2	Factor 3	Factor 4	Factor 5	Factor 6	Factor 7
Dependent variables
It would have been better to omit certain elements that originally appeared in L.’s message		.55
A greater effort should have been put into the translation to convey L.’s message in an optimal way		.52
Changes should have been introduced to the translation to help L. achieve his goals		.81
Control variables
DisconfirmationI was positively surprised by the quality of the translation
Unjustness of the situation (.85)
The described situation reflects unjust circumstances				.93
The described incident represents an unfair situation				.95
The responsibility for the incident was faulty behavior on the part of the village’s security guards				.54
Interventionist ideology (.75)
In principle, the translator must always remain completely faithful to the source text (reversed)							.79
In principle, even if the translator wishes to help [the person in need of the translation], they must refrain from introducing any changes to the source text (reversed)							.75
The effectiveness of the original complaint (.79)
The complaint is effectively written					.44
L. has a good chance to receive compensation for his damaged goods					.87
L. has a good chance to receive compensation for the medical treatment he required					.87

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors received support for the research and publication of this article from the Bar-Ilan University Rector Grant for Interdisciplinary Research Groups.

ORCID iD

Omri Asscher

Author biographies

Omri Asscher is a Senior Lecturer at the Department of Translation and Interpreting Studies in Bar-Ilan University. His work explores the ideological role translation plays in homeland-diaspora frameworks, and, more recently, the implications of machine translation in intercultural communication. His book, Reading Israel, Reading America: The Politics of Translation between Jews, was published by Stanford University Press (2019).

Ella Glikson is a Lecturer at the Graduate School of Business Administration in Bar-Ilan University. Her research focuses on computer-mediated and cross-cultural communication in organizational and social settings, as well as human perceptions of artificial intelligence.

References

Bouillon

Gerlach

Spechbach

, et al. (2017) BabelDr vs Google translate: a user study at Geneva University Hospitals (HUG). In: 20th Annual conference of the European Association for Machine Translation, Prague, Czech Republic, 29–31 May. Available at: https://archive-ouverte.unige.ch/unige:94511

Bowker

Buitrago Ciro

(2015) Investigating the usefulness of machine translation for newcomers at the public library. Translation and Interpreting Studies 10(2): 165–186.

Cadwell

O’Brien

Teixeira

CSC

(2018) Resistance and accommodation: factors for the (non-) adoption of machine translation among professional translators. Perspectives: Studies in Translatology 26(3): 301–321.

Ciordia

(2017) A conceptual and contemporary approach to the evolution of impartiality in community interpreting. The Journal of Specialised Translation 28: 273–292.

Costello

Osborne

(2005) Best practices in exploratory factor analysis: four recommendations for getting the most from your analysis. Practical Assessment, Research and Evaluation 10(7): 1–9.

Cronin

(2012) Translation in the Digital Age. London: Routledge.

Davenport

(2018) Google translate processes 143 billion words every day. Android Police, 9 October. Available at: https://www.androidpolice.com/2018/10/09/google-translate-processes-143-billion-words-every-day/ (accessed 26 April 2021).

Dietvorst

Simmons

Massey

(2015) Algorithm aversion: people erroneously avoid algorithms after seeing them err. Journal of Experimental Psychology 144(1): 114–126.

Forcada

(2017) Making sense of neural machine translation. Translation Spaces 6(2): 291–309.

10.

Glikson

Woolley

(2020) Human trust in artificial intelligence: review of empirical research. Academy of Management Annals. Epub Ahead of Print 10 August. DOI: 10.5465/annals.2018.0057.

11.

Google (2016) Ten years of Google translate. In: Google Translate Blog. Available at: https://www.blog.google/products/translate/ten-years-of-google-translate/ (accessed 26 April 2021).

12.

Hale

(2007) Community Interpreting. New York: Palgrave Macmillan.

13.

Hancock

Naaman

Levy

(2020) AI-mediated communication: definition, research agenda, and ethical considerations. Journal of Computer-Mediated Communication 25(1): 89–100.

14.

Hassan

Aue

Chen

, et al. (2018) Achieving human parity on automatic Chinese to English news translation. Available at: http://arxiv.org/abs/1803.05567

15.

Hermans

(2009) Translation, ethics, politics. In: Munday

(ed.) The Routledge Companion to Translation Studies. London: Routledge, pp. 93–105.

16.

Hohenstein

Jung

(2020) AI as a moral crumple zone: The effects of AI-mediated communication on attribution and trust. Computers in Human Behavior 106: 106190.

17.

Hutchins

(2009) Multiple uses of machine translation and computerised translation tools. In: Proceedings of the international symposium on data and sense mining, machine translation and controlled languages (ed. S

Cardey

), Besancon, 1–3 July, pp. 13–20. Besancon: Université de Franche-Comté.

18.

Inghilleri

(2017) Translation and Migration. London: Routledge.

19.

Jago

(2019) Algorithms and authenticity. Academy of Management Discoveries 5(1): 38–56.

20.

Kotzé

(2014) Educational interpreting: a dynamic role model. Stellenbosch Papers in Linguistics Plus 43: 127–145.

21.

Lankton

McKnight

Thatcher

(2014) Incorporating trust-in-technology into expectation disconfirmation theory. Journal of Strategic Information Systems 23(2): 128–145.

22.

Läubli

Orrego-Carmona

(2017) When Google translate is better than some human colleagues, those people are no longer colleagues. In: Proceedings of the 39th Conference Translating and the Computer, One Birdcage Walk, 16–17 November, pp. 59–69. Geneva: Editions Tradulex.

23.

Lee

(2018) Understanding perception of algorithmic decisions: fairness, trust, and emotion in response to algorithmic management. Big Data and Society 5(1): 1–16.

24.

Lefevere

(1992) Translation, Rewriting, and the Manipulation of Literary Fame. London: Routledge.

25.

Melby

(2020) Future of machine translation: musings on weaver’s memo. In: O’Hagan

(ed.) The Routledge Handbook of Translation and Technology. London: Routledge, pp. 419–436.

26.

Nida

(2012) Principles of correspondence. In: L

Venuti

(ed) The Translation Studies Reader. London: Routledge, pp. 141–155.

27.

Nurminen

Koponen

(2020) Machine translation and fair access to information. Translation Spaces 9(1): 150–169.

28.

O’Hagan

(2020) The Routledge Handbook of Translation and Technology. London: Routledge.

29.

Pym

(2012) On Translator Ethics: Principles for Mediation Between Cultures. Amsterdam: John Benjamins.

30.

Pym

(2014) Exploring Translation Theories. London: Routledge.

31.

Pym

(2020a) Quality. In: O’Hagan

(ed.) The Routledge Handbook of Translation and Technology. London: Routledge, pp. 437–452.

32.

Pym

(2020b) Translator ethics: from cooperation to risk and trust (Translation Seminar Series). Centre for Translation, Hong Kong Baptist University. Available at: http://hkbutube.lib.hkbu.edu.hk/st/display.php?bibno=st969 (accessed 26 April 2021).

33.

Reiss

Vermeer

(2014) Towards a General Theory of Translational Action: Skopos Theory Explained. London: Routledge.

34.

Rossi

Chevrot

(2019) Uses and perceptions of machine translation at the European Commission. Journal of Specialised Translation 31: 177–200.

35.

Sakamoto

(2019) Why do many translators resist post-editing? A sociological analysis using Bourdieu’s concepts. Journal of Specialised Translation 31: 201–216.

36.

Scroope

(2017) Filipino culture. In: Cultural Atlas. Available at: https://culturalatlas.sbs.com.au/filipino-culture/filipino-culture-communication (accessed 26 April 2021).

37.

Sundar

Jia

Waddell

, et al. (2015) Toward a theory of interactive media effects (TIME): four models for explaining how interface features affect user psychology. In: Sundar

(ed.) The Handbook of the Psychology of Communication Technology. Malden, MA: Wiley Blackwell, pp. 47–86.

38.

Tabakowska

(2018) Emotional valuation: values and emotions in translation. In: Blumczynski

Gillespie

(eds) Translating Values: Evaluative Concepts in Translation. New York: Palgrave Macmillan, pp. 37–56.

39.

Toral

Castilho

, et al. (2019) Attaining the unattainable? Reassessing claims of human parity in neural machine translation. In: Proceedings of the 3rd conference on machine translation: research Papers, Brussels, 31 October–1 November, pp. 113–123. Stroudsburg, PA: Association for Computational Linguistics.

40.

Tymoczko

Gentzler

(2002) Translation and Power. Amherst, MA: University of Massachusetts Press.

41.

Venuti

(1998) The Scandals of Translation: Towards an Ethics of Difference. London: Routledge.

42.

Venuti

(2008) The Translator’s Invisibility. London: Routledge.

43.

Vieira

(2020a) Automation anxiety and translators. Translation Studies 13(1): 1–21.

44.

Vieira

(2020b) Machine translation in the news. Translation Spaces 9(1): 98–122.

45.

Vieira

O’Hagan

O’Sullivan

(2020) Understanding the societal impacts of machine translation: a critical review of the literature on medical and legal use cases. Information, Communication and Society. Epub ahead of print 16 June. DOI: 10.1080/1369118x.2020.1776370.

46.

Von Flotow

(1991) Feminist translation: contexts, practices and theories. TTR: Traduction, Terminologie, Rédaction 4(2): 69–84.

47.

Walinski

(2018) The cross-cultural conceptualization of ‘I’m lovin’ it’ between English and Polish. Kwartalnik Neofilologiczny LXV(4): 517–530.

48.

Wolf

(2012) The sociology of translation and its “activist turn.” Translation and Interpreting Studies 7(2): 129–143.

49.

Yao

Viswanath

Cryan

, et al. (2017) Automated crowdturfing attacks and defenses in online review systems. In: Proceedings of the 2017 ACM SIGSAC conference on computer and communications security, Dallas, TX, 30 October–3 November 3, pp. 1143–1158. New York: ACM. Available at: https://doi.org/10.1145/3133956.3133990

Human evaluations of machine translation in an ethically charged situation

Abstract

Keywords

Introduction

Literature Review

Machine Translation

Evaluating translation

Intervening in the translation

Research goals

Method

Sample

Procedure and stimuli

Measures

Dependent variables

Control variables

Results

Discussion and conclusion

Footnotes

Appendix 1

Appendix 2

Appendix 3. List of items used for each variable and their factor loading.

Funding

ORCID iD

Author biographies

References