1. Introduction
Technological innovation impacts and accelerates advances in different professions, and interpreting is no exception. It is no exaggeration to claim that a new paradigm is emerging in interpreting studies: the so-called augmented paradigm, which concerns (translator and) interpreter productivity via partnering with technology (Mihalache, 2021). In April 2022, I had the pleasure of remotely interviewing Dr. Claudio Fantinuoli on this timely topic, which has surprisingly attracted scant attention from the interpreting community so far. He is the head of artificial intelligence (AI) at KUDO Inc., and senior lecturer and current member of the Centre for Augmented Interpretation at the University of Mainz. He is the editor of the first book entirely focused on interpreting and technology (Fantinuoli, 2018), and author of many publications.
2. Interview
Dr. Fantinuoli, I am glad to have the pleasure of interviewing you.
Your work gains insights from disciplines such as artificial intelligence (AI) and communication studies, and you have outlined your vision for preparation for the future of interpreting, in which the augmented interpreter plays a pivotal role. As technology is making its presence felt at great speed in the diversified field of interpreting, it seems that the augmented paradigm is being born in this field. What is at the core of this trend?
Thank you for this opportunity and for this interesting question. I think that we should see the origin, if you like, of this paradigm shift outside of interpreting. Everything that is taking place now in interpreting has happened before in other professions. In fact, every professional activity has been, is and will be transformed by technology, particularly by what we call artificial intelligence (AI), that is, the ability of machines to take over, at least to some degree, tasks that in the past were in the sole domain of humans. You can observe this happening in any walk of life and in any profession. Interpreting is not immune to this, and we are now beginning to see that it is affecting multilingual communication as well. Compared to other disciplines and other professions, this is happening quite late because interpreting has, on the one hand, a long tradition of being a very human-centric and cognitive-oriented activity, and on the other, it is a rather small profession, if you think for example of the relatively limited number of practitioners working as professional interpreters, and this has made it partly immune to changes. Notwithstanding the delay, the technologization of professional life has also started in interpreting, and this is bringing changes to the profession very quickly, to be honest. The reason for this fast pace is that advances in information and communication technology (ICT) and, particularly, in AI have been constant in the last few years, and now they have reached a level of maturity that allows them to enter real life. Socioeconomic changes are of course giving momentum to this.
So, you just mentioned that interpreting has lately started to see this shift. Are there possibly other reasons why this is happening now?
The main reason is really connected to the fact that we now have at our disposal technologies that, notwithstanding all their intrinsic limitations, are quite mature or in the process of achieving maturity. This is particularly true in the domain of natural language processing. We can now start using such advancements in real-life applications with the potential to make an impact on how multilingual communication is produced and consumed. There is also a more subtle reason: all sorts of professions are embracing and using advanced technologies, irrespective of whether they are doing so on the basis of a conscious or unconscious decision. This creates a sort of psychological group pressure to make use of technology whenever there is an opportunity for it. When you see that every profession is using new technologies, or you see mounting pressure coming from industrial stakeholders (as we know, money rules the world), people feel a pressure to improve their processes and to increase productivity while trying to maintain quality. I am not saying that technology is the solution to achieve these goals, but this is how society is engaging with such challenges at the moment. The consequence is that there is a lot of pressure to use more and more technology. This is especially true in societies dominated by free market principles.
This evolving human-machine interaction has started to challenge long-held assumptions about interpreting and transform the nature of basic interpreting sub-processes. Taking the emergence of the augmented paradigm into consideration, do you see the field as an interdiscipline, a transdiscipline, or a multidiscipline?
This is another great question. Interpreting and particularly interpreting studies have gone through different phases in the past. We had a strong focus on the cognitive aspects of language processing, that is, how interpreters’ brains work. This was true some decades ago, and it is having a revival right now. We also had a shift towards a more cultural-social framing of interpreting. This cultural turn in interpreting, following the path laid out in Translation Studies, has been an important paradigm change in the way we try to make sense of interpreting and, I would add, in the self-perceptions of its practitioners.
Now we may be going through a third shift, driven by the change of the entire ecosystem of interpreting, of which augmentation is a part. I call it the Technological Turn. Here technology is, on the one hand, creating a disruption in the daily work of interpreters and in the way multilingual communication is consumed, and on the other, and this is even more interesting for me, it is allowing us to frame interpreting using a different paradigm, the technological one. To give an interpreting-agnostic example, think about the advancements of AI in the language domain. AI is trying, partly successfully, to make machines process, understand, and produce texts in a similar way to humans. This is not only interesting as a practical field (the tools you can build), but it is also giving us a new way to look at humans, the brain, the mind, and our language instinct. This is opening new doors to understanding humans, what is unique in us or, conversely, what is not that special, and to explaining language itself.
Similarly, the human-machine interaction in interpreting, with all possible tools that you can use prior to and during interpreting, is interesting not only for its practical innovative potential, but also because it allows us to ask ourselves different questions about how interpreting works. More strikingly, when we see machine interpreting in action, that is, the full automation of the interpreting process, and we start to build and analyse such systems and see what they can do well and what they struggle with, we again have the opportunity to build a bridge to human interpreting and start asking new questions about the interpreting process. We are not limited to the cognitive or the social-cultural lenses, as in the past, but can extend them by comparing human interpreting to how machines behave. This kind of intellectual exercise leads to new questions and, possibly, to new answers. This is quite interesting, I think. Unfortunately, it is something which is still a blank page. There is not much effort going on in trying to frame interpreting through the lens of this technological turn. But I am convinced that this will happen sooner or later. Probably it did not happen before because of its novelty and the fact that most people believed that interpreting was immune to technologization. Now that, I hope, it has become obvious that this is not the case, through remote interpreting and the presence of AI, I guess that more and more people will get interested in this turn.
Even from a transdisciplinary perspective, if I have understood correctly?
Yes, absolutely.
AI is key to understanding this emerging paradigm shift in interpreting, and also to your own research, and automatic speech recognition (ASR) is one of AI's domains, with a history of only 70 years. With their under-researched cognitive and social embeddings, they are transforming how interpreting, one of the most ancient human practices, is done. What significant insights may the field of AI offer us in facing major challenges in the field of interpreting as we move into the post-pandemic future?
This is a difficult question because it is hard to know what is going to happen in the future and, most importantly, to make sense of the opportunities and the challenges that this technological turn poses to the post-pandemic profession. There is no doubt, at least in my opinion, that the act or task of interpreting is not going to change dramatically. The competences, skills, and the core activity itself will remain very similar with or without AI, as they have remained very similar with or without remote interpreting. It is obvious that interpreting remotely or interpreting onsite is a very different experience not only for the interpreter but also for the participants. It requires a lot of adaptation, which is not only a technical one, but it is an adaptation of the mind-set. For example, you need to be able to understand the context without being really present in that context. You need more cognitive capacity to be devoted to this task. In some circumstances, you may need to be able to jump into a meeting from one hour to the next because this is how modern society works. For a meeting onsite, you generally have a longer time before your meeting starts: you need to reach the location, and you will probably talk and interact with people before the interpreting begins. It is not just a button that is clicked on your computer and you are online and you start interpreting. So, this changes a lot in interpreting, but its core activity remains the same.
The same seems to apply with AI. An increasing number of applications are entering real life, the pace of adoption is expected to increase, and as an interpreter you will have to interact with these tools in order to provide your service. Even if some aspects of the interpreting process may change, the act of interpreting remains basically the same. You may be using, for example, speech recognition as a supporting tool when you did not understand something through the audio channel while interpreting, or when so-called problem triggers arise, such as numbers and proper names. This, of course, changes the cognitive equation of interpreting because you have a new channel that you must manage, in this case the speech recognition or the suggestion system. But at the end of the day, you still need to make sense of these inputs, and it is you who are in charge of producing the most suitable rendition. So, interpreting changes, but not dramatically.
AI support is offering the possibility to deliver high-quality interpreting even in conditions that are becoming more complex. High quality will become pivotal in the future of human interpreting. The reason is simple: machine interpreting is entering the scene. This does not mean that machine interpreting will replace human interpreters any time soon. It will rather add multilingual accessibility to events and for languages that are not and cannot be covered by humans. Only a fraction of events is interpreted nowadays. Machine interpreting will extend this. At the same time, however, it is important to understand that machine interpreting will replace, sooner rather than later, human interpreters in all those situations where quality does not really matter, where “good enough” is sufficient, where risks are low, and so on. While professionals are rightly focused on the high end of the market, we seem to forget that there are many of these “good enough” situations in real life. Similar to written translation, this will push up the quality that is expected from human interpreters. If you have to pay a human to provide a service that AI can deliver for no money, then you will do it consciously and only for those cases where humans can deliver an added value. The better machine interpreting becomes, the better interpreters will need to be or, conversely, interpreters that are not able to deliver that minimum quality will be replaced. In institutions such as the EU or the UN, interpreters are requested to offer the highest quality standards, which machines cannot match, maybe forever. Here, there should be no fear about AI taking over. The situation may be different in market segments where quality and risks matter less; segments that are served by less skilled or vocational interpreters, for example.
Interestingly, I think that the situation I just described will apply more to conference interpreting than to dialogic interpreting, where interactions between the parties, turn-taking, informality, and so on are extraordinarily complex. The issue I see is that quality can be defined in different ways. Communication is not a mathematical equation where the result is either right or wrong. Everybody perceives quality differently. There is no doubt in my eyes, however, that minimum quality requirements will be higher than today.
Going back to the role of AI support in interpreting, I would like to give another example. Think about the challenges posed by the increasing reduction of time-to-event. Our world is caged in this paradigm of velocity. Professions need to adapt to these changes. This is nothing new. In the past, the time that elapsed between booking an interpreter and the event was exceptionally long. In the last 10 years or so this has become shorter. Remote conferencing is accelerating this even further. In dialogic interpreting, there are companies that offer telephone/video interpreting for police, hospitals, and so on, and this service has to be very fast. To cope with this reduction of preparation time, AI may offer some solutions, for example, collecting and preparing information that is needed by the interpreter to perform well in a specific domain. Human-machine interaction can help interpreters to achieve this.
It is very interesting to reflect upon how AI can be effective in enhancing computer-assisted interpreting. If we want to look at interpreting under the impact of the emerging augmented paradigm through a different lens, we might say with the dawn of society-machine interaction, there is technology and there is interpreting. Clearly, technology has no ethics, but these two have a fundamentally ethical aspect to them. So, another field of study socially relevant for interpreting studies is digital ethics. How might insights from this field shape the social and moral existence of the augmented interpreter?
That is another particularly good question. Ethics is a big topic in AI right now, and it is going to be even more so in the future. Ethics is important in every walk of life, of course. However, think about social media and the manipulation of narratives: because of the novelty of AI and because of the power that AI can have over our society, it is paramount to reflect on and mitigate possible negative implications of AI. In interpreting I do not see anyone, especially scholars, discussing the ethical implications of using AI as a support for professional interpreters, or as an automatic interpreting tool. I would argue that this is a problem because understanding how technology may influence society or a profession, interpreting in our case, is something that requires a lot of thinking, discussion, and exchange of different opinions. It is something that cannot be improvised if you do not want to fall into stereotypes.
While ethics is a topic worth exploring, we need to bear in mind that ethical questions are not limited to AI in interpreting, but also apply to human interpreting. For example, it is common to have an international meeting where only a couple of languages are covered by human interpreters, because you cannot have it in a hundred languages. It is just impossible. It has never been like this, and it will never be like this. You end up covering only the most important languages. But this choice poses great ethical issues. What are important languages? Why is Italian, my mother tongue, rarely interpreted at international events? 99% of the time, I am forced to use a language I do not master as well as my own, while other foreign delegates can use their native language through interpreting because they have the luck of speaking what is considered an important language. Is this not an act of power? Or even discrimination? I fear that interpreters will start pointing to these dilemmas only when AI comes into play.
Going back to AI, we need to make a fundamental distinction. First, we have intrinsic issues in AI. Just think about the gender and racial bias of big language models. As far as human-machine interaction is concerned, where we have AI in the augmented fashion, these issues become secondary because the human interpreter can mitigate potential shortcomings. This is however quite different in machine interpreting when AI becomes an agent of multilingual communication. Without any filtering activity performed by humans, in our case interpreters, such intrinsic issues may become detrimental. While it is true that bias in language models can be fixed, interpreters should really engage with these and similar ethical topics. They may become arguments to explain to clients why they should prefer a human over a machine.
Second, what does the use of AI do to interpreters and the profession? AI may have a significant impact on interpreters themselves, and this means on their self-perception. What does it mean for a profession when AI becomes an agent invading its space? Where is the boundary between the interpreter deciding how to use AI and AI dictating how the interpreter should work? It suffices to think about the relationship between translators and post-editing to have a real-life example. This is a precarious balance, and it also poses ethical questions to people like me who design such tools: where is the place of AI in human-machine interaction and where is the place of the human in this interaction?
Another interesting question has to do with the social aspects of AI. What happens to how society perceives interpreters? Chances are that the role of the interpreter may be perceived as less relevant than today because of technology. It does not matter if this is true or not. Narratives and facts are not always the same. What happens if society starts perceiving interpreters somehow as a supporting tool for AI and not the other way around? Note that this is not the only possible outcome. It can turn out to be exactly the opposite, with AI fostering a bigger appreciation for human interpreters. What I fear is that the profession, especially academia, will wait too long to start a real discussion and will act only retroactively. I get criticised very often when I say this, but this is what happened with remote interpreting in the past. The community, the professional associations, etc. considered remote interpreting as something which would never become a reality at scale. There are videos, declarations, and commentaries out there, just a couple of years old, that are fun to read or watch from today’s perspective. Most people prefer to bury their heads in the sand and not look at things before they happen. But then it is too late. You need this capacity to look into the future and foresee what is going to happen. For interpreting it is not that difficult, because you have translation as a footprint. So, you can look at the evolution of translation. Of course, interpreting is quite different, but the evolutionary patterns are similar. Most importantly, you should start discussing these changes in a profoundly serious, non-ideological way before they happen. If you do it at a later stage, it is like running after a train that is leaving the station. You may catch it or not. Sometimes you are forced to take the next train. Sometimes there is no other train leaving the station. This is happening again with AI.
AI is coming both for interpreter-machine interaction and, very importantly, as an agent of automation in multilingual communication. Only a few people take this seriously. This is a problem in my eyes. I see many chances and opportunities coming from these technologies, both for the community to thrive and for society to profit from them. But an open, well-founded, informed discussion has yet to happen.
In your work (Fantinuoli, 2019), in addition to AI, you highlight two other areas key to the technological turn in interpreting: remote interpreting and computer-assisted interpreting. You also open up the possibility of these areas integrating into each other. Do you think what will conceptualise and unite them may be the augmented paradigm?
Yes, absolutely. They are going to become one. There is no doubt about this. It is just the natural evolution of technology. You have different areas of development. They happen separately, but they will converge at some point. And their convergence will be the essence of the augmented paradigm, a paradigm that will become as natural as many other things that we take for granted now. Nobody questions anymore why people do not go to a library to prepare for an assignment, for example. People just go online, Google here and there, find documents, and prepare. The same will happen with remote interpreting, which will not be the only form of delivery, of course, but it will become as normal as on-site work. Nobody will question why that event is happening online. In many cases, it will be the contrary: Why resort to ancient approaches, fly in people from different countries, increase the ecological footprint, only for a short meeting? The same will apply to augmented interpreting, which will converge into remote and, I think, also into on-site interpreting. On top of this, you will also see machine interpreting becoming integrated in the same communication space. And everything will coexist. We will see the same event taking place in more than one language, we will have human interpreters working in real time, interpreters will be supported by AI, and finally we will also have, in parallel, the full automation of the translation, for example, for the languages not served by humans. This is fascinating.
The fact that we will have these three technologies (remote conferencing, computer-assisted interpreting tools, and machine interpreting) in the same place and at the same time will open important questions. For example, if you see that the British delegation is listening to or reading the translation done completely by the machine, while the French delegation is using human translation, one question that will arise is why we need humans at all. If somebody is assuming that the machine is performing the task in a more or less acceptable way in one language, why do you want humans at all for another language or for a different event? We have no answers to this or related questions at the moment. Tentative explanations you can hear nowadays, for example that empathy or emotions can be processed exclusively by humans, are very weak. I am not against them. I do believe in fact that there is some partial truth in them, but they are useless to make a point. The profession, on the one side, and users, on the other, need convincing arguments to understand where the differences are. Look at machine translation. In some cases, machine translation delivers particularly good translations. Still there are many cases where it is obvious you do not want machine translation, or at least you do not want to rely 100% on a machine. But while written human and machine translation have a long tradition of discussion and analysis, we are not talking about it in interpreting. What will people answer to the many questions arising in a couple of years when human interpreters are assisted by machines, and they are working along with machine interpreting? If we are unprepared to give good answers, we will resort to the usual clichés. At the moment, we have not even started to ask the right questions.
That’s right. How will the upcoming augmented paradigm impact training curricula? How might they be able to adapt?
It is difficult to answer this question because there are several schools of thought and differences among countries. Everybody is approaching this differently. Some training institutions seem not to consider this as a topic worth pursuing, while others are integrating it massively into their curricula. Integrating technology into a curriculum is very difficult because you have many, often diverging, objectives during training. For me, there is no doubt that technology is only a secondary aspect of interpreting training. Interestingly enough, the more technologized we become, the more important the human-centric aspects of training become. That said, what I am trying to advocate for is that training in language technologies should not be reduced only to the acquisition of practical skills. Training should raise awareness about the fact that interpreting is and will be driven by technology, and what the consequences are. It is not about knowing how to use computer-assisted interpreting tools, or how to use a specific remote interpreting application. It is about being aware of the opportunities and challenges that technology brings with it. At the University of Mainz, where I teach, I am trying to persuade stakeholders that we don’t need to train young students only in how to use a particular technology. This is very basic knowledge. Download the tool, read the handbook, and use it. Not only is it super easy to do, but technology is also changing constantly, so what you learn today will be outdated next year. I am not sure if people agree with me on this, but I am really convinced that it is much more important that people understand how technology works, not how a particular application is used. For example, people have limited knowledge or even wrong assumptions around AI, machine learning, and all this stuff. Their knowledge is driven by media, companies, or the hype around it.
I would like people to dive into this topic because it is so crucial for our society, and thus for interpreting, and to make them understand what machine translation and speech translation are and how they work, what the ethical issues around machine learning are, what happens with our data, and so forth. I would like people to learn some programming too, because this is the language of our time, and if you want to reign over it, you need to understand it through experience. I am not calling for interpreters to be technologists. I am calling for interpreters to have sound knowledge about technology. I am shocked to see training institutions, especially at university level, having a merely utilitarian approach to education and training, especially when it comes to modern technologies.
Does this mean that interdisciplinary or transdisciplinary curricula are perhaps best suited to support or add value to the training of the augmented interpreter?
Yes, absolutely. Again, I cannot speak about every university around the world since I have a very limited European-centric view. While interpreting is by definition inter- and transdisciplinary, training in and outside universities seems to follow a utilitarian approach with a very limited scope. Students spend several years drilling interpreting, and every ancillary activity is functional to this practice. This circular thinking is a big problem, especially in a world where complexity is becoming the new paradigm. While our society is becoming more complex, professions are becoming more complex too, and therefore we need an educational approach that has complexity at its heart. This requires an interdisciplinary mind-set. When people in charge of interpreting curricula see the need for interdisciplinarity, they usually limit it to the study of the basics of institutional history, literature, cultural studies, etc., which I obviously think are important. But I am going beyond this, because I really think interpreting and translation need multidisciplinarity that goes beyond the humanities. I would like interpreting to have classes on AI in informatics departments, classes on the history of technological evolution in engineering departments. The same applies to economics and so forth. And, importantly, this knowledge should not be utilitarian, that is, designed to directly serve the interpreting activity. It should be genuine and profound. But universities have unfortunately embraced a utilitarian approach. They focus on training experts in a very specific and narrow field. I think this is an error. We need people to be able to look at the world and their profession from a 360-degree perspective.
How are research and practice domains progressing in terms of exploration and use of CAI tools? Are they making this progress at different speeds?
We are making some progress, but unfortunately the domain of CAI tools is covered by a very small number of people, concentrated in Europe and in Asia. The big issue is that because interpreting is so small as a profession and as a discipline, we are doing things on a small scale and what we learn is limited. The situation is even worse in the case of machine interpreting, where interpreting studies is completely absent from the scene. Research is done only from the computer science perspective, but we miss the perspective of human-centric evaluation, etc. What strikes me is that there is a big gap between the impact that technology is having on interpreting and the interest shown by scholars so far. I see many efforts in interpreting studies in a variety of interesting areas, especially in the socio-cultural aspects of interpreting. However, given the big impact that technology is having on interpreting and the fact that we are still at the beginning of this phase, I would expect more engagement from the community in studying, understanding and also giving directions to the future of interpreting. At the end of the day, we all want interpreters to be happy and a society that successfully navigates a multilingual world. I think we are missing a lot of opportunities.
Thank you very much for your time and for this interview.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
