Abstract
Social robots are increasingly used within public spaces, including museum settings. This quasi-systematic review identifies and synthesizes the evidence on social robots that have recently been deployed in museum settings. It specifically focuses on their intended purpose, their acceptability and factors important for successful human–robot interaction in this setting. Four databases (PsycINFO, SCOPUS, ACM Digital Library and IEEE Xplore) were systematically searched to retrieve literature published within the last 10 years on human–robot interaction studies with social robots deployed in museum settings. Due to the heterogeneous nature of the studies, qualitative and quantitative findings were summarized. A total of 604 items were identified, of which 12 were included in the review. Robots in 11 studies were physical and 1 was an embodied conversational agent presented as a virtual robot. In 75% of the studies (n = 9), the purpose of the robots was to act as museum guides, while in 17% (n = 2) they entertained visitors and in 8% (n = 1) the robot taught visitors in a museum outreach programme. Overall, many of the robots were found to be acceptable for use within museum settings. Three main themes for successful social human–robot interaction were evident across the findings: (1) facial expressions, (2) movement and (3) communication and speech. There is a great opportunity for social robots to be deployed within museum settings, as guides, educators, entertainers or a combination thereof. State-of-the-art methods have led to the development of museum robots that are more capable of social interaction; however, more work is required to develop speech capabilities that work in the ‘wild’. Future work should combine the factors that have been identified within this review to improve human–robot interaction.
Background
Social robots are commonly being developed for public spaces, including receptions and retail, 1 –3 restaurants and hospitality 4,5 and healthcare. 6 In the last 20 years, there has also been interest in deploying them in museum settings. 7 –10 Social robots are defined as robots with social capabilities, meaning they can interact with and assist humans in a natural manner. This makes them well suited to museum settings, where they can greet, educate or guide visitors.
Early and notable work in this field includes the autonomous RHINO robot, a mobile tour-guide robot implemented in Germany’s ‘Deutsches Museum’. 7 The robot played pre-recorded descriptions of the exhibitions. The robot was used for 6 days and successfully travelled 18.6 km. In total, it guided over 2000 visitors and completed a total of 2400 tour requests. Most importantly, RHINO increased overall attendance to the museum by at least 50%.
Other important work that has pioneered museum robots includes a long-term project by Willeke et al. 11 This project has led to the development of various autonomous and mobile robots including Chips, Sweetlips and Joe Historybot. The robots involved in the project have collectively achieved over 2000 operating days across various museums in the United States. Their purposes include greeting visitors, giving tours and showing additional information (e.g. videos) that brings exhibitions to life.
Lastly, a similar robot, Minerva, gave tours to visitors at the Smithsonian’s National Museum of American History. 12 During a 2-week field trial, the robot travelled more than 44 km and interacted with over 50,000 visitors. Unlike RHINO, Minerva had a face and could display emotions through changes in vocal tonality and facial expressions. When questioned, 36.9% of 63 people perceived Minerva as having intelligence similar to humans. However, 69.8% did not perceive Minerva to be ‘alive’, suggesting that its social interactivity capabilities were still limited.
Additionally, early museum robots did not adapt their interactions to different users. Current state-of-the-art methods have enabled social robots to personalize their interactions 13 –15 and interact with groups of people. 4,16 Recently developed museum robots also draw on knowledge/observations of interaction from human tour guides to use human-like gestures when interacting with visitors. 16,17 Early work is also limited in human–robot interaction studies, as there has been a focus on collecting and interpreting technical use data (e.g. operating time, number of interactions and travel distance) and determining accuracy in mobility. 7,11
To the authors’ knowledge, no reviews have been conducted to synthesize the evidence for social robots deployed in museum settings. The aim of this quasi-systematic review is therefore to identify and synthesize the evidence on social robots that have recently been deployed in museum settings. The research questions that guided the review were: What evidence is there for robots that have been deployed in museum settings in the last 10 years? What is their intended purpose and were they acceptable? What factors were important for successful human–robot interaction in this setting?
Method
We conducted a quasi-systematic review that aligns with the preferred reporting items for systematic reviews and meta-analyses (PRISMA) guidelines. 18 The review is considered quasi-systematic because, unlike traditional and methodologically robust systematic reviews, we did not formally assess the quality of the studies. We therefore did not exclude literature on the basis of quality. However, as is usual in systematic reviews, our methods are replicable and have been reported in a transparent manner. 19
To answer our aforementioned research questions, we only included human–robot interaction evaluation studies or experiments. Additionally, items had to use socially interactive service robots within any museum setting and had to be published in English. We limited our search to a 10-year time frame (2011–2021), as this would best cover the most recent state-of-the-art methods. Studies were excluded if they were not published in English, focused on technical aspects or the design of robotic systems, or included robots that did not interact with users (i.e. had no social capabilities). Conference papers or book chapters were only included if they were full-length and provided sufficient details on the research/evaluation methods.
Literature search and screening process
We searched the PsycINFO, SCOPUS, ACM Digital Library and IEEE Xplore databases on the 24th and 25th of March, 2021, to identify relevant studies. Keywords were separated by Boolean operators (AND, OR) and broadly included “museum*” AND “robot*” AND “evaluation” OR “experiment”. The asterisk (*) enabled different variations of each keyword to be searched. For example, robot* also retrieves “robots” and “robotics” as relevant keywords.
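For illustration, the trailing-wildcard behaviour of such database queries can be mimicked with a regular expression. The sketch below is only an assumption-laden approximation (database query syntax varies and the pattern name is hypothetical), not the actual search implementation used by the databases:

```python
import re

# Hypothetical illustration: a regex that mimics a database's
# trailing-wildcard expansion of the keyword "robot*".
ROBOT_PATTERN = re.compile(r"\brobot\w*", re.IGNORECASE)

def matches_robot_keyword(text: str) -> bool:
    """Return True if the text contains any variant of 'robot*'."""
    return ROBOT_PATTERN.search(text) is not None

# "robots" and "robotics" match; "android" does not.
print(matches_robot_keyword("A study of museum robots"))
print(matches_robot_keyword("Advances in robotics"))
print(matches_robot_keyword("An android tour guide"))
```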
A search timeline was created in Excel, which logged the date of each search, the databases searched and all of the literature identified. We then conducted a two-step screening process after removing any duplicates between the databases. First, the titles and abstracts of studies were read and assessed against the eligibility criteria. Second, the full texts of any possibly eligible studies were downloaded from the databases and screened for eligibility.
Data extraction process
Information from the studies was extracted into a coding sheet in Excel. This included the author and year of publication, the country and setting of the study, the type of robot used, the extent of mobility, level of autonomy, purpose and functions, evaluation method and outcomes. Due to the heterogeneous nature of the studies, qualitative and quantitative findings were summarized and presented in the manner that would best and most clearly answer the research questions. This means that meta-analyses were not conducted.
Results
Search summary
The initial database search yielded 604 studies, with a total of 21 duplicates between the databases. Screening identified 559 studies that did not meet all of the eligibility criteria, and these were therefore excluded. Twenty-four full-text studies were then screened against the eligibility criteria. Half were excluded for the following reasons: not including human–robot interaction (n = 3), focusing on technical aspects/design (n = 3), not using the robot in a museum setting (n = 2), not using robots (n = 2), the robot not having social capabilities (n = 1) and lastly, not being published in English (n = 1). In total, 12 publications were included in the review. The PRISMA 18 diagram illustrates the search process (Figure 1).

PRISMA diagram showing search process.
Research characteristics
The 12 included studies were published from 2011 to 2019. Seven were published as conference proceedings/papers, 8,10,16,20 –23 four as journal articles 9,17,24,25 and one as a book chapter. 26
All studies collected quantitative data (e.g. surveys, coded video recordings), except for two 8,21 which collected both qualitative and quantitative data. Four were conducted in Germany, 8,9,21,23 three in Japan 16,17,25 and one in each of the following countries: United Kingdom, 10 Australia, 24 Israel, 26 South Korea 22 and the United States. 20 Most (67%, n = 8) were conducted in real-life museum settings, including science, art, technology, space, robotics and migration museums. 9,16,20 –24,26 Three (25%) were conducted in simulated museum settings in university laboratories 8,10,25 and one (8%) used both simulated and real-life settings. 17 Across the studies, a total of 2405 children and adults participated to evaluate the robots, ranging from individual sample sizes of six 17 to 1607 participants. 20 Of the studies that reported the age of their participants, the youngest was aged 7 years old 17 and the oldest 65. 8 The studies are further summarized in Table 1.
Summary of the characteristics and interventions used within the included studies.
N/S: not stated; VGA: video graphics array; GUI: graphical user interface; TTS: text-to-speech; DOF: degrees of freedom.
The robots and their purpose
All studies used physical robots, apart from Bickmore et al., 20 who used Tinker, a virtual human-sized robot (embodied conversational agent) projected onto a screen. Kim et al. 22 and Ghosh and Kuzuoka 17 both used two robots (Genibo and Aibo, and Talk Torque 1 and 2, respectively). The remaining studies all used one robot. Three used NAO, 8,21,23 two used Robovie 16,25 and one each used the following robots: RoboThespian, 26 Robotinho, 9 Baxter 24 and P3 DX. 10 Figure 2 shows some of the robots that were used.

Regarding autonomy and mobility, five studies used autonomous robots, 17,20,21,23,25 two did not state the level of autonomy 16,22 and one used Wizard-of-Oz. 10 Four studies used autonomous features (e.g. face tracking) and supplemented these with Wizard-of-Oz controlled functions (e.g. speech). 8,9,24,26 For mobility, five were static but could move parts (e.g. arms, heads or facial expressions), 8,16,20,23,24 three were mobile 9,10,25 and two did not state mobility. 22,26 Two studies used two robots, of which one was mobile and the other was static. 17,21
The purpose of the majority (75%, 9/12) of the robots was to act as a museum guide. 8 –10,16,17,20,21,23,25 Two (17%) provided entertainment 22,24 by playing games with visitors or dancing for them. In one study, the robot acted as an educator, teaching child visitor groups as part of the museum’s outreach programme. 26 The lesson consisted of group teaching, hands-on practical work and an assessment quiz to determine any learning improvements.
The robot guides tended to have three main functions. First, they would identify and greet guests using facial or body detection. This was often also used to determine where the robot should gaze throughout the interaction. For example, in Yousuf et al., 25 the robot would identify a visitor if they faced the robot for 5 s or more. The virtual robot in Bickmore et al. 20 used motion sensors to detect nearby visitors and invite them to interact. Verbally, the robot guides would greet the visitor and ask if they would like to hear some information about the exhibits (e.g. Pitsch et al. 23 ). The second function was to interact with the visitor and present the exhibits, using natural and human-like body language. This included verbal descriptions and asking questions, changing facial expressions, gesturing and pointing and adjusting posture. 8 –10,16,17,20,21,23,25,26 The cheerful robot in Velentza et al. 10 also made jokes and used humour during the tour. Lastly, the robots closed the interaction by saying goodbye or wishing visitors a pleasant remaining visit. Overall, the tours were fairly short, ranging from approximately 2–3 min 16,21 to 10 min. 9
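The three guide functions described above (identify and greet, present, close) can be summarized as a simple interaction loop. The sketch below is purely illustrative: the function names and data structures are our own assumptions, and only the 5 s facing threshold is drawn from the reviewed literature (Yousuf et al.); it is not any study's actual implementation.

```python
from dataclasses import dataclass

# Yousuf et al.: a visitor facing the robot for >= 5 s was identified
# as an interaction partner (threshold reported in the review).
FACING_THRESHOLD_S = 5.0

@dataclass
class Visitor:
    facing_duration_s: float  # how long the visitor has faced the robot

def guide_interaction(visitor: Visitor, exhibits: list[str]) -> list[str]:
    """Illustrative three-phase guide loop: greet, present, close."""
    log: list[str] = []
    # 1. Identify and greet: only engage visitors who appear interested.
    if visitor.facing_duration_s < FACING_THRESHOLD_S:
        return log  # bystander not yet engaged; keep waiting
    log.append("greet: Hello! Would you like to hear about the exhibits?")
    # 2. Present each exhibit with verbal descriptions and gestures.
    for exhibit in exhibits:
        log.append(f"present: {exhibit} (gaze, point, describe)")
    # 3. Close the interaction.
    log.append("close: Goodbye, enjoy the rest of your visit!")
    return log

tour = guide_interaction(Visitor(facing_duration_s=6.0),
                         ["Mars rover", "Moon rock"])
```

In this toy version, a visitor who has faced the robot for less than 5 s yields no interaction at all, mirroring the reviewed finding that robots should target possibly interested visitors rather than everyone.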
Acceptability of the robots
Overall, many of the robots were found to be acceptable for use within museum settings. Yamazaki et al. 16 found that the robot increased visitor engagement and interaction with the guide as well as interaction and engagement between the visitors. Positive reactions were also reported by visitors in other studies. 9,22,23,26 Two studies reported that children in particular rated the robots positively and often surrounded them much more closely. 9,23 Visitors in Kim et al. 22 rated the overall service quality, customer satisfaction and customer loyalty highly and perceived sociability in the robots as significantly higher after their interactions.
The robot teacher in Polishuk and Verner 26 was reported to be friendly and responsive by 81% and 73% of participants, respectively. Importantly, learning outcomes were shown by an average score of 75.2% on the quiz, highlighting that the robot teacher was effective. The lesson itself was rated as pleasant by most (87%). The robot guide in Nieuwenhuisen et al. 9 was also attentive, as rated by 72% of the children and 52% of the adults. Overall, more than 75% perceived the robot guide as friendly and polite.
Three studies reported negative findings. 22,23,26 Pitsch et al. 23 found that participants tended to prefer traditional guides and resources (e.g. human guides, pamphlets) over novel electronic displays. Kim et al. 22 also found that while negative attitudes towards robots decreased for two of their validated subscales, they increased for the emotions subscale. This suggests that after the museum visit, respondents were afraid of robots with emotions. Lastly, Polishuk and Verner 26 reported some negative perceptions of their robot teacher, including limitations in making eye contact (reported by 20% of respondents) and delayed responses (reported by 21%).
Factors for successful social human–robot interaction
Across the literature, different conditions were explored to best understand what makes a social robot responsive, interactive and relational. Three main themes were evident across the findings: (1) facial expressions, (2) movement and (3) communication and speech.
Facial expressions
Two studies explored the impact of the robot’s face. 10,23 Pitsch et al. 23 found that flashing eyes while waiting for an interactive partner were perceived slightly more positively by participants than continuously lit eyes. Facial expressions also sometimes mattered. Regarding visitor enjoyment, there was no difference between a robot’s cheerful or serious face. 10 However, participants learned significantly more about the museum exhibit from the cheerful robot than from both robots together or from the serious one alone (56% vs. 38% and 37%). Using the cheerful and serious robots together was perceived as interesting and influenced experience ratings positively but resulted in less attention being paid to them.
Movement
Four studies explored movement. 8,17,23,25 How best to engage bystanders in the tour was contested. 8,23 In Pitsch et al., 23 participants perceived the robot as more available for interaction when it waited for someone to engage with it, rather than actively looking for interactive partners. However, Gehle et al. 8 found that when a visitor looks at the robot, it should immediately (<2.2 s) return their gaze before speaking to them in order to engage the bystander. It must, however, promptly detect gaze and body language first, to identify possibly interested visitors (rather than targeting everyone). Yousuf et al. 25 found similar results, whereby a robot that could orient its body and gaze towards visitors was more successful at inviting bystanders to interact than a conventional robot (83.3% vs. 40%).
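The engagement rule reported by Gehle et al. (return a bystander's gaze within roughly 2.2 s before speaking, and only for visitors who show interest) amounts to a simple timing check. The helper below is a hypothetical sketch of that heuristic, with invented names and a stand-in interest flag; it is not the authors' implementation:

```python
# Gehle et al.: the robot should return a visitor's gaze within ~2.2 s.
GAZE_RESPONSE_LIMIT_S = 2.2

def should_return_gaze(gaze_detected_at: float, now: float,
                       shows_interest: bool) -> bool:
    """Return True if the robot should look back at the visitor now.

    `shows_interest` stands in for the gaze/body-language check used to
    identify possibly interested visitors (rather than targeting everyone).
    Timestamps are in seconds on a shared clock.
    """
    elapsed = now - gaze_detected_at
    return shows_interest and 0.0 <= elapsed < GAZE_RESPONSE_LIMIT_S
```

A controller would call this on every perception update: once it returns True, the robot orients its gaze to the visitor and only then begins speaking.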
Body movement was also important throughout and at the end of the interaction. In Yousuf et al., 25 visitors were most attracted to the robot that was able to orient its body and positioning to show it was attentive (e.g. gazing towards them). As a result, they felt drawn to listen to it, involved in the tour and adequately attended to. The robot was successful at orienting its gaze and body position between 71% and 80% of the time. When disengaging visitors at the end of the tour, a combination of two gestures was found to be more effective than using them separately. 17 The gestures were leaning back while delivering a verbal summative assessment.
Communication and speech
Four studies also considered the impact of communication and speech on human–robot interaction. 16,17,20,21 In a study by Gehle et al., 21 visitors responded to the robots, with 64% uttering a re-greeting and 56% reacting to the farewell. Robots with personalizable relational skills were perceived best by visitors. This was evident in Bickmore et al., 20 whereby relational Tinker facilitated greater engagement, as measured by interaction time and the number of conversations held. Compared to the non-relational condition, relational Tinker showed empathy, became acquainted with visitors by asking about them, referred back to information provided by visitors, used humour and addressed visitors by name. Relational Tinker not only increased engagement but also resulted in greater visitor satisfaction and liking towards the robot. Participants even reported learning more and rated their relationship as similar to that of a close friend.
Other speech-related techniques were also used to facilitate engagement and disengagement. Yamazaki et al. 16 found that when the robot posed a puzzle about the museum content, visitors were more interactive, as demonstrated by laughter, mutual gaze and increased conversation. However, positive reactions were most common among those who knew the answers, indicating that additional clues should have been given so that all visitors would know the answer and respond positively. Gehle et al. 21 considered the structure of questions and found that museum visitors are more likely to verbally react to the robot when it asks a closed question, compared to an open one (84% vs. 32%). The authors further compared different types of question–answer structures (see Table 1) and found that closed questions with closed answers resulted in 76% of answers, compared to 65% for open questions and answers. Visitors often expressed non-understanding of the robot, made content-related requests and made orientation-related requests. Non-understanding was expressed in multi-modal ways, including visitors rotating their upper bodies away, asking for clarification after a pause and looking at one another. If these issues were not resolved, visitors were likely to disengage from the tour. However, to intentionally disengage visitors, Ghosh et al. 17 reinforced the importance of the robot providing a summative assessment.
Discussion
This review has synthesized 12 recently published studies on human–robot interaction within museum settings. Most of the robots were physical, and in three-quarters of the studies acted as museum guides. Overall, many of the robots were found to be acceptable for use within museum settings. Three main themes for successful social human–robot interaction were evident across the findings: (1) facial expressions, (2) movement and (3) communication and speech.
Positive perceptions of the museum robots were evident, highlighting that museums could be an appropriate setting for further development and implementation of social robots. While most of the studies explored museum guide robots, other less common but novel applications include teaching and entertainment. This reflects a cultural shift towards using robots in public spaces, including in education (e.g. Henkemans et al., 13 Park et al. 32 ). There is great opportunity to combine these applications into an entertaining robot guide that also presents teaching content and quizzes visitors or provides puzzles (as in Yamazaki et al. 16 ) for them to ponder on.
The literature was limited to the last 10 years of publication, highlighting state-of-the-art methods in facial detection, appropriate body positioning and movement. 8 –10,16,17,20 –26 Despite this, some of the studies (e.g. Nieuwenhuisen et al., 9 Herath et al., 24 Polishuk and Verner 26 ) used Wizard-of-Oz methods to assist the robot in understanding language. It is worth noting that while great progress has been made regarding self-localization and mobility, speech and linguistics still require much effort. Common challenges include robots sounding too ‘robotic’ 33 and being difficult to understand, as well as issues in understanding humans in the ‘wild’, where people may speak too quietly, have accents, speak simultaneously or present with speech impediments and differing needs.
In addition, the focus on interaction was mostly limited to facial expressions, movement and speech. Other research 34 –38 has highlighted that opportunities for personalized and effective human–robot interaction should also consider cultural norms in positioning and communication, as well as differing preferences for engagement. For example, Trovato et al. 39 developed a greeting selection model implemented on the ARMAR-IIIb robot to help it appropriately greet users from diverse cultural groups. The robot successfully changed its greeting style between participants (e.g. bow, nod, raise hand, hug or handshake). Drawing on these considerations may further enhance human–robot interaction.
It was also interesting to note that many of the robots were significantly different to earlier models. In our review, many museum robots were static, 16,20,21,23,24 placed on tables 8,21,23 and one was virtual. 20 In contrast, previous robots such as RHINO, 7 Chips, Sweetlips, Joe Historybot 11 and Minerva 12 were mobile and travelled between the exhibits. Evidently, large size and autonomous mobility are not requirements for social museum robots, which highlights that even smaller robots may be appropriate and engaging. Instead of moving great distances and navigating complicated crowds, roboticists and computer scientists can focus on designing robots that appropriately move their body parts (e.g. gaze and gestures) and have better interactive capabilities…or have no hardware at all!
Other potential avenues worth exploring are hybrid models, such as that employed in the HeritageBot project. 40 –43 While not socially expressive or previously applied to museum settings, this robot can safely navigate heritage sites through flight and walking. Experiments conducted indoors and outdoors with the HeritageBot III showed steady flight (even in mild wind) and consistent walking abilities across various terrain. 40 Hence, within closed museum spaces, the robot could, for example, guide people to specific exhibits, navigate crowds by flying over them or be used in emergency situations, like evacuations, to guide visitors to meeting points (or outside).
Design and research implications
The reviewed research highlights some important design implications. These are summarized in Figure 3. Specifically, we have learned that particular design decisions for a museum robot’s facial expressions, movement and speech and communication can lead to more successful human–robot interaction, including when engaging visitors to join a tour, during the tours and when attempting to disengage museum visitors at the end of a tour.

Image showing main features for designing social museum guide robots, with interaction between speech and movement components highlighted.
Facial expressions should be cheerful for optimal learning, and eyes should clearly portray when the robot is waiting for an interactive partner (e.g. by flashing). Movement while waiting for an interactive partner should be limited, but gaze and body positioning during interaction must be appropriate (oriented towards the dominant speaker and tour attendees). Robots should also lean back while verbally ending the tour. The last implications were for speech and communication, whereby robots should be as relational as possible, use puzzles and quizzes to engage visitors during the tour and ask closed-ended questions. Throughout the interaction, it is crucial that time is left for visitors to respond, and that the robot attempts to identify (e.g. by noticing confusion) and verbally resolve any issues in understanding.
Further research should continue to explore the impact of these design implications, especially in implementing multiple (or all) factors. This is because most of the literature reviewed has only focused on one or two of these factors. Combining these factors with general design implications for other adaptable social robots (e.g. in Gasteiger et al. 37 ) warrants further investigation. This may include changing behaviour to reflect cultural norms and adapting interactions in response to service expectations, communication styles and comfort with proxemics (i.e. positioning and distance from the robot).
It is also imperative that future research continues to be conducted in ‘the wild’, to best understand how social robots may be used in museum and public settings. This is because real-life environments may be messy and consist of interrupted, complicated or lengthy interactions between differing groups of people.
Strengths and limitations
The main strengths of this review include reducing selection bias by systematically searching for studies. In addition, the transparency of our selection, screening and data extraction process enhances the replicability of the review.
This review was limited to literature that was published within the last 10 years and was conducted in museum settings. While these limits helped to refine our search, it is likely that we have excluded some relevant work that may have been published slightly earlier or was conducted in similar public settings (e.g. art galleries). However, the inclusion of simulated museum settings (within labs) may have helped to broaden our review. A second limitation is that we did not search for or include any grey literature.
Conclusion
There is a great opportunity for social robots to be deployed within museum settings, as guides, educators, entertainers or a combination thereof. State-of-the-art methods have led to the development of museum robots that are more capable of social interaction, including in displaying expressions, fostering engagement and appropriate body and gaze positioning. Despite this, more work is required for developing speech capabilities that work in wild and real-life settings. Additionally, future work should focus on combining the factors that have been identified within this review, to improve human–robot interaction.
Footnotes
Author contributions
HSA obtained the funding and conceived the initial idea as the principal investigator of this project. NG located and synthesized the literature. NG drafted the manuscript and designed the figures. All authors discussed the results and approved the final manuscript.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Institute for Information and Communications Technology Promotion (IITP) grant funded by the Korea government (MSIP) (No. 2020-0-00842, Development of Cloud Robot Intelligence for Continual Adaptation to User Reactions in Real Service Environments).
