Abstract
Exploring people’s attitudes toward the appearance design of social robots in a low-cost and efficient way, and enhancing the experience of human–robot interaction have always been topics of concern for robot developers and interaction designers. This study aimed to explore the influence of the baby schema effect on users’ perceptions of cuteness and trustworthiness pertinent to social robot faces through two experiments. Experiment 1 used 100 uniformly processed pictures of robot faces in the real world to help explore the linear relationship among the degree of baby face, cuteness, and trustworthiness, and received a total of 98 valid questionnaires via the Internet. Experiment 2 was a 5 × 3 within-subjects factorial design. The research variables were robot type (i.e. MAKI, RoboThespian, Flobi, Pepper, and iCat) and baby schema (low schema, uncontrolled, and high schema); their impact on users’ perceptions of cuteness and trustworthiness was investigated. A total of 175 valid questionnaires were collected via the Internet. The generated results are as follows: (1) The degree of baby face and perceived emotion of social robot faces had a positive impact on trustworthiness for most real-world robots. (2) This study obtained the correlation formula of baby face, cuteness, and trustworthiness from a quantitative point of view, thus providing a reference for research on the related credibility of communication robots. (3) In general, baby schema effect also existed in the cuteness evaluation of most real-world robots. Faces with high schema were considered cuter and more trustworthy than uncontrolled or low schema faces. (4) Robot type and baby schema had a significant interaction with cuteness and trustworthiness. (5) However, for certain types of robots, baby schema effect may also have a counter-effect, that is, an overly high baby schema may reduce users’ perceptions of the cuteness and trustworthiness of social robots.
Keywords
Introduction
Social robots in the modern era have gradually entered people’s lives as consumer products or life partners. In human–robot interaction (HRI), research has shown that making robots appear to interact with people in a manner similar to human behavior, or having a human-like appearance (such as having a human profile or facial features) can increase people’s feelings of trust and can thus improve the HRI experience. 1 –4 It is known that there are many factors that directly affect people’s attitudes toward social robots.
The concept of “baby schema,” proposed by Lorenz, posits that babies and children have a specific infantile appearance that makes adults instinctively desire to protect and nurture them and to perceive them as more naive, honest, and kind. 5 –8 Furthermore, the baby schema effect (BSE) was found to be applicable not only to human baby faces but also to baby animals, 9 dolls, 10 cartoon characters, 11 product design, 12 –16 and even social robots. 12,17 –21 In robot design, due to different use scenarios, robots play different roles and can exhibit different appearances and reactions. People’s impressions of their cuteness and trustworthiness are therefore bound to differ. Different types of robots have different applicable scenarios or groups, such as family environments 20 or medical care for the elderly. 22
Trustworthiness is the cornerstone of interaction and cooperation in human society, and it is the same in HRI. People are accustomed to using human social principles to interact with robots. 23 In the appearance design of social robots, making robots look cuter and more trustworthy can also improve the HRI experience. 19,24 –28 For example, Natarajan and Gombolay found that behavior and anthropomorphism of the agent are the most significant factors in predicting trust in and compliance with the robot. 26 BSE can provide people with a cute and innocent impression, thereby enhancing their trustworthiness. In the field of robot design, this theory has also been studied or applied many times. 19 –22,29 –32 Some scholars have tried to apply BSE to the application practice of social robots, 30 and some have conducted a series of feasibility studies using related concepts in specific fields such as the medical service industry. 31
However, the current research on the influence of baby schema on robot facial impressions is still limited and scattered. In particular, there is no BSE-related demonstration for a large range of real-world robots in the existing research. There are also no relevant research results related to BSE specifically targeting the variable “robot type.” The issues of concern need to be further discussed and explored.
Baby schema effect
Cuteness occupies both the aesthetic and affective realms. It comprises a set of visual and/or behavioral traits that trigger physical and emotional responses in people: what we might call the “Aww” factor. 33 In Japanese culture, “kawaii” is a catchphrase that is often translated as “cuteness” in English. 34 Because cute things evoke feelings of caring and nurture, many private corporations have invested in the creation of cute merchandise based on the aesthetic qualities of kawaii/cuteness, such as infantile features, bright colors, and soft textures. 35 Some things are so cute that people feel an urge to squeeze or bite them without intending any harm; this urge is called “cute aggression.” 36 Any discussion of cuteness, however, has to address one important effect, namely the BSE.
The concept of the Kindchenschema or “baby schema” proposed by Lorenz 5 is considered to be an innate releasing mechanism for humans. That is, babies and children have specific infantile appearance features (i.e. big eyes, a round face, a small chin, raised cheeks, and other facial features which make their faces look lovelier than adult faces) that serve as innate releasing mechanisms in adults to protect and nurture them. 6,8 Therefore, people tend to instinctively bias their behavior toward infants 37 and perceive them as more naive, honest, and kind. 38 Glocker et al. conducted a quantitative study in which they parametrically controlled the proportions of facial features in baby pictures. They found that participants judged a high schema face (i.e. round face, long forehead) to look better than a low schema face (i.e. narrow face, short forehead), and that high schema faces were rated as cuter and elicited a stronger willingness to take care of the baby. 39 In addition, some studies have shown that the baby schema is not limited to human faces. It is also applicable to certain animals, like kittens and puppies. 40 Other studies have confirmed that there is a significant gender difference in the way people respond to BSE. More specifically, women seem to be more sensitive and respond more intensely than men, 9,39,41,42 and gender differences may change with the age of the observer. 43
So far, there have been numerous real-world applications of BSE. For example, in a public relations crisis, the baby face of the CEO may affect the judgment of consumers. 44 BSE can also influence character design, 14,45 such as Mickey Mouse, Sonic the Hedgehog, 11 and teddy bears. 10 In the field of marketing and brand design 46 as well as product design, 13,15,16 BSE has been shown to help increase the attractiveness of brand logos or product designs to a certain extent. Miesler and his colleagues manipulated car fronts (with faces as controls) in accordance with the baby schema and combined facial electromyography with cuteness ratings to assess innate affective responses. People showed more positive affective responses to car fronts with a higher degree of baby face. The results confirmed that consumers’ affective responses to visual product designs are affected by evolutionarily implemented features (like people’s evolutionarily adaptive response to baby features). 15
BSE and the trustworthiness of social robots
As a growing number of social robots enter people’s lives, making robots look friendlier and more trustworthy is one of the important goals of designers. When social robots first appear in front of people, the first impression they make with their appearance greatly affects people’s attitudes toward the robots and the effectiveness of their interactions. The head and face of robots in particular can directly reflect the state or expression of the robot, giving people a preliminary feeling of believability or untrustworthiness. On the one hand, anthropomorphic—human like—characteristics seem to be critical components for consumers’ acceptance of robotic service, 18 but designers have also tried to avoid making robots fall into the uncanny valley (UV) effect zone. 18,47,48 It has been revealed that the closer a robot’s appearance and behavior are to humans, the easier it is for people to have positive emotions; however, after this positive emotion reaches a peak point, as the similarity increases, people will become terrified of robots. This is the so-called “UV effect.” When the similarity continues to rise to a level closer to that of humans, people will regenerate positive emotions toward robots. 47 For instance, abstract robot faces seem to be more common among robots already on the market. 49,50 On the other hand, if the robot shows cordial and friendly social cues, it will enter people’s lives more easily. 51 It seems that the robot does not necessarily have to be very similar in appearance to humans. Appropriate exaggeration or making the robot “look less like a human” seems to improve the trustworthiness of the social robot. In this regard, whether the BSE is applicable to robot design and makes the robot look more sincere and trustworthy is worth further exploration.
Some scholars have already used concepts related to the infant schema in robot design, such as the robot Flobi, which was designed to look like a little girl, 30 in addition to the appearance of some animal-shaped robots 20 and some practical applications in the development of robots. 52 Some scholars believe that the appearance of a robot affects the impact of its behavior on users’ perceived trustworthiness and empathy, and that more machinelike robots can be more suitable as companions than highly humanlike robots. 37 Powers and Kiesler 19 controlled the proportions of physical features of the robot’s head and found that participants were more willing to accept robots with baby-face features, that is, with big eyes and a short chin. In fact, the cute aesthetic has become one of the most pervasive aesthetics of commodity capitalism in the digital age, making objects more adoptable. Nonetheless, this aesthetic also handicaps robots in some ways, as it can evoke a range of negative effects. 53 Lukin and his colleagues explored whether cuteness affects moral judgments of a robot’s actions through two studies. Study 1 found that in a situation of forced medication, the actions of human and robot nurses are judged differently. Study 2 experimentally tested whether cuteness affects moral judgments of the robot’s actions. 7 Although the experiment did not show that the cute appearance of the robot directly affects people’s moral judgment of it, there are fairly strong theoretical grounds for expecting that a cute and beautiful appearance may affect the evaluation and judgment of the agent. Therefore, more studies along the lines of Lukin’s work are needed to explore this type of social interaction. 31
Yao Song and Yan Luximon 54 summarized the research trend on specific human, product, or robot facial anthropomorphic trustworthy features, dividing it into four streams: internal, external, combinations, and emotions, including baby face (cuteness). It seems that some scholars have already provided evidence that BSE has a positive effect on the trustworthiness of social robots. 21 In addition, robot type has been reported to be a significant and strong predictor of cuteness, alongside the robots’ interactive performance, such as head tilt. 32 (Robot type refers to the different manifestations of robots in appearance and some behaviors, such as humanoid robots and android robots.)
The importance of trustworthiness in HRI has attracted the attention of many scholars, 21,33,55 and the design space for the appearance of robots is huge. Yao Song and Yan Luximon systematically reviewed, evaluated, and summarized static facial features, dynamic features, their combinations, and related emotional expressions in their further exploration of facial anthropomorphic trustworthiness for social robot design. 21 To explore the effects of different combinations of baby schema facial features, especially the positions and sizes of the eyes and mouth, on facial anthropomorphic trustworthiness, they conducted a five-way mixed experiment (N = 270). The results indicated that people experienced a high level of facial anthropomorphic trustworthiness toward robots with baby schema features. 55 Złotowski et al. 56 examined how robot appearance and behavior in repeated interactions affect key factors for companion robots, namely perceived empathy, trustworthiness, and anxiety. They found that highly humanoid robots were perceived as less trustworthy and compassionate than machinelike robots with certain humanoid characteristics. 33 This raises a corresponding problem that matters to designers: how to measure people’s perceptions of robot trustworthiness accurately and objectively. Schaefer 27,28 aimed to establish a reliable and effective subjective scale to specifically measure human–machine trust that can be applied to multiple robotic fields, developing a 40-item trust scale to provide more sensitive and accurate trust scores which can be used in many robot applications from industry to military to everyday robots. Also, one study showed that behavior and anthropomorphism are the most important factors predicting trust in and compliance with robots. 26 Jian et al. (2000) argued that the concepts of general trust, human–human trust, and human–machine trust tend to be similar, and on this basis proposed a scale to help measure trust in human–machine systems. 57 Mathur and Reichling used methods from game theory to measure participants’ actual willingness to trust robots in a game with real financial consequences. In addition, robot pictures can be used to quickly test the target population’s response to certain traits, such as human-likeness. 25,55
However, although there have been some research results regarding the trustworthiness or BSE of social robots, there are still a series of problems that have not been resolved. Firstly, at present, there are few theoretical studies and findings on BSE on social robots. The relevant research only focuses on one to two specific or synthetic robots rather than on physical robots. Such results are obviously not universal. Secondly, there are many types of social robots available on the market, and the scope of design is huge. There are already some signs that robot type seems to be an important factor affecting the cuteness and trustworthiness of social robots. However, there still exists no significant evidence to explicitly explain the effect of robot type on the cuteness and trustworthiness of social robots. Thirdly, there is no clear conclusion in the discussion of the relationship between cuteness and trust. Last but not least, the facial expressions of robots or the emotions conveyed in the natural state also seem to have a huge impact on cuteness or trustworthiness, and these assumptions have not yet been clearly explained. To obtain a more general conclusion about the trustworthiness of robot faces, it seems difficult to only rely on one or two specific robot faces under investigation. In previous studies, animals or humans with high baby schema were considered to be cuter and more trustworthy. In robot design, the BSE on the impression of the cuteness of the robot’s face, and in particular, on trustworthiness, as well as the relevance of some similar factors such as robot types need to be further explored.
Based on the above, this study proposed the following hypotheses:
Experiment 1: Relationship between baby face, cuteness, and trustworthiness in wild-type robot faces
The purpose of Experiment 1 was to determine: (1) whether there is a positive correlation among the degree of baby face, cuteness, and trustworthiness; (2) whether the emotion of the robot face perceived by the participants affects the evaluation of the abovementioned factors; and (3) whether the relationship among these factors can be described by a linear regression model.
Stimuli
Experiment 1 used 100 real-world robot faces taken from Google Images. Because the baby schema is related to multiple features of the entire face, only robots with obvious facial features and complete head features were selected. In addition, the robots had to exist in real life, whether as experimental robots for research or consumer robots that already serve people; conceptual robots were thus excluded from this study. We adopted “social robot,” “robot head,” “robot,” and “humanoid robot” as keywords to filter pictures in the search engine within 1 day, and only the first picture of the same robot model was selected. 25
The search criteria were as follows: The robots had obvious head and facial features. The robot’s head was fully displayed from the top of the head to the chin. The robot’s face could not be turned more than three-fourths away from the frontal view. The selected robot existed in reality. Robots could not be marked as toys (to rule out toys that merely look like robots, as these usually cannot work semiautonomously or autonomously). Robot pictures could cover a wide range of types, including humanoid robots, animal-shaped robots, or mechanical-like robots. Pictures could not include a watermark or text overlay. The resolution of the robot picture could not be less than 600 × 600 pixels.
We accepted the first 100 face pictures satisfying the inclusion criteria, and the picture retouching software Adobe Photoshop was used to uniformly process the pictures. The background of the robot pictures was unified to white, with a resolution of 600 × 600 pixels, and they were numbered separately. One picture was not included because of the attention test (see Figure 1).

Wild-type robot face stimuli (Experiment 1) numbered and displayed in ascending order of cuteness score (from lowest to highest).
Participants and procedures
A total of 98 participants were recruited for Experiment 1. The basic information of the participants was as follows: The proportion of female and male participants was 38.72% and 61.28%, 77% of the participants had a university degree or above (18.72% had some college education and 3.4% were high school graduates or lower), most were aged 26–30 years old (31–40: 27.78%, 41 or above: 24.36%), and most had experience in interacting with robots (0–1 year: 42.55%, 1–2 years: 16.17%, 2+ years: 16.60%).
The participants of Experiment 1 were workers from Amazon Mechanical Turk. Mechanical Turk is Amazon’s crowdsourcing platform connecting online workers with requesters who have small tasks (called Human Intelligence Tasks, or HITs) to be completed online. For academic researchers, Mechanical Turk provides access to thousands of potential research study participants, and data can be collected quickly and relatively inexpensively. 58 Participants were thus recruited via the Internet. We screened workers on two conditions. First, the HIT approval rate (%) for all requesters’ HITs had to be greater than 95%. The HIT approval rate is the percentage of a worker’s submitted HITs that requesters have approved; if a requester finds that a questionnaire was filled out incorrectly, the worker’s submission can be rejected. Second, the number of HITs approved had to be greater than 50. Workers remained anonymous while filling in the questionnaire. After passing the questionnaire test, each worker was paid a certain amount of money, and the same worker could not fill in the questionnaire more than once. 25 To obtain a more general and objective evaluation, we did not set any special requirements regarding the nationality of the participants.
At the beginning of the questionnaire, the participants were shown a preview of all the robot faces and reminded that the questionnaire concerned only their subjective judgments and that there were no right or wrong answers, so they could answer with confidence. They were strongly advised to complete the survey on a desktop PC. The main body of the questionnaire was divided into three parts. First, the participants rated each robot picture, shown in random order, on baby face, cuteness, or trustworthiness. The evaluation dimension was a 0–100 scale. 50 The question of trustworthiness was as follows: “To what extent is this robot face trustworthy? (Please rate from 0 to 100 on the scale). Trustworthiness is your first impression of the robot before interacting with it. Can you imagine, in some specific occasions, such as in a shopping mall, when you want to ask the robot for help, to what extent would you consider it credible and so ask it for help?” Next, the participants rated the perceived emotion of each robot picture, shown in random order, on a scale ranging from −100 to 100. The question of perceived emotion was as follows: “To what extent does this robot face have a perceived emotion? (Please rate from −100 to 100 on the scale). 25 Both positive and negative emotions can be scored, where the more negative emotions can get more negative scores, and the more positive emotions can get more positive scores. You can evaluate according to your first impression of the robot.” Finally, the participants were asked to fill in their basic background information, including gender, age, educational background, and robot use experience.
To avoid fatigue, and more importantly, to prevent the participants from guessing the intent of the questionnaire, we randomly assigned the content of the first part, so that each participant saw only one of the three questions (baby face degree, cuteness, or trustworthiness). In addition, in the second part (perceived emotion), each participant evaluated only one-third of the 100 robot pictures. Within each part, the pictures were presented in random order. Questions were worded in the form “To what extent does this robot have a baby face? (Please rate from 0 to 100 on the scale).”
In addition, to screen out participants who lost concentration over the many ratings or scored without reading carefully, we added an attention check (i.e. participants were asked to set the score of the 84th robot face to exactly 89 on the scale) and retained only the questionnaires that passed it. 25
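The attention-check screening can be sketched as a simple filter. This is only an illustration under stated assumptions: the response layout (a dict mapping face number to score) and the names `passes_attention_check`, `keep_valid`, and `responses` are hypothetical; only the face number (84) and the required score (89) come from the text.

```python
# Hypothetical layout: each questionnaire maps face number -> score (0-100).
ATTENTION_FACE = 84   # face used for the attention check (from the text)
REQUIRED_SCORE = 89   # score required for that face (from the text)

def passes_attention_check(response):
    """True if the check face received exactly the required score."""
    return response.get(ATTENTION_FACE) == REQUIRED_SCORE

def keep_valid(responses):
    """Keep only the questionnaires that pass the attention check."""
    return [r for r in responses if passes_attention_check(r)]

# Made-up responses, for illustration only.
responses = [
    {1: 40, 84: 89},   # passes
    {1: 75, 84: 50},   # fails: wrong score on the check face
    {1: 10},           # fails: check face not rated
]
valid = keep_valid(responses)
```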
Results
A total of 99 robot faces were counted in the data analysis; one was not counted because it was used for the attention test. The mean scores of baby face, cuteness, trustworthiness, and perceived emotion of each face were calculated using descriptive statistics, and the SPSS 26.0 statistical software was used to perform the Pearson correlation coefficient test (see Table 1).
The Pearson correlation coefficient of four variables (baby face, trustworthiness, cuteness, and perceived emotion).
a The correlation is significant at the 0.01 level (two-tailed).
The results showed a positive correlation among the degree of baby face, cuteness, trustworthiness, and the perceived emotion of the robot faces (N = 99). Trustworthiness and perceived emotion were moderately correlated (r = 0.541, N = 99), suggesting that perceived emotion was moderately associated with the trustworthiness of the robot. Perceived emotion therefore needed to be strictly controlled in the following research (Experiment 2). The other pairs of factors were also significantly positively correlated; in descending order, the correlation coefficients were as follows: baby face and trustworthiness (r = 0.481, N = 99, p < 0.01), baby face and perceived emotion (r = 0.427, N = 99, p < 0.01), perceived emotion and cuteness (r = 0.367, N = 99, p < 0.01), baby face and cuteness (r = 0.358, N = 99, p < 0.01), and cuteness and trustworthiness (r = 0.340, N = 99, p < 0.01).
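As a minimal sketch of how such coefficients are computed, the following pure-Python functions calculate a Pearson correlation and the t statistic commonly used to test it. The function names and the sample scores are illustrative assumptions, not the study's data or its SPSS procedure.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Made-up per-face mean scores, for illustration only.
baby_face = [20.0, 35.0, 50.0, 65.0, 80.0]
trust = [30.0, 45.0, 40.0, 60.0, 70.0]
r = pearson_r(baby_face, trust)
```

For the smallest coefficient reported above (r = 0.340, N = 99), `t_statistic(0.340, 99)` gives a t value of roughly 3.6 on 97 degrees of freedom.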
We further performed a multiple linear regression analysis with trustworthiness as the dependent variable and the remaining three variables as potential independent variables. The model had an adjusted R² of 0.357. The significant variables were baby face (p = 0.003 < 0.05) and perceived emotion (p < 0.001) (see Table 2). The regression coefficients of baby face and perceived emotion were both positive, which meant that both variables significantly and positively affected the trustworthiness of the robot faces.
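The regression step can be illustrated with a small ordinary least squares sketch in pure Python. It is not the SPSS procedure used in the study: the function names are hypothetical, the data are made up, and the coefficients are obtained by solving the normal equations with Gauss-Jordan elimination.

```python
def ols_fit(X, y):
    """Return [b0, b1, ..., bp] minimizing squared error (normal equations)."""
    rows = [[1.0] + [float(v) for v in r] for r in X]   # prepend intercept
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    for c in range(k):                                   # Gauss-Jordan
        p = max(range(c, k), key=lambda q: abs(A[q][c])) # partial pivoting
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for q in range(k):
            if q != c:
                f = A[q][c]
                A[q] = [vq - f * vc for vq, vc in zip(A[q], A[c])]
    return [A[i][k] for i in range(k)]

def adjusted_r2(X, y, coefs):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1)."""
    n, p = len(y), len(coefs) - 1
    pred = [coefs[0] + sum(b * v for b, v in zip(coefs[1:], r)) for r in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Made-up data: two predictors per face and an outcome built as 1 + 2*x1 + 3*x2.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1, 3, 4, 6, 8]
coefs = ols_fit(X, y)
fit = adjusted_r2(X, y, coefs)
```

In this toy example the outcome is an exact linear function of the predictors, so the fit is perfect; real rating data would leave residual variance.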
The regression coefficient table of the dependent variable (trustworthiness).
VIF: variance inflation factor.
There was no multicollinearity between the independent variables of the linear regression model (the VIF of baby face = 1.298 < 5, the VIF of perceived emotion = 1.307 < 5). The standardized residuals obeyed normal distribution (see Figure 2), and the Durbin–Watson value was 2.113.
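The two diagnostics mentioned here have simple closed forms, sketched below. The function names are illustrative; the VIF of a predictor is 1/(1 − R²), where R² comes from regressing that predictor on the other predictors, and the Durbin–Watson statistic is the sum of squared successive residual differences divided by the sum of squared residuals.

```python
def vif(r2_j):
    """Variance inflation factor, given R^2 of predictor j on the others."""
    return 1.0 / (1.0 - r2_j)

def durbin_watson(residuals):
    """Durbin-Watson statistic; ranges from 0 to 4, about 2 when residuals
    show no first-order autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# R^2 implied by the reported VIF of baby face (1.298): about 0.23.
implied_r2 = 1.0 - 1.0 / 1.298

# Made-up residuals that alternate in sign (strong negative autocorrelation).
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
```

A reported value of 2.113 is close to 2, which is consistent with little first-order autocorrelation in the residuals.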

Normalized P–P plot of standardized residuals.
Based on the above analysis, the quantitative relationship (i.e. regression equation) between the trustworthiness and baby face of the robot faces and the perceived emotion of the robot faces is as follows:
Experiment 2: Influence of robot type and baby schema on the cuteness and trustworthiness in controlled robot faces
The purpose of Experiment 2 was to determine: (1) whether there was a significant main effect of robot type and baby schema on the cuteness and trustworthiness of social robot faces and (2) whether there was a significant interaction between robot type and baby schema on the cuteness and trustworthiness of the social robot faces.
Stimuli
In Experiment 2, five robot faces from Experiment 1 were chosen for further analysis. They were MAKI (an emotive robot capable of interacting with humans) (https://www.hello-robo.com/), RoboThespian (developed for interaction and communication with people in public environments), 59 Flobi (based on the appearance of a little girl), 30 Pepper (a humanoid robot which was developed by SoftBank Robotics), 60 and iCat (a desktop user-interface robot with mechanically rendered facial expressions). 52 They were chosen because they had different appearance types. In addition, there was no significant difference between their perceived emotion scores in Experiment 1, which was a good way to rule out the evaluation interference caused by perceived emotion.
Because the degree of baby schema was related to a total of six parameters (see Table 3) in this stage of the research, we first selected the existing robot pictures with obvious facial features as the reference for making the experimental samples. Secondly, the height of each robot head was uniformly set to 600 pixels, and the aspect ratio of the robot head remained unchanged when scaling.
The description of the control process of the baby schema. 39
Adobe Photoshop 2020 was adopted to help process the experimental pictures to make them conform to the ratio of different treatments. We first marked the measurement points of the sample faces using a coordinate system to superimpose on the face so that the horizontal axis was connected to the inner corner of the eye, and the vertical axis passed through the centerline of the nose. The facial measurements were obtained by measuring the distance between the following landmarks: A (top of the head), B (bottom of the chin), C and D (outer edge of the face along the horizontal axis), E1 and E2 (inner corners of the eyes), F1 and F2 (the outer corners of the eyes), O (the base of the nose at the intersection of the horizontal and vertical axes), H (below the tip of the nose), I and J (the widest point of the nose), and K and L (outside the mouth). 40 Then, according to the six parameter change principles related to the baby schema 39 (see Table 3), the robot facial feature ratios were adjusted separately, and the adjustment range was controlled between 5% and 20% of the original facial parameters (see Table 4).
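The landmark-based measurement can be sketched as follows. This is an illustration under loud assumptions: Table 3 is not reproduced here, so the six ratio definitions below are assumed, modeled on the kind of proportions used in parametric baby schema work, and may differ from the ones actually used. The coordinates, the function names, and the `adjusted` helper are hypothetical; only the landmark labels and the 5% to 20% adjustment range come from the text.

```python
import math

# Hypothetical landmark coordinates in pixels (x, y), with O at the origin.
LANDMARKS = {
    "A": (0, 300), "B": (0, -300),        # top of head, bottom of chin
    "C": (-200, 0), "D": (200, 0),        # outer edges of the face
    "E1": (-40, 0), "F1": (-120, 0),      # inner / outer corner of one eye
    "E2": (40, 0), "F2": (120, 0),        # inner / outer corner of other eye
    "O": (0, 0), "H": (0, -90),           # base of nose, below the nose tip
    "I": (-30, -90), "J": (30, -90),      # widest points of the nose
    "K": (-60, -180), "L": (60, -180),    # outer corners of the mouth
}

def schema_ratios(lm):
    """Six facial proportions (ASSUMED definitions, see lead-in)."""
    d = lambda p, q: math.dist(lm[p], lm[q])
    head_length = d("A", "B")
    face_width = d("C", "D")
    eye_width = (d("E1", "F1") + d("E2", "F2")) / 2
    return {
        "face_width": face_width / head_length,
        "forehead_length": d("A", "O") / d("O", "B"),
        "eye_width": eye_width / face_width,
        "nose_length": d("O", "H") / head_length,
        "nose_width": d("I", "J") / face_width,
        "mouth_width": d("K", "L") / face_width,
    }

def adjusted(value, percent, increase=True):
    """Scale a measurement by 5-20%, the adjustment range stated in the text."""
    if not 5 <= percent <= 20:
        raise ValueError("adjustment must be between 5% and 20%")
    factor = 1 + percent / 100 if increase else 1 - percent / 100
    return value * factor

ratios = schema_ratios(LANDMARKS)
```

Working with ratios rather than raw pixel distances keeps the proportions comparable across robot heads of different shapes once the head height has been normalized.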
Criterion of measurement pictures in Experiment 2.
To ensure a consistent increase or decrease ratio between the same parameters, unrelated facial features such as eyebrows and eye distance were fine-tuned to a certain extent based on the changes in the main features, so that the adjusted faces looked more coordinated. Finally, a total of 15 face sample pictures of the five robots were obtained through the adjustment of facial parameters. The robot pictures of the uncontrolled face were the original images of the robot. They had not been processed by the software, and the parameter adjustment of the high or low schema was based on the proportional adjustment of the uncontrolled face. The backgrounds were unified to the same background color, and the size of the pictures was unified as 800 × 800 pixels.
In particular, to avoid interference caused by other factors as much as possible, we uniformly adjusted the color of the robots’ irises to blue (see Table 5). Blue was chosen because the irises of several robots had no distinct color, while those of the others were already blue. In this way, we could prevent participants’ subjective judgments of iris color from interfering with the evaluation of facial preference.
Sample robot face pictures adopted in Experiment 2.
Experimental participants and procedures
The experiment was carried out using a 5 × 3 within-subject factor design, and the experimental factors were robot type (MAKI, RoboThespian, Flobi, Pepper, and iCat) and baby schema (low schema, uncontrolled, and high schema).
This experiment was carried out in the form of an online questionnaire via Amazon Mechanical Turk. The worker screening conditions were the same as for Experiment 1. The participants were asked to observe 15 pictures of robot faces and then answer the corresponding questions. The scoring dimensions were taken and modified from the Godspeed questionnaire, 60 comprising the five items “dislike–like,” “unfriendly–friendly,” “unkind–kind,” “unpleasant–pleasant,” and “awful–nice”; the cuteness impression of each robot face was obtained as the average score of these items. In addition, the “untrustworthy–trustworthy” and “optimistic–pessimistic” items were added using a semantic differential scale of 1–5, in which the optimistic–pessimistic item was used to conceal the research intent. Each picture and the question options were presented in random order.
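The cuteness composite described above is simply the mean of the five item ratings for a given picture. A minimal sketch, assuming a hypothetical dict layout keyed by item name and made-up ratings:

```python
# The five items averaged into the cuteness impression (from the text).
CUTENESS_ITEMS = [
    "dislike-like",
    "unfriendly-friendly",
    "unkind-kind",
    "unpleasant-pleasant",
    "awful-nice",
]

def cuteness_score(ratings):
    """Average the five cuteness items; other items are ignored."""
    return sum(ratings[item] for item in CUTENESS_ITEMS) / len(CUTENESS_ITEMS)

# Made-up ratings for one participant and one picture.
ratings = {
    "dislike-like": 4,
    "unfriendly-friendly": 5,
    "unkind-kind": 3,
    "unpleasant-pleasant": 4,
    "awful-nice": 4,
    "untrustworthy-trustworthy": 4,   # scored separately as trustworthiness
    "optimistic-pessimistic": 2,      # filler item, not analyzed
}
score = cuteness_score(ratings)
```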
Results
A total of 175 valid questionnaires (44 were excluded because they failed the attention test) were collected in the experiment. The data were analyzed by analysis of variance (ANOVA) using SPSS 26.0, with the significance level α set to 0.05.
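For a fully within-subjects design, the F ratios are usually formed by testing each effect against its own effect-by-subject interaction. The sketch below implements that partition in pure Python. It is an assumption that this repeated-measures partition matches the SPSS analysis reported here; the function names and the simulated scores are hypothetical, and p-values are omitted because they would additionally require the F distribution from a statistics package.

```python
import random
from itertools import product

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def rm_anova_2way(y):
    """y[s][i][j]: score of subject s at level i of factor A, level j of B.
    Returns {effect: (F, df_effect, df_error)}."""
    n, a, b = len(y), len(y[0]), len(y[0][0])
    S, A, B = range(n), range(a), range(b)
    g = mean(y[s][i][j] for s, i, j in product(S, A, B))
    m_a = [mean(y[s][i][j] for s in S for j in B) for i in A]
    m_b = [mean(y[s][i][j] for s in S for i in A) for j in B]
    m_s = [mean(y[s][i][j] for i in A for j in B) for s in S]
    ss_a = n * b * sum((m - g) ** 2 for m in m_a)
    ss_b = n * a * sum((m - g) ** 2 for m in m_b)
    ss_s = a * b * sum((m - g) ** 2 for m in m_s)
    ss_ab = n * sum((mean(y[s][i][j] for s in S) - m_a[i] - m_b[j] + g) ** 2
                    for i in A for j in B)
    ss_as = b * sum((mean(y[s][i]) - m_a[i] - m_s[s] + g) ** 2
                    for s in S for i in A)
    ss_bs = a * sum((mean(y[s][i][j] for i in A) - m_b[j] - m_s[s] + g) ** 2
                    for s in S for j in B)
    ss_total = sum((y[s][i][j] - g) ** 2 for s, i, j in product(S, A, B))
    ss_abs = ss_total - ss_a - ss_b - ss_s - ss_ab - ss_as - ss_bs

    def f_ratio(ss_eff, df_eff, ss_err):
        df_err = df_eff * (n - 1)   # error term: effect x subject interaction
        return (ss_eff / df_eff) / (ss_err / df_err), df_eff, df_err

    return {
        "A": f_ratio(ss_a, a - 1, ss_as),
        "B": f_ratio(ss_b, b - 1, ss_bs),
        "AxB": f_ratio(ss_ab, (a - 1) * (b - 1), ss_abs),
    }

# Simulated data: 20 subjects, 3 levels of A (true effect), 2 levels of B (none).
rng = random.Random(0)
scores = [[[4.0 + i + rng.gauss(0, 0.5) for j in range(2)]
           for i in range(3)] for s in range(20)]
result = rm_anova_2way(scores)
```

Under this partition, a 5 × 3 design with 175 participants would give degrees of freedom of (4, 696) for robot type, (2, 348) for baby schema, and (8, 1392) for their interaction.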
As shown in Table 6, the differences in the cuteness of the robots’ faces were analyzed regarding the robot type and baby schema. The ANOVA results showed that there was a significant main effect on the cuteness impression scores of robot type (F = 33.075, p = 0.000 < 0.01) and baby schema (F = 3.365, p = 0.036 < 0.05). The post hoc comparison found that the participants thought that the robot Pepper (M = 5.29, SE = 0.07) was significantly cuter than MAKI (M = 4.84, SE = 0.08), Flobi (M = 4.46, SE = 0.10), and RoboThespian (M = 4.25, SE = 0.10). At the same time, Pepper (M = 5.25, SE = 0.07) was significantly cuter than iCat (M = 4.77, SE = 0.10), Flobi, and RoboThespian. There was no significant difference between MAKI and iCat. In addition, the main effect of baby schema also revealed significant differences. The cuteness score of the high baby schema robot face (M = 4.77, SE = 0.07) was significantly higher than the low baby schema robot face (M = 4.67, SE = 0.07). There was no significant difference between high baby schema and uncontrolled baby schema, or between the uncontrolled baby schema and low baby schema.
Table 6. Two-factor ANOVA table of the “cuteness” score.
ANOVA: analysis of variance; Rt: robot type; Bs: baby schema; SS: sum of squares; MS: mean squares; df: degree of freedom.
b Significantly different at α = 0.01.
There was a significant interaction effect between robot type and baby schema on cuteness (F = 3.544, p < 0.001). The results indicated that the high schema of the MAKI robot (M = 4.91) was considered cuter than the low schema (M = 4.75) and the uncontrolled one (M = 4.84). The high schema of the RoboThespian robot (M = 4.27) and the uncontrolled one (M = 4.27) were considered cuter than the low schema robot face (M = 4.20). The uncontrolled schema of the Flobi robot (M = 4.61) was considered cuter than the low schema face (M = 4.43) and the high schema face (M = 4.35). The high schema of the Pepper robot (M = 5.43) was considered cuter than the uncontrolled schema face (M = 5.27) and the low schema face (M = 5.12). The result for the iCat robot was the same as that for Pepper: the high schema face (M = 4.87) was considered cuter than the uncontrolled schema face (M = 4.73) and the low schema face (M = 4.72) (see Figure 3).

Figure 3. The interaction diagram of robot type and baby schema on the cuteness score.
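The interaction pattern for cuteness can also be read directly from the cell means reported in the text. The following sketch tabulates those means and lists, for each robot, the schema level or levels with the highest mean; the dictionary layout and function name are illustrative choices, and the comparison is purely descriptive (it does not test significance).

```python
# Cuteness cell means as reported in the text (robot type x baby schema).
CUTENESS_MEANS = {
    "MAKI":         {"low": 4.75, "uncontrolled": 4.84, "high": 4.91},
    "RoboThespian": {"low": 4.20, "uncontrolled": 4.27, "high": 4.27},
    "Flobi":        {"low": 4.43, "uncontrolled": 4.61, "high": 4.35},
    "Pepper":       {"low": 5.12, "uncontrolled": 5.27, "high": 5.43},
    "iCat":         {"low": 4.72, "uncontrolled": 4.73, "high": 4.87},
}


def top_levels(cell_means):
    """Return every schema level that shares the highest mean (ties kept)."""
    best = max(cell_means.values())
    return [level for level, mean in cell_means.items() if mean == best]


peaks = {robot: top_levels(means) for robot, means in CUTENESS_MEANS.items()}
for robot, levels in peaks.items():
    print(f"{robot}: highest cuteness at {', '.join(levels)}")
```

Flobi is the only robot whose highest mean falls at the uncontrolled level alone, which is the pattern the interaction effect reflects.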
As shown in Table 7, the main effect of robot type on the trustworthiness impression scores was significant (F = 18.801, p < 0.001). The post hoc comparison found that the robot Pepper (M = 5.21, SE = 0.07) was significantly more trustworthy than MAKI (M = 4.86, SE = 0.09), Flobi (M = 4.48, SE = 0.10), and RoboThespian (M = 4.46, SE = 0.10); at the same time, Pepper (M = 5.21, SE = 0.07) was significantly more trustworthy than iCat (M = 4.78, SE = 0.10), Flobi, and RoboThespian. There was no significant difference between MAKI and iCat, or between Flobi and RoboThespian. Moreover, the main effect of baby schema was also significant (F = 4.462, p = 0.012 < 0.05). The post hoc comparison revealed that the trustworthiness score of the high baby schema robot face (M = 4.82, SE = 0.07) was significantly higher than that of the low baby schema robot face (M = 4.68, SE = 0.08), and that the score of the uncontrolled robot face (M = 4.78, SE = 0.07) was also significantly higher than that of the low baby schema robot face.
Table 7. Two-factor ANOVA table of the “trustworthiness” score.
ANOVA: analysis of variance; Rt: robot type; Bs: baby schema; SS: sum of squares; MS: mean squares; df: degree of freedom.
b Significantly different at α = 0.01.
Furthermore, there existed a significant interaction effect between robot type and baby schema on trustworthiness (F = 3.424, p = 0.001 < 0.05). The results indicated that the high schema of the MAKI robot (M = 4.97) was considered more trustworthy than the low schema (M = 4.81) and the uncontrolled one (M = 4.80). The high schema of the RoboThespian robot (M = 4.54) and the uncontrolled one (M = 4.53) were considered more trustworthy than the low schema robot face (M = 4.31). The uncontrolled schema of the Flobi robot (M = 4.61) was considered more trustworthy than the low schema face (M = 4.54) and the high schema face (M = 4.29). The high schema of the Pepper robot (M = 5.54) was considered more trustworthy than the uncontrolled schema face (M = 5.17) and the low schema face (M = 5.06). The result of the iCat robot was similar to that of Pepper. The high schema face (M = 4.90) was considered more trustworthy than the uncontrolled schema face (M = 4.79) and the low schema face (M = 4.66) (see Figure 4).

Figure 4. The interaction diagram of robot type and baby schema on the trustworthiness score.
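To make the direction of the schema effect on trustworthiness easier to compare across robots, the reported cell means can be reduced to a simple high-minus-low difference per robot. This is a descriptive sketch with hypothetical names; it uses only the means quoted in the text and says nothing about statistical significance.

```python
# Trustworthiness cell means as reported in the text.
TRUST_MEANS = {
    "MAKI":         {"low": 4.81, "uncontrolled": 4.80, "high": 4.97},
    "RoboThespian": {"low": 4.31, "uncontrolled": 4.53, "high": 4.54},
    "Flobi":        {"low": 4.54, "uncontrolled": 4.61, "high": 4.29},
    "Pepper":       {"low": 5.06, "uncontrolled": 5.17, "high": 5.54},
    "iCat":         {"low": 4.66, "uncontrolled": 4.79, "high": 4.90},
}


def high_minus_low(cell_means):
    """Difference between the high and low schema means, to two decimals."""
    return round(cell_means["high"] - cell_means["low"], 2)


gains = {robot: high_minus_low(means) for robot, means in TRUST_MEANS.items()}
for robot, gain in sorted(gains.items(), key=lambda item: item[1]):
    print(f"{robot}: {gain:+.2f}")
```

A negative difference marks a robot for which the high schema face was rated as less trustworthy than the low schema face; among the five robots, this occurs only for Flobi.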
Table 8 summarizes the typical reasons (i.e. reasons mentioned two or more times) that participants gave for their facial preferences among the social robots, drawn from the text descriptions in the survey results.
Table 8. Participants’ comments on the evaluation of facial preferences.
Discussion and conclusion
The effect of baby schema on cuteness and trustworthiness
Baby schema refers to the specific infantile appearance features of babies and children that serve as innate releasing mechanisms, prompting adults to protect and nurture them. 6,8 The results of Experiment 1 supported hypothesis 1: for the real-world robots, the degree of baby face, cuteness, and people’s perceptions of the robots’ trustworthiness were positively correlated.
The results of this study indicate that the BSE applies not only to humans but also to some social robots, including human-shaped, animal-shaped, and even machinelike robots with clear facial features. The results confirm that, to some extent, a high baby schema (i.e. big eyes, a round face, a high forehead, a small nose, a small mouth, etc.) has a positive influence on the impression of cuteness and trustworthiness for most real-world robot faces. Even if the uncontrolled robot face was already designed with reference to baby schema theories (e.g. Flobi) and looks quite different from a human face, further increasing its degree of baby schema may still trigger positive evaluations. Not only for humanlike robots 1–4 but also for animallike or even machinelike faces, which look even less like humans, the presence of facial features seems to enhance people’s positive impressions to a certain extent. 33 In addition, according to previous research, the robots’ eyes have a greater influence on the BSE, 21 and some studies have shown that a lack of facial features greatly reduces participants’ positive impressions of social robots. 49,61
The cuteness and trustworthiness of different robot types
First of all, robot type clearly has a great impact on people’s perceptions of the cuteness and trustworthiness of social robots. The experimental results rejected hypothesis 3. The results suggest that increasing the baby schema level can make a robot look cuter to a certain extent, but that cuteness cannot be increased indefinitely; for some specific robots (e.g. Flobi), a higher level may even have an adverse effect. Since only five kinds of robots were involved in the study, this inference needs to be confirmed by more experiments. Some social robots look a lot like stuffed toys, 20 while others are too humanlike. 24 Some robots seem very suitable for the home environment, 62 or have specific functions, such as reminding the elderly to take medicine. 63 Because use scenarios differ, robots play different roles and may exhibit different appearances and reactions, so people’s impressions of their cuteness and trustworthiness are bound to differ. For example, an agent’s perceived anthropomorphism may affect people’s trust in that agent. 26 Similarly, different types of robots and their interactive behaviors, such as head tilt, may affect people’s impressions of them. 32
Based on the participants’ comments and feedback in Experiment 2, we have reason to speculate that, when the perceived emotional factor is excluded, the more a robot resembles a human or a real animal, the less freely its degree of baby schema can be increased, because doing so may cause discomfort similar to the UV effect. However, for more abstract robots, such as Pepper or machinelike robots, the degree of baby schema can be appropriately exaggerated without causing extreme discomfort, because their images are not like those of naturally existing beings. Further experiments are needed to verify this speculation.
In addition, it is worth noting that the five robot images selected in Experiment 2 have their limitations. To produce the high and low baby schema levels, the corresponding facial ratios were increased or decreased relative to the parameters of each robot’s original picture. Therefore, the results can only serve as a reference for the relationship between robot type and baby schema, and cannot be generalized to all social robots. The results may also differ for robots with unusual appearances.
Perceived emotion and trustworthiness
People can instinctively and quickly judge whether other people or animals are showing positive or negative emotions, or infer from changes in facial features whether they are happy or angry. Emotions perceived from the face also greatly affect people’s perceptions of the cuteness and trustworthiness of social robots. 25,64 The overall shape of a robot arouses any of three emotions, namely “concerned,” “enjoyable,” and “favorable,” and specific combinations of body parts evoke different emotional responses in people. 64 The effects go beyond emotion: Mathur and Reichling showed through an “investment game” experiment that subtle social cues in the appearance of a robot’s face even affected people’s judgment of the robot’s trustworthiness. 27
Experiment 1 preliminarily confirmed that perceived emotion affects the perception of trustworthiness to a certain extent; hypothesis 2 was accepted. At the same time, both baby face and perceived emotion were correlated with credibility (as shown by the equation). This confirms, in quantifiable form, that emotional impressions (even first impressions) and the degree of baby face influence evaluations not only of humans but also of social robots. It should be emphasized that “first impressions” presupposes that participants had not seen these robots before and that they were looking at a neutral face. Even though people know that they are looking at robots, some of their subjective evaluations still treat the robots like humans, in line with the CASA theory. 59
The results of Experiments 1 and 2 also support this point. In Experiment 1, different robot facial images evoked different positive or negative emotional perceptions, and these perceptions affected the respondents’ judgments of cuteness and trust. Experiment 2 further confirmed this: many participants judged whether a robot was credible from the facial emotion they perceived. That is to say, in the process of HRI, the robot’s emotional transmission plays an irreplaceable role in the interactive experience. On the one hand, people need real-time feedback on the robot’s work status, for example, whether the user’s input has been received and is being processed. On the other hand, through emotional expression, robots and people establish social cues similar to those in interaction between people. 52 These factors, such as the positive or negative emotions perceived from a robot’s appearance and the transmission of emotions during HRI, all directly affect the results and evaluation (e.g. trustworthiness) of HRI. 65
Conclusions and future implications
Overall, this study makes three main contributions. First, both experiments were carried out using real-world robots, which gives the results more reference value. Second, robot type is proposed for the first time as an important factor whose interaction with the degree of baby schema affects participants’ perceptions of robots. Finally, the study found that an overly high degree of baby schema can provoke negative reactions to some types of social robots. Specifically, the conclusions of this study are as follows:
(1) The degree of baby face and perceived emotion of social robot faces have a positive impact on trustworthiness in most real-world robots. (2) This study obtained the correlation formula of baby face, cuteness, and trustworthiness from a quantitative point of view, which can serve as a reference for the research on the related credibility of communication robots. (3) In general, BSE also existed in the cuteness evaluation of most real-world robots. Faces with high schema were considered cuter and more trustworthy than uncontrolled or low schema faces. (4) Robot type and baby schema had a significant interaction with cuteness and trustworthiness. (5) However, for certain types of robots, BSE may also have a counter-effect, that is, an overly high baby schema may reduce users’ perceptions of the cuteness and trustworthiness of social robots.
This study extends and verifies the positive role of the BSE in real-world robots and further verifies the important influence of robot type, as well as of the interaction between robot type and baby schema, on cuteness and trustworthiness. In addition, the research lays a foundation for further exploration of the impressions that different robot types will make on people in the near future. These conclusions provide theoretical and practical references for robot appearance design and better interaction.
This study also has some limitations that should be noted. First, when several facial proportions related to the BSE were adjusted, the accompanying adjustment of other facial proportions of the robots may have interfered with the participants’ evaluation. Second, beyond the participants’ first impressions of the robots, it is important to focus on real HRI. The robot’s facial expressions and emotional transmission are also important factors that affect the impression of cuteness and trustworthiness, and we will address them in future research. There is a strong connection between the cuteness and trustworthiness of robots. Follow-up work should adopt a trust scale to measure trust in robots, investigate people’s prior experience with robots, incorporate more specific use scenarios, and consider differences in people’s attitudes toward different types of robots in various work environments.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
