Abstract
Drawing on imagery and consumer choice theories, we examine how visuals of faces and social groups can foster behavioural intention and emotional effects in nature tourism video marketing. We conducted an experimental study using three videos with varying levels of human presence. Data were analysed using linear regression and partial least-squares path analysis. Contrary to what prior research had suggested, video content with fewer direct visuals of human beings was found to be more entertaining. Thus, scenery performed better than people as visuals in our destination marketing case. The visuals of faces and social groups had no effect on consumers’ behavioural intention or emotions, even though they did arouse feelings of social presence. Behavioural intention was mainly associated with entertainment value, while social presence had only a minor effect. Gender, nationality and outdoor activity also affected the media effects of videos, showing them to be complex and context-dependent phenomena.
Introduction
Destination image formation is believed to be important in the tourism industry (McCartney, Butler and Bennett, 2008; Shani, et al., 2010; Casali, et al., 2020). Destination image influences tourists’ choices and decision-making processes strongly (Casali, et al., 2020). Thus, marketers try to modify the tourists’ destination images through various marketing activities, such as creating and sharing videos. Additionally, tourism marketing has acknowledged the importance of videos, as evidenced by the abundance of video clips pouring out as a form of advertising and social sharing. In tourism marketing, especially, marketers try to produce advertising videos (Alamäki, Pesonen and Dirin, 2019; Huertas, 2018; Shani, et al., 2010), and tourists create and share their own travel videos (Sigala, 2011) to improve user accessibility to those clips. However, relatively little effort has been made to determine the features that construct such images in videos.
Online videos are digital audio-visual communication media that have become an essential part of today's digital communications and marketing (Liu, et al., 2018). We already know that videos or films are able to trigger consumer behaviour (e.g. Weiss and Cohen, 2019; Shani, et al., 2010), such as increasing purchase or visiting intention (Alamäki et al., 2019; Du et al., 2020). A few studies have reviewed the role of videos in consumer behaviour (e.g. Alamäki et al., 2019; Huertas, 2018; Puccinelli, Wilcox, and Grewal, 2015; Weiss and Cohen, 2019), yet online marketing videos remain surprisingly under-researched even though their role in destination marketing has rapidly increased. For example, we know little about the factors that influence tourists’ behaviour while they watch different video content. Previous research (Alamäki et al., 2019; Liu et al., 2009; Sun and Cheng, 2007) has shown that videos as multimedia presentations are able to facilitate understanding and trigger emotions. We posit that people formulate different destination images in their minds depending on the kinds of videos they watch. Mental imagery is a complex process that activates different areas in the brain, and a thorough understanding of it is still in its infancy (Boccia et al., 2015). Therefore, we need more research on imagery processes in tourism (Yin, Bi and Chen, 2020).
According to the research on emotional association (Fowler and Christakis, 2008; Hatfield, Cacioppo, and Rapson, 1993), the positive emotions of an individual might transfer to others. Although happy faces and visuals of people are very common in destination marketing material, surprisingly little research has been conducted on how the visuals of faces and people overall trigger tourists’ behavioural changes and how this process evolves. Thus, we need to know more about the factors that trigger consumer purchase behaviour in destination marketing activities. In this study, we propose a conceptual model mainly consisting of emotions, social presence and entertainment value. The study focuses on factors affecting consumer behavioural change in a Nordic nature tourism setting. Our sample mainly represents young tourists, who prefer audio-visual information (Fong, Firoz and Sulaiman, 2017), have strong digital skills (Veiga, et al., 2017), and are increasingly interested in sustainable tourism activities such as nature tourism (Buffa, 2015). We compare the ability of different types of video stimuli to elicit entertainment, social and emotional responses among different consumers. We aim to answer the following research questions (Q):
How do variations in human-related images in video content trigger changes in consumer behaviour intention and emotions? (Q1)
How do consumers’ individual characteristics influence their responses when watching different destination video content? (Q2)
What is the role and interrelationship of entertainment value and social presence in affecting consumers’ behavioural intention in destination video marketing? (Q3)
This paper is organized as follows: the following section focusses on a review of the related literature. Subsequently, the methods, procedure and data analyses are described. Then, the findings obtained from the experiment are discussed. The last section identifies the contributions, theoretical and managerial implications, and limitations of the study.
Related work
Imagery theories and human faces as stimuli
Image theory (e.g. MacKay and Fesenmaier, 1997) or mental imagery theory (e.g. Coman and Rauh, 2003; Pylyshyn, 2002) explains how users form mental images from, for example, tourism destinations or social situations. When a user hears the word ‘Thailand’, his or her inner schema related to Thailand is activated, and he or she imagines beautiful beaches and palm trees, for example. Visuals, such as videos, help the individual to reconstruct new schemas. Similarly, users chatting with friends on social media feel socially connected, which evokes mental images concerning this social interaction (Rodríguez-Ardura and Martínez-López, 2014). Rodríguez-Ardura and Martínez-López (2014) showed that mental imagery and telepresence are interrelated concepts where online presence creates cues that reconstruct mental images. They continued that mental imagery facilitates both cognitive and affective dimensions and promotes hedonic value. However, prior research (Cowan, 2000; Miller, 1956) has shown that individuals’ cognitive capabilities are limited, which evidently influences their ability to acquire information from media. For example, the cognitive load theory (Bannert, 2002; Mayer, 2009; Sweller, Ayres, and Kalyuga, 2011) explains how limited cognitive capacity in processing novel information slows down remembering and learning. In marketing videos, multiple simultaneous cues evidently hamper consumers’ ability to process information.
People readily pay attention to human faces. Previous research shows that faces are often stronger emotional stimuli for humans than any other stimuli (Bindemann, et al., 2005; Cerf, Frady & Koch, 2009). People notice faces even unintentionally, and even when faces are not the main object in the scene (Langton, et al., 2008). Crouzet, Kirchner and Thorpe (2010) point out that people detect faces very rapidly when they appear in the field of view. They calculate that human face recognition can happen in as little as 100 milliseconds, which is faster than the recognition of animals in sight. Thus, human faces are stronger stimuli than the movement of animals or text on an object, not to mention other objects (Cerf, Frady & Koch, 2009; Crouzet, Kirchner & Thorpe, 2010). This is not a culturally dependent characteristic, unlike the impact of some characteristics of opinion leaders (cf. Marshall and Gitosudarmo, 1995). In addition to human faces as a stimulus, opinion leaders are able to influence the attitudes and behaviour of other consumers (Li and Du, 2011; Li, et al., 2013). Similarly, celebrity endorsers have a strong influence on consumers’ attitudes and behavioural intentions (Van der Veen and Song, 2014). However, people recognize any human face unintentionally and very rapidly, through abilities that are innate or learned in early childhood, whereas opinion leaders and celebrity endorsers are figures whom some people want to follow and imitate. Thus, the faces of opinion leaders and celebrity endorsers are often used in marketing because people want to identify with them.
Emotional contagion and affective changes
Emotion research (DuBrow et al., 2017; Knutson and Greer, 2008) has increased our understanding of human emotional cognition, its context dependency and the integral part it plays in consumers’ choices. In consumers’ choices, emotions and rational thinking are closely intertwined (Genevsky et al., 2017). However, high attention to advertising may weaken the effect of emotional content (Heath, Brandt and Nairn, 2006). Furthermore, Heath et al. (2006) show that emotional content is more effective than rational messaging in developing strong brand relationships. In the elaboration likelihood model, Petty and Cacioppo (1986) showed how attitudes change through a logical or ‘less logical’ evaluation of arguments, messages, or cues. The less logical evaluation represents attitudinal changes where stimuli are interpreted without a deep evaluation of logical arguments, and cues are emotional messages rather than factual arguments. In short marketing videos, we expected that visual cues affected consumers through less logical evaluation. Previous research (Liu et al., 2018) on short marketing videos pointed out that viewers experience a sample of emotions from the topic, product, or destination that a video aims to market.
In psychology, emotional responses are typically measured via a fundamental two-dimensional affective space model: emotional valence, indicating the hedonistic value (positive vs. negative), and arousal, indicating the emotional intensity (from low to high; e.g. Bradley and Lang, 2000; Russell, 2003). Valence is the subjective value of the emotion, which runs from positive to negative, whereas arousal is the intensity of this emotion (Kauttonen and Suomala, 2019; Knutson and Greer, 2008; Osgood, Suci, and Tannenbaum, 1957; Watson, Wiese, Vaidya, and Tellegen, 1999). When consumers experience a pleasurable marketing video, positive arousal increases, and when they experience an unpleasant video, negative arousal increases. Studies have found that positive arousal has an important effect on people's behaviour towards the issues that triggered this positive arousal (Knutson and Greer, 2008). Subjective goals are essential factors behind consumers’ behaviour (Suomala et al., 2017), and destination image especially influences consumers’ behaviour when they decide where to go for holiday (Huete Alcocer and López Ruiz, 2020).
Predictions related to visual effects
In video content marketing, the viewers of video clips experience a sample of emotions from real-life service experiences (Liu et al., 2018). Previous work has increased our understanding of human emotional contagion and context dependency, and emotions are increasingly thought to be an integral part of the choice and purchase behaviour of consumers (DuBrow et al., 2017; Knutson and Greer, 2008). Human faces provoke a stronger visual effect than other objects (Bindemann, et al., 2005; Cerf, Frady & Koch, 2009). Additionally, previous literature (e.g. Heath et al., 2006) reported that emotional content can create stronger effects than rational messages, and emotional cues alone are able to enhance the choice (Vakratsas and Ambler, 1999). In addition, choice preference may change even when individuals do not process emotional cues carefully or intentionally (Petty and Cacioppo, 1986). Sometimes high attention may even weaken the emotional effect (Heath et al., 2006). On the basis of previous research (Fowler and Christakis, 2008; Hatfield et al., 1993; Heath et al., 2006), we predicted that the types of video and visuals of human beings (including images of faces, individuals, or groups) enhance emotions, social presence, and behavioural intention through emotional association or contagion.
We already knew (e.g. Alamäki et al., 2019) that gender, age, nationality, and outdoor activity might affect viewers’ responses in watching videos. However, we did not know how visuals enhancing emotional contagion would affect different participants. The mental imagery theories (e.g. MacKay and Fesenmaier, 1997) show that mental models are different depending on the experiences and backgrounds of participants, and this variation affects mental imagery in various contexts. Thus, consumers interpret the content of videos subjectively. Their valuations are formed from different mental and external sources (Levy and Glimcher, 2012). In addition to illustrating value propositions, online videos should be able to capture consumers’ attention and interest (Hartmann, Apaolaza, and Alija, 2013). Consumers should enjoy watching the videos and want to share them while surfing social media sites. Using the aforementioned predictions, we aimed to test the following hypotheses:
Entertainment value and social presence
Social presence is an essential variable connected to the values that consumers gain while interacting with digital online services (McKenna and Bargh, 1999). Social presence concerns, for example, the intimacy and warmth an individual may perceive in a virtual environment (Short et al., 1976; Kim, Sung, and Moon, 2020). Kim, Merrill and Yang (2019) state that social presence refers to feeling as if someone is present when they are actually physically away. In addition to social interaction with other humans through chats, discussion forums and emails, lean media can convey the feeling of social presence. Pictures or text content, such as photographs and personal letters, may also convey social presence (Gefen and Straub, 2003; Lim and Lee-Won, 2017). Today, various interactive digital channels enable imagery to be delivered easily and rapidly. Both self-made and professionally produced videos constantly share social feelings and faces of ordinary people, opinion leaders and celebrity endorsers (Alamäki, et al., 2019). Most consumers have smartphones with continuous connection to various social media channels, causing a constant flow of various stimuli: faces of people, social groups, natural and synthetic environments and other objects. In this study, we examine social presence from the perspective of video material, where videos convey the feeling of social presence through clips of individuals and social groups.
Entertainment value is a strong motivational variable for consumer choice (Christiansen, Comer, Feinberg, and Rinne, 1999). Previous studies have found that increasing entertainment value results in more frequent visits and use and, further, elevated profitability (Christiansen et al., 1999; Dehghani, Niaki, Ramezani, and Sali, 2016; Hausman and Siekpe, 2009; Mathwick, Malhotra, and Rigdon, 2002). Entertainment value is a concept motivated by both hedonistic and eudaimonic reasons (Oliver and Raney, 2011). According to Oliver and Raney (2011), the former is more related to pleasure-seeking motivation while the latter is more related to truth-seeking motivation, and both fall under entertainment motivation.
Predictions related to social presence, entertainment value and behavioural intention
On the basis of the aforementioned research (e.g. Heath et al., 2006; Vakratsas and Ambler, 1999), emotional cues in marketing videos may be more effective compared to rational messages. Therefore, we used mute video clips without audio and text, as rational messages are typically presented as audio and caption (text) on videos. As entertainment value is a crucial concept in the success of marketing and sales (e.g. Christiansen et al., 1999; Dehghani et al., 2016; Hausman and Siekpe, 2009; Mathwick et al., 2002), we measured the entertainment value in relation to other variables. Individuals may also form new attitudes through social attachments (Petty and Cacioppo, 1986). Thus, we predicted that the feeling of social presence and entertainment value may transfer to the viewers of marketing videos and affect their behavioural intention. The following four hypotheses were postulated for further analysis:
Methods and data
Participants and design
The participants were 169 undergraduate students from two Finnish universities of applied sciences. Data were collected in a class setting using a convenience sampling approach. The control video (video A), the video showing single people canoeing (video B), and the video showing a group of people socializing (video C) were each shown to groups of 56–57 participants, resulting in a total of 169 participants. The size of the test groups was in line with similar multimedia studies that had 30–60 participants, on average, per group (e.g. Fiorella and Mayer, 2016; Mayer and Estrella, 2014; Sung and Mayer, 2012). We collected various socio-demographic background variables from the subjects. The first three concerned gender, age and nationality. In addition, the questionnaire consisted of multiple-choice items concerning participants’ frequency of participation in outdoor activities (hiking, kayaking, mountain biking or related) and their perceptions of themselves as travellers (occasional, regular, or experienced). All variables and socio-demographic profiles are presented in Table 1.
Participants’ socio-demographic data.
Design and modification of marketing video
We first created a video from real footage of a canoeing trip at Nuuksio National Park in Finland. The business goal of the video material is to market the beauty, peacefulness and cleanliness of Finnish nature to outdoor tourists. The material was recorded using a drone, giving it a bird's-eye perspective. Hence, the video presented long-distance footage of a blue sky, lakes, forest, rocks and canoeing tourists from a high visual perspective. The video was 1 min 30 s long. This video (video A) served as the control video in our experiment, and we changed 50% of its material to produce the two other test videos (see Figure 1). We added individuals to the second video (video B) by replacing 50% of the material with video clips of individuals’ faces. These video clips and one still image presented human faces (smiling, happy or satisfied-looking) at close range. The third video (video C) included social situations where several individuals were interacting as part of a tourism group trip in Nuuksio National Park. We used a student group who went canoeing with a tourism guide from a nature tourism company. All three videos were silent, without an audio track.

Examples of the control video clips (above) and two other videos where 50% of the content was replaced with face video clips (middle) and social group video clips (below). All the videos were silent, without audio and caption (text).
Design and procedure of experiment
We employed a control video (A), ‘faces’ video (B) and ‘social groups’ video (C) for each of the different test groups (A, B or C). The videos were designed for smartphones. The experiment took place in classrooms, where each participant watched their group's video. There were 18–32 participants at one time in the classroom where the experiment was conducted. The experiment began with an orientation, in which the researcher explained the purpose and procedure of the experiment. The researcher also distributed the written instructions that included the QR code and TinyURL to the test content and questionnaire. The participants were randomly assigned to different test groups. The researcher supervised the experiment procedure, which lasted 30 min on average, including the briefing, experiments and answering the questionnaire. The participants first answered the research items of the pretest before they watched the video. The pretest included socio-demographic background variables and behavioural intention measures. Right after watching the video, the participants answered the research items of the post-test that included the measures presented in Table 2. The study followed the ethical guidelines of the ethics committee authorized by the responsible university. In addition, data were handled and analysed anonymously, and voluntary consent was obtained from each subject prior to participation.
The questionnaire items and their wording.
Behavioural measures
The consumer behaviour-related questions were rated on 5-point (1 = strongly disagree to 5 = strongly agree) and 7-point (1 = very negative to 7 = very positive) Likert scales. After seeing the videos, the participants answered the questionnaire. The questionnaire measures of this study and their wording are presented in Table 2.
Bayesian linear regression analysis
As the first modelling approach, we used linear regression to study the effect of video type (H1–H3) and the socio-demographic background variables on the responses (H4). For this, we fitted a multivariate model Y = F(Xβ), where Y contains the responses (cf. Table 2), X contains video type and background variables (as factors), and β contains the coefficients to be estimated. As Y was discrete-valued and ordinal, for the function F we used an adjacent-category model with a logit link (Bürkner and Vuorre, 2019). The posterior distributions of model parameters were estimated using Markov chain Monte Carlo methods with the brms R package version 2.1.8 (https://github.com/paul-buerkner/brms; Bürkner, 2017). We constructed four chains of 10,000 steps, including 2000-step warm-up periods; thus, a total of 32,000 steps were retained to estimate posterior distributions for each of the five responses. Convergence of the chains was verified by visual inspection of prediction distributions and by ensuring that the potential scale reduction factor R̂ on split chains was equal to 1. We used weakly informative zero-mean normal priors for regression coefficients (SD 5) and intercepts (SD 10) to improve model convergence. Model candidates, that is, sets of predictors in matrix X, were compared against each other using a leave-one-out (LOO) cross-validation scheme (Vehtari, Gelman, and Gabry, 2017). After model fitting, we computed the means of the posterior distributions for the coefficients with their two-tailed posterior probabilities (PPs) against zero as PP = 2 × min[P(x > 0), P(x < 0)], where P is the posterior distribution for parameter x. Because of the small number of samples (2 and 8) in the ‘African’ and ‘American’ groups, both were included in the larger ‘Other’ category.
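To make the two-tailed posterior probability concrete, the following is a minimal Python sketch of how PP = 2 × min[P(x > 0), P(x < 0)] could be estimated from posterior draws, together with the arithmetic behind the 32,000 retained steps. It is not the code used in the study (which relied on brms in R); the function name `two_tailed_pp` and the simulated draws are illustrative assumptions, and the draws are synthetic rather than study data.

```python
import random


def two_tailed_pp(draws):
    """Estimate PP = 2 * min[P(x > 0), P(x < 0)] from posterior draws."""
    n = len(draws)
    p_pos = sum(1 for d in draws if d > 0) / n
    p_neg = sum(1 for d in draws if d < 0) / n
    return 2 * min(p_pos, p_neg)


# Retained draws as described in the text:
# 4 chains x (10,000 steps - 2,000 warm-up steps) = 32,000
n_chains, n_steps, n_warmup = 4, 10_000, 2_000
n_retained = n_chains * (n_steps - n_warmup)

# Hypothetical posterior draws for a single coefficient (NOT study data)
rng = random.Random(0)
draws = [rng.gauss(0.8, 0.4) for _ in range(n_retained)]

pp = two_tailed_pp(draws)
print(f"retained draws: {n_retained}, two-tailed PP: {pp:.4f}")
```

A small PP means that almost all posterior mass lies on one side of zero, which is how the bolded coefficients in the regression table should be read.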
Partial least squares analysis
As the second approach, we aimed to examine how entertainment value and social presence are related to behavioural intention (H4, H5). After this, we put a special focus on the interrelationship of entertainment value and social presence (H6, H7). This analysis helped us understand whether they affected each other differently in different types of videos or whether one was mediating the other. For this analysis, we used a partial least squares (PLS) path modelling approach (Hair, Hult, Ringle, and Sarstedt, 2021). The data analysis was conducted using the SmartPLS 3 tool with default settings.
In addition to the complete model, we also tested direct and indirect (i.e. mediation) effects on behavioural intention separately for each video. Mediation occurs when a mediator variable intervenes between two other related constructs. Mediation considers the presence of an intermediate variable or a mechanism that transmits the effect of an antecedent variable to an outcome (Aguinis, Edwards, and Bradley, 2016). Hence, mediation refers to underlying effects that link antecedent and consequence variables. Streiner (2005) noted that the effect of independent variables on a dependent variable can be interpreted in multiple ways. Here, the interpretation of social presence can be two-fold. Based on empirical findings about the positive causality between social presence and entertainment (Lee et al., 2014), we tested the mediation effect of entertainment value by groups. As part of the analysis, we also checked variance inflation factor values for multicollinearity.
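The logic of a single-mediator model can be illustrated with a short Python sketch. Note the assumptions: it uses ordinary least-squares paths on standardized observed variables, not the PLS estimation performed in SmartPLS, and the data are simulated. The names `mediation_paths`, `x` (social presence), `m` (entertainment value) and `y` (behavioural intention) are hypothetical labels chosen for illustration.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mediation_paths(x, m, y):
    """Standardized paths for X -> M -> Y with a direct X -> Y path."""
    r_xm, r_xy, r_my = pearson(x, m), pearson(x, y), pearson(m, y)
    denom = 1 - r_xm ** 2
    a = r_xm                              # X -> M
    b = (r_my - r_xy * r_xm) / denom      # M -> Y, controlling for X
    direct = (r_xy - r_my * r_xm) / denom # X -> Y, controlling for M
    return {"total": r_xy, "a": a, "b": b,
            "direct": direct, "indirect": a * b}


# Simulated data (NOT study data): Y depends on X only through M
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(500)]
m = [0.7 * xi + rng.gauss(0, 1) for xi in x]
y = [0.6 * mi + rng.gauss(0, 1) for mi in m]

paths = mediation_paths(x, m, y)
for name, value in paths.items():
    print(f"{name}: {value:.3f}")
```

In this sketch the total effect always equals the direct effect plus the indirect effect (a × b). Full mediation corresponds to a direct effect near zero once the mediator is included, and partial mediation to a direct effect that shrinks but remains.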
Finally, we note that the PLS constructs used in this study are exploratory in nature; thus, we put more weight on gaining insights into practices in the tourism industry than on the robustness of the model itself. In fact, it is quite common in the social and behavioural sciences for data not to follow a multivariate normal distribution, and in such cases variance-based PLS is recommended over maximum likelihood estimation methods (Hair et al., 2021). Chin (1998) also stated that PLS is less strict with smaller sample sizes.
Results
Tests for reliability and group differences
To test how reliable our summary constructs were, Cronbach's alpha (standardized) and average inter-item correlation were computed. These values are depicted in Table 3. All Cronbach's alpha values were greater than the recommended threshold value of 0.7 (Nunnally and Bernstein, 1978), demonstrating that the construct reliability of the measuring model was acceptable.
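As an illustration of the two reliability statistics named above, the sketch below computes the average inter-item correlation and the standardized Cronbach's alpha, alpha = k·r̄ / (1 + (k − 1)·r̄), where k is the number of items and r̄ the mean inter-item correlation. The function names and the small response matrix are hypothetical and are not taken from the study's questionnaire data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_inter_item_r(items):
    """Average Pearson correlation over all item pairs."""
    pairs = list(combinations(items, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def standardized_alpha_from_r(k, r_bar):
    """Standardized Cronbach's alpha from item count and mean correlation."""
    return k * r_bar / (1 + (k - 1) * r_bar)


def standardized_alpha(items):
    """Standardized Cronbach's alpha for a list of item response vectors."""
    return standardized_alpha_from_r(len(items), mean_inter_item_r(items))


# Hypothetical 5-point Likert responses: 3 items x 8 respondents
items = [
    [4, 5, 3, 4, 2, 5, 4, 3],
    [4, 4, 3, 5, 2, 5, 3, 3],
    [5, 5, 2, 4, 1, 4, 4, 3],
]
print(f"mean inter-item r: {mean_inter_item_r(items):.3f}")
print(f"standardized alpha: {standardized_alpha(items):.3f}")
```

The resulting alpha would then be compared against the 0.7 threshold cited in the text.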
Values for reliability of summary constructs.
Next, we tested whether background variables differed between video Groups A, B and C. Kruskal-Wallis test statistics for the ranks of the ordinal variables ‘age’, ‘outdoor activity’ and ‘tourism activity’ were 0.988 (p = 0.61), 2.440 (p = 0.26) and 2.314 (p = 0.31), respectively. Chi-squared test statistics for the nominal variables of gender and nationality were 3.33 (p = 0.19) and 11.68 (p = 0.31), respectively. With p > 0.05 for all variables, no statistically significant differences between the groups were found. Thus, the groups had similar background variables irrespective of the video they watched, and differences in responses between groups can reasonably be attributed to the video.
The results of visual cues
First, we analysed the pre- and post-questions of behavioural intention before and after watching the videos by comparing their average ratings. We found that the mean behavioural intention rating was reduced for all three groups after watching the videos. We used a permutation test to evaluate the statistical significance of the results by randomly permuting the view order for subjects 100,000 times and counting how many times the unpermuted mean surpassed the permuted means. Mean reductions and their estimated, FDR-adjusted p-values (in parentheses) were 0.32 (p = 0.265; Group A), 0.62 (p = 0.043; Group B) and 0.85 (p = 0.013; Group C); hence, the reduction was statistically significant at p < 0.05 for Groups B and C. Therefore, contrary to our hypotheses 2 and 3, human faces and social groups in videos decreased behavioural intention. As a response to the first part of Q1, we conclude that the effect of human-related video content on consumer behaviour intention was negative.
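One way to read the permutation procedure is as a paired sign-flip test: swapping the pre and post labels within a subject flips the sign of that subject's difference. The Python sketch below follows that reading, which is an assumption about the authors' exact implementation. The function name `paired_permutation_p`, the example ratings and the default of 10,000 permutations (the study reports 100,000) are illustrative, and no FDR adjustment is applied here.

```python
import random


def paired_permutation_p(pre, post, n_perm=10_000, seed=0):
    """One-sided sign-flip permutation test for a mean pre-to-post reduction.

    Returns the observed mean reduction and the share of permutations
    whose mean reduction is at least as large as the observed one.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(pre, post)]  # positive = reduction
    n = len(diffs)
    observed = sum(diffs) / n
    count = 0
    for _ in range(n_perm):
        # Randomly swap pre/post within each subject (flip the sign)
        perm_mean = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        if perm_mean >= observed:
            count += 1
    # Add-one correction keeps the estimated p-value strictly above zero
    return observed, (count + 1) / (n_perm + 1)


# Hypothetical behavioural intention ratings (NOT study data)
pre = [5, 4, 4, 5, 3, 4, 5, 4, 4, 5]
post = [4, 4, 3, 4, 3, 3, 4, 4, 3, 4]
reduction, p_value = paired_permutation_p(pre, post)
print(f"mean reduction: {reduction:.2f}, permutation p: {p_value:.4f}")
```

With several groups tested, the raw p-values from such a procedure would still need a multiple-comparison correction, as the FDR adjustment in the text indicates.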
We then turn to the linear regression modelling. After testing various models involving different predictors, we chose the following:
Linear fit coefficients. Bolded values indicate statistical significance measured by two-tailed posterior probability (PP) density with PP < 0.05 or less. Arousal was omitted here (PP>0.05 for all coefficients).
Notes: the reference level of each factor is set to zero. Significance markers denote PP < 0.05, PP < 0.01 and PP < 0.001.
For the video type, we found PP < 0.05 or less for three responses: Social Interaction, Entertainment Value, and Valence. While the first result was expected, that is, more social presence for videos B and C by design, the latter two were less obvious. Video A was found to be more entertaining and resulted in more positive emotions (measured by valence) than the other two (hypothesis 1). We found no differences between videos B and C. Interestingly, videos B and C had no effect on consumers’ behavioural intention even though they did arouse feelings of social presence (hypotheses 2 and 3). As a response to the second part of Q1, we concluded that the effect of human-related video content on emotion was either neutral (Arousal and Emotion Level) or negative (Valence).
Background information also affected the responses (hypothesis 4). Males gave lower ratings for Behavioural Intention, Emotion Level, and Valence (PP < 0.05 or less). Furthermore, using the extended model with the interaction term ‘Group:Gender’, we found that females gave higher ratings than males for video B (‘faces’) for the following three responses: ‘Social presence’ (PP < 0.05), ‘Valence’ (PP < 0.01) and ‘Emotion Level’ (PP < 0.05). Being more active outdoors tended to increase responses, particularly for Entertainment Value (three coefficients with PP < 0.05 or less) and Valence (two coefficients with PP < 0.05). An increase also occurred for Behavioural Intention and Emotion Level; however, it was weaker (one coefficient with PP < 0.05 or less). Finally, the nationality of the respondents had an effect on Behavioural Intention and Valence, with notable differences between Asian and European subjects. Behavioural intention was highest among non-Asians and Europeans. In particular, Valence was rated highest by Finnish participants and lowest by Asian participants, with PP < 0.01 for the difference. As a response to Q2, we concluded that gender, nationality and outdoor activity had a notable influence on responses. In particular, females rated video B higher on the social (Social Presence) and emotional scales (Valence and Emotion Level).
Table 5 summarizes our findings from the pre- and post-viewing comparison of behavioural intention and from the regression analyses.
Result summary for the linear regression analysis.
The results of social presence and entertainment value
As the first step in the PLS analysis, we computed cross loadings for the latent variables to evaluate discriminant validity, that is, the extent to which the constructs are distinct from one another (Hair et al., 2021). Values are listed in Table 6. Bolded values are distinctively higher than any other values in each column. Average variance extracted (AVE) values for the three measures were 0.680, 0.768 and 0.603. All exceed 0.50, which indicates that convergent validity was obtained. Thus, the measurement model was internally reliable, and its constructs showed discriminant validity.
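For readers unfamiliar with AVE, it is commonly computed as the mean of the squared standardized outer loadings of a construct's indicators. The Python sketch below shows that calculation and the 0.50 convergent-validity check. The loadings are hypothetical values chosen for illustration; the study's actual loadings are those in Table 6, and the function name `average_variance_extracted` is an assumption.

```python
def average_variance_extracted(loadings):
    """AVE as the mean of squared standardized indicator loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


# Hypothetical standardized loadings per construct (NOT the study's values)
hypothetical_loadings = {
    "social presence": [0.85, 0.80, 0.82],
    "entertainment value": [0.90, 0.88, 0.85],
    "behavioural intention": [0.75, 0.80, 0.78],
}

for construct, loadings in hypothetical_loadings.items():
    ave = average_variance_extracted(loadings)
    verdict = "meets" if ave > 0.50 else "falls below"
    print(f"{construct}: AVE = {ave:.3f} ({verdict} the 0.50 threshold)")
```

An AVE above 0.50 means that a construct explains, on average, more than half of the variance of its indicators.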
Cross loadings for the three latent variables.
Results for the complete model
Path coefficients of the full model are depicted in Figure 2 with the corresponding coefficients listed in Table 7. According to the coefficients, it can be noted that hypotheses 5 and 6 are supported.

PLS path coefficients for the complete model.
Summary of the results for the complete model.
Since the purpose of this study is to investigate the distinctive effects of different types of videos on potential consumers’ perceptions, we repeated the computation for each group to observe differences. The results are presented in Table 8.
Summary of the results for the complete model for three groups (video types A, B and C).
Interestingly, even though the subjects in Group A felt social presence from the video about the beauty of nature, this did not translate into behavioural intention (purchasing, subsequent communication, and recommending to others). Instead, they seemed to be influenced by the entertainment value of the video.
Subjects in Group B, who were exposed to the video with single individuals, showed a stronger relationship between entertainment value and behavioural intention. Presumably, the atmosphere of the video affected those subjects.
The relationships between social presence and behavioural intention were not statistically significant for Groups A and B. According to the literature (e.g. Gefen and Straub, 2003; Lim and Lee-Won, 2017), visuals of humans are not a necessary condition for social presence. That is, an individual may feel social presence even when the stimulus contains no visuals of human beings, such as in a letter. In other words, social cues and social presence may be related to each other without one strongly determining the other. Social presence has generally been believed to affect behavioural intentions such as purchasing propensity.
In Group C, where social cues were provided, subjects’ mindfulness seemed to be activated. In this group, the relationship between social presence and behavioural intention became stronger while the relationship between entertainment value and behavioural intention became weaker.
As a response to Q3, we can confirm that entertainment value had a strong positive effect on consumers’ behavioural intention in all groups. Social presence related to social groups had a notable, yet smaller, positive effect, while faces had none.
Results for direct and mediation effects
Results for direct effects and mediation effects for each group are shown in Figure 3.

Direct effect and mediation effect.
The path coefficient for the direct effect in Group A was 0.361, with a t value of 2.219. Thus, the relationship was statistically significant. However, when we added entertainment value as a mediating construct, the direct relationship between social presence and behavioural intention was no longer significant. Instead, the relationships between social presence and entertainment value and between entertainment value and behavioural intention were both significant, with higher R-squared values. Thus, the impact of social presence on behavioural intention appears to operate through the entertainment value of the video. If a marketing video is not entertaining, we cannot expect positive behavioural intention. Additionally, if we are able to increase social presence in the video, we may be able to make people find the video more entertaining.
Group B showed a result similar to that of Group A. Group C produced more interesting results. The direct effect was 0.545, with a t value of 5.306, which is higher than in Groups A and B. As explained earlier, the social cues injected into the video may have influenced responses throughout the experiment. Unlike in Groups A and B, where entertainment value fully mediated the relationship between social presence and behavioural intention, in Group C entertainment value only partially mediated it.
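As a rough illustration of how the reported t values relate to statistical significance, the sketch below converts them into two-tailed p values. It assumes a standard normal approximation, which the study does not state, and the function name is hypothetical.

```python
import math

def two_tailed_p(t_value):
    """Two-tailed p value for a t statistic, assuming a standard normal approximation."""
    return math.erfc(abs(t_value) / math.sqrt(2.0))

# t values reported for the direct effects in Groups A and C.
p_group_a = two_tailed_p(2.219)
p_group_c = two_tailed_p(5.306)

print(f"Group A: p = {p_group_a:.4f}")
print(f"Group C: p = {p_group_c:.2e}")
```

Under this approximation, both values fall below the conventional 0.05 level, which is consistent with the significance reported for the direct effects.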
We have summarized our findings from PLS analysis in Table 9.
Result summary for hypotheses 5–8.
With respect to Q3, this result suggests that the interrelationship between entertainment value and social presence is such that the former acts as a mediator for the latter. A notable direct effect was found only when social group activity was present, not just faces.
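The logic of full versus partial mediation can be sketched with a simple regression-based decomposition. The example below is only an illustration under stated assumptions: it uses ordinary least squares on simulated data instead of the PLS path analysis used in the study, and the variable names, effect sizes and helper function are hypothetical.

```python
import numpy as np

def ols_slopes(y, *predictors):
    """Fit OLS with an intercept and return only the slope coefficients."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]

# Simulated data (NOT the study's data): social presence influences
# behavioural intention only through entertainment value (full mediation).
rng = np.random.default_rng(0)
n = 500
social_presence = rng.normal(size=n)
entertainment = 0.6 * social_presence + rng.normal(scale=0.8, size=n)
intention = 0.7 * entertainment + rng.normal(scale=0.7, size=n)

# Total effect: intention regressed on social presence alone.
(total,) = ols_slopes(intention, social_presence)
# Path a: entertainment value regressed on social presence.
(a,) = ols_slopes(entertainment, social_presence)
# Direct effect and path b: intention regressed on both predictors.
direct, b = ols_slopes(intention, social_presence, entertainment)
indirect = a * b

print(f"total = {total:.3f}, direct = {direct:.3f}, indirect = {indirect:.3f}")
```

In this setup the total effect equals the direct effect plus the indirect effect. Full mediation corresponds to a direct effect that shrinks towards zero once the mediator is included, whereas under partial mediation a notable direct effect remains.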
Discussion
Overview of the results
On the basis of previous research (Fowler and Christakis, 2008; Hatfield et al., 1993; Langton, et al., 2008), we predicted that the visuals of faces (video B) and social groups (video C) embedded in the visuals of scenery would create more positive effects through emotional contagion than the control video (video A) without human faces or social groups. Counter to our initial prediction, videos B and C did not create a stronger effect than the control video. The control video even performed better on some measures, as it was found to be more entertaining and resulted in more positive emotions than the other two.
We conclude that beautiful views without the visuals of human beings perform better in the context of nature tourism, as consumers might be seeking peace and silence rather than social activities. Thus, this study points out that the visuals of human beings, which act through a psychological mechanism of emotional cognition, might negatively affect destination marketing videos. We also conclude that the video clips of unfamiliar people hindered emotional connections to video messages in our study. It seems that the visuals of human beings are strong emotional stimuli whose effects and direction are difficult to foresee when designing destination marketing videos.
Gender, nationality and outdoor activity affected the results. There was a negative impact on all ratings for males, particularly for behavioural intention, emotion level, and valence. Differences were also noted among Asian, Finnish and European participants, with higher valence and lower behavioural intention and entertainment value. The frequency of outdoor activity also affected the responses. Even yearly activity had a positive impact on all the responses, apart from social presence and arousal. Weekly outdoor activity had a major positive impact on behavioural intention. Thus, we emphasize the importance of customer segmentation and context analysis in designing destination marketing videos.
At the beginning of the study, we predicted that social presence and entertainment value positively affect behavioural intention and that they may be interrelated. We tested four hypotheses, which were supported. An interesting finding was that social presence enhances entertainment value. This emphasizes the importance of entertainment value in designing destination marketing videos. We found that entertainment value fully or partially mediated the relationship between social presence and behavioural intention. When we added entertainment value to our model as a mediating construct (Figure 3), the direct relationship between social presence and behavioural intention was no longer significant. Thus, we inferred that the impact of social presence on behavioural intention operates through entertainment value. We conclude that, in destination marketing videos, social presence has a direct effect on the entertainment value felt by consumers.
Theoretical contribution
This study shows that, in the context of nature tourism marketing, the control video, which contained fewer direct visuals of human beings, was more entertaining and resulted in more positive emotions. Previous research (e.g. Hartmann et al., 2013) has addressed the effect of exposure to natural scene visuals on behavioural intention. This helps explain why the control video performed better than the other videos. However, we found no evidence supporting prior research (e.g. Kendall and Kendall, 2017) indicating that social content positively affects behavioural intention. In our study, adding further social information to videos decreased the media effect. This finding aligns with the study of Fiorella and Mayer (2016), who reported that additional human-related information in learning videos disturbed the attentiveness of students, decreasing the media effect.
The results indicate that entertainment value created a stronger media effect than did social presence alone (Figure 3). Entertainment value is a strong motivational variable in consumers’ choice (Christiansen et al., 1999), making it an important variable for marketers and designers. In addition, the findings point out that entertainment value as an emotional variable plays a crucial role in influencing the behavioural intention of consumers through video content marketing. Social presence created a direct effect on behavioural intention only if the video material clearly presented social situations (video C).
The valence of emotions was rated the highest by the Finnish participants and lowest by the Asian participants. It is likely that the visuals of Finnish nature and human beings engaged in nature tourism activities aroused different memories, experiences, and expectations among the Finnish participants and their Asian counterparts, reflecting their different cultural backgrounds. This confirms the influence of consumers’ mental schemas (e.g. Anderson, 1985; Norman, 1982) and imagery theory (e.g. MacKay and Fesenmaier, 1997; Pylyshyn, 2002) on their interpretation of video messages and visuals. The mental schemas of consumers, constructed by their prior experiences, facilitate their mental imagery while watching video presentations.
This study shows that the participants interpret emotional cues in their mental imagery and construct new feelings on the basis of their existing mental schemas related to the topic. The processing of visual cues and their logical connections affects the mental imagery of consumers, promoting or hindering feelings of entertainment and social presence. These feelings potentially affect behavioural intention. This study also shows that a statistically significant connection between two variables is not necessarily as clear as it might appear. When we added entertainment value to our analysis model as a mediating construct, the direct relationship between social presence and behavioural intention was no longer significant. This emphasizes the role of entertainment value in creating successful marketing videos. This aligns with the research of Barcia et al. (2015), who showed that mental imagery is a complex process, activating different areas of the brain. Similarly, marketing videos may activate different areas of the brain than marketers originally expected. A rational message is turned into an emotional one and vice versa.
Practical contribution
Most marketing videos present human beings selling, presenting, demonstrating, or recommending products or services. This study shows that using human beings in destination marketing videos of nature tourism is not always recommended, for it does not necessarily improve behavioural intention or enhance positive emotions among travellers and other consumers. In our case of nature tourism marketing, visuals of beautiful views alone created the most entertaining and positive emotional responses. If the marketing goal is purely to increase the feeling of social presence or show ‘reference users’, using human beings in videos is recommended. This study shows that an increased feeling of social presence did not improve behavioural intention in all of our test videos. In other words, liking a video does not necessarily mean changed behaviour (cf. Sung and Mayer, 2012).
This study indicates that entertainment value plays an essential role in creating effective video content marketing. The feeling of social presence in videos may also be entertaining, just as simple video clips of beautiful views are. Entertainment value promotes purchase intention, recommendation, and communication, which are important marketing goals of videos. This study suggests a customer-centred design approach to marketing videos, especially in designing emotional cues. We found that videos affected males and females, Finnish and Asian participants, and active and passive tourists differently. Different customer profiles require different visuals, as the visuals need to fit customers' expectations and prior experiences. In addition to customer profile analysis, this study suggests context analysis to define the key features of the value proposition in each marketing object. Marketing destinations, products, or services requires different emotional cues and visuals; abstract value propositions of experiences, in particular, need to be defined carefully, for they are not easy to visualize in nature tourism.
Furthermore, this study points out that video tourism marketing is a complex phenomenon. Videos are a richer medium than images, text, or audio, as they are able to convey non-verbal messages and emotional cues through, for example, digital storytelling. Thus, a richer medium might also give rise to more complicated interpretations by consumers. Therefore, tourism marketers and video designers cannot easily predict the kinds of mental images that visual and emotional cues in videos evoke in the minds of consumers. The findings indicate that the feeling of social presence (namely the sense of human contact, engagement, sociability, warmth, and human sensitivity) might also create entertainment value. This was a different destination marketing message from the one the researchers originally intended.
Limitations and future research
Unlike prior research, this study explains the causes of media effect when showing human beings in different ways in videos and the relationship between social presence, entertainment value, and behavioural intention. As usual, there are limitations to this study. First, self-report measures were used. This raises a concern about bias, for example, in terms of social desirability. Hence, future research could use other sources of data, such as biometric measurements of consumers’ feelings, in order to diminish this concern. Second, the generalizability of results could be limited because of the convenience sample used in the context of nature tourism marketing. While our model included various individual characteristics, there are potentially more covariates that could affect the responses, particularly in a travelling and tourism context, such as wealth, relationship status and parental status. These could be relevant when sampling outside a student population. Thus, future research should define boundaries where these results may apply, including cultural differences.
Third, although video clip stimuli are effective in eliciting emotions and popular in psychology and neuroscience research, they are also problematic, as they can contain multiple overlapping features (Sonkusare et al., 2019). To mitigate this, 50% of the content was shared among the three stimulus videos, and the basic low-level visual features (luminance, contrast, and motion) were highly similar. Thus, the moderating effect was most likely related to the high-level content of the videos, such as the presence of human beings and social interaction. Finally, data were collected from a limited number of consumers, which limits the statistical power of the results. In particular, the sample size in some subgroups (e.g. nationalities) was small. This was counterbalanced by using models that can cope better with smaller samples (Bayesian linear regression and PLS). The exposure to emotional cues or social attachments was also rather short in our videos compared to previous studies of emotional contagion (Hatfield et al., 1993; Heath et al., 2006). Nevertheless, the role of entertainment and other variables merits further examination in the context of media technology. In addition, more research is needed to determine which specific features of rich media content create an actual media effect in different contexts and how they affect the behavioural intention or actual behaviour of consumers.
Acknowledgements
We thank Virtual Outdoors-project for the video clips and Dr Amir Dirin for assisting us with collecting the research data.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Helia-foundation, European Rural Development Fund and FESS1-project.
