Abstract
Social networking sites (SNS) provide adolescents with opportunities for content generation on a wide range of social issues, providing unique insight into the psychosocial development of adolescence. We explored SNS webpages viewed by a random sample of adolescents during the initial uptake of SNS use (2005) to describe their general language use. Adolescents aged 14 to 17 with home Internet access were recruited using list-assisted random digit dialing methods. All SNS (MySpace) webpages viewed by participants were captured, and a large, structured set of texts (text corpus) was created from the profiles and message boards therein. Using concordance software, word frequency and keyword associations were analyzed. The 346 participants viewed approximately 28,000 MySpace pages, yielding a 1,147,432-word text corpus. Profile sections presented information about the content creator, while message boards focused more on short conversations with recipients. The most common content word was the term love. Profile owners would profess their love for activities, such as dancing, partying, or shopping, followed by their love for family, friends, and significant others. SNS offer teens an opportunity to describe and share feelings about people, places, and things connected to a range of activities and social contacts within their online and offline environments. Better understanding of SNS can offer strategies to adolescents and health care providers for insight into what connects young people in a community.
Introduction
Almost all teens (95%) aged 12 to 17 in the United States use the Internet, and 80% of those online teens are users of social networking sites (SNS) such as Facebook or MySpace (Lenhart et al., 2011). SNS are hubs of teenage communication, in which adolescents use their online profiles to communicate with their friends in various ways. Despite the popularity and potential benefits of social networking, concern has been raised over information sharing and the language used by adolescents on SNS, particularly with regard to their display of health risk behaviors, such as alcohol consumption and tobacco use (e.g., Jenssen, Klein, Salazar, Daluga, & DiClemente, 2009; Moreno et al., 2010a; Moreno, Brockman, Rogers, & Christakis, 2010b; Moreno, Parks, & Richardson, 2007; Moreno, Parks, Zimmerman, Brito, & Christakis, 2009). Far less attention has been devoted to the positive messages about adolescents’ lives on SNS, where social networking profiles offer a “wealth of intimate, candid, and publicly available information on a wide range of social issues pertinent to adolescence that contribute to the understanding of adolescent development” (Williams & Merten, 2008).
SNS, through their dual role of content generation and networking, provide a window into adolescent health and development. The networking aspects of these venues provide adolescents with opportunities for psychosocial development, in terms of identity, intimacy, and sexuality, and promotion of self-esteem and well-being (Valkenburg, Peter, & Schouten, 2006). Connectedness is one of the cornerstones of an asset-based model of adolescent health and well-being (Resnick et al., 1997). Writing about one’s experiences offline can have positive health and social benefits, for example, reducing the need for health care visits and improving grade point averages in college students (Pennebaker & Francis, 1996). Students who wrote about their deepest thoughts and feelings about going to college for the first time had better health outcomes than those who wrote on more superficial topics, and students who used positive emotion words showed greater improvement in physical health than others.
Analysis of language content in SNS has shown that specific language matters, with certain comments potentially impacting well-being. A study of a Dutch networking website found that social self-esteem and well-being were enhanced by positive responses by peers to their profiles, and vice versa (Valkenburg et al., 2006). A study of the MySpace profile comments of young women characterized their purposes as friendly greetings/inquiries, expressions of affection/encouragement, suggestion/confirmation of plans, personal asides/inside jokes, exchanges of information/news, and entertainment (Walker, Krehbiel, & Knoyer, 2009). Others have examined the expression of positive emotions between friends for U.S. and U.K. MySpace members, resulting in two related hypotheses: that members connect with (“friend”) others with similar levels of public emotion expression or that the expression of emotion in MySpace is “contagious” (Thelwall, 2010). Most comments within MySpace profiles were short (less than 57 words long) and were for “general friendship maintenance.” In addition, 22% of comments in that study contained an expression of “love,” mostly in a friendship, rather than romantic, context.
MySpace was the dominant SNS as online networking became more mainstream. In 2005, MySpace had approximately 22 million unique visitors, quickly increasing to approximately 56 million unique visitors in 2006, compared with Facebook’s 15 million unique visitors at that time (Facebook was not open to adolescents until September 2005, as the previous policy required a valid email address from a university or a selected group of secondary schools and businesses; Comscore, 2006). In 2006, approximately 55% of all online U.S. adolescents used SNS, 85% of whom noted using MySpace most frequently, compared with only 7% on Facebook (Lenhart & Madden, 2007). Facebook would go on to dominate SNS on a global scale after 2008, but not before MySpace initiated the trend.
Corpus linguistic techniques have previously been applied to text generated in an online context, specifically to adolescent messages about health (Harvey, Brown, Crawford, Macfarlane, & McPherson, 2007). Researchers noted that some words appeared in the messages sent to the United Kingdom “Teenage Health Freak” website more often than expected, including “worried” and “normal.” Such analysis of naturalistic written text can help health care providers understand the language used by young people and exchange meaningful information about their health concerns. The wealth of information within SNS profiles and messages provides a naturalistic source of information for reflecting and describing influences in the lives of young people.
Corpus linguistic analysis has not previously been applied to a large data set generated from naturalistic interaction among adolescent SNS users. By examining the frequency with which words occur in the corpus, and then exploring the context in which these words of interest appear, we might learn something of the environment within which users are immersed. We performed an analysis of MySpace pages viewed by a random sample of adolescents to describe the language used in publicly and privately accessible profile and message board comment sections.
Method
Overview
To describe the linguistic features of MySpace language content, a text corpus was created from the profile and message board sections from all MySpace pages viewed by 346 adolescents aged 14 to 17 who participated in the Tracking Teen Trends (T3) study. The webpages visited by this sample of adolescents were studied as a secondary data analysis of content generated from the T3 study, the goals of which were to examine the associations between exposure to Internet sexual content and attitudes, beliefs, and behaviors of youth (Salazar, Fleischauer, Bernhardt, & DiClemente, 2008). Actual web usage was observed using specialized, proprietary software (http://www.comscore.com). All content was reviewed, and only unique profile and message board comments were included. All identifiers, such as individual names, usernames, and locations, were removed.
Study Population
Adolescents were recruited via random digit dialing procedures followed by mailed recruitment packets. Between July and November 2004, telephone screening of 175,736 individual phone numbers yielded 1,253 households meeting study inclusion criteria. Eligible households were located in the contiguous 48 states, had an Internet connection, and had a resident aged 14 to 17 who reported using the Internet at home at least once during the month preceding the call. Parents or guardians in qualified households were mailed a recruitment packet (n = 1,243), which contained T3 web-tracking software on CD-ROM and study incentives.
Enrollment in T3 required parental consent, adolescent assent, and installation of the web-tracking software on a home computer with a Microsoft Windows-based operating system that was used by the participant. The web-tracking software did not require de-activation of parental filters or controls. During the recruitment period, 591 subjects were enrolled and attempted software installation. Of these, 58.5% (n = 346) became “active” (i.e., their web traffic was successfully recorded on study servers). After installing the T3 tracking software, some participants never transmitted web activity; this was likely attributable to participants’ computers flagging the T3 software as invasive spyware because it tracked and reported on web use.
Among the 346 subjects, 52.3% were female; 80.6% were White, 7.8% mixed race, 5.5% Black, 2.6% Asian or Pacific Islander, 0.6% Native American or Alaskan Native, and 2.9% Other; in addition, 5.1% reported Hispanic ethnicity. Only 3.3% of participants came from households earning (in US dollars) $14,999 or less, 22.9% reported $15,000 to $49,999, 23.6% $50,000 to $74,999, 21.1% $75,000 to $99,999, 21.1% $100,000 to $149,999, and 8% $150,000 or more. There were no differences by gender, race, ethnicity, or parental income comparing the 591 initially recruited and the 346 active subjects.
Content Sampling
We identified MySpace webpages by examining all webpages viewed by T3 teen subjects from their home computer during the 30-day time period between December 2004 and February 2005. Each webpage generated by a participant was routed through secure proxy web servers and stored in a secure interface through which all stored webpages could be accessed, including password protected or otherwise inaccessible content. Each page was reviewed using a browser-like window that displayed embedded text content, images, and video as displayed when the teens viewed them. The interface included a keyword search function able to check URL (“Uniform Resource Locators,” or “addresses”) and HTML (“Hypertext Markup Language”) webpage source code for words, phrases, “tags,” and “metadata” included on the webpage. Using the search term MySpace (not case sensitive), MySpace webpages were identified, and each page was viewed to verify content.
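The case-insensitive keyword search over URLs and HTML source described above can be illustrated with a minimal Python sketch. It is not the proprietary study interface: the function name, the page records, and the `.example` URLs are all hypothetical, and in the study each flagged page was additionally viewed by hand to verify its content.

```python
def matches_keyword(url, html, keyword="myspace"):
    """Return True if the keyword occurs in the URL or the HTML source, ignoring case."""
    needle = keyword.casefold()
    return needle in url.casefold() or needle in html.casefold()

# Hypothetical captured pages (invented for illustration, not study data).
pages = [
    {"url": "http://www.MYSPACE.example/profile1",
     "html": "<html><title>Profile</title></html>"},
    {"url": "http://news.example/story",
     "html": "<html><a href='x'>my MySpace page</a></html>"},
    {"url": "http://weather.example/",
     "html": "<html>Forecast</html>"},
]

# Pages flagged for manual review: the keyword appears in the URL or the source.
candidates = [p["url"] for p in pages if matches_keyword(p["url"], p["html"])]
```

A substring match of this kind is deliberately broad, since it also flags pages that merely link to or mention MySpace, which is one reason a manual verification step is needed afterward.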
Corpus Creation and Analysis
All of the word content from the profile and message board sections from all MySpace pages viewed by the participants was collected and organized into one corpus. To identify the salient themes appearing in the MySpace corpus, we generated a list of keywords. Keywords are words which best define a text or texts. They are an important indicator of both expression and content (Seale, Boden, Williams, Lowe, & Steinberg, 2007) and have been used by an increasing number of researchers as a reliable means of identifying key themes in characterizing health language data sets (corpora; Harvey et al., 2007; Seale, Ziebland, & Charteris-Black, 2006). Keywords are derived from mechanical criteria and in this sense should not be confused with words deemed to be of significant social and cultural import, words that are intuitively identified by the analyst. The advantage of using statistical keywords is that they remove the a priori biases of the analyst from the identification of themes of significance and interest as they are generated by purely computational measures (Seale et al., 2006). Thus, keywords present the analyst with evidence that a conventional thematic qualitative analysis might obscure from view. The corpus was loaded into AntConc 3.2.0 for Mac concordance software. Word frequency lists were generated, followed by exploration of concordances and collocates (sequences of words or terms that co-occur more often than would be expected by chance) of words of interest. This analysis combination provided a multifaceted description of the linguistic features of MySpace profile and message board content.
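To make the two core operations concrete, the following Python sketch builds a word frequency list and a simple keyword-in-context (concordance) view. It is an illustration under stated assumptions, not the AntConc implementation: the tokenizer, the function names, and the context width are choices made here, and the toy corpus consists of two quotations reported in the Results.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into tokens of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(texts):
    """Count how often each token occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts

def concordance(texts, node, width=2):
    """Return (left context, node, right context) for every occurrence of `node`."""
    lines = []
    for text in texts:
        tokens = tokenize(text)
        for i, tok in enumerate(tokens):
            if tok == node:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                lines.append((left, tok, right))
    return lines

# Toy corpus: two quotations from the paper (assumed tokenization strips "<" and ">").
toy_corpus = [
    "im just a really outgoing love to have fun kinda girl",
    "i love you <name> ill talk to you later",
]

freqs = word_frequencies(toy_corpus)
love_lines = concordance(toy_corpus, "love")
```

Reading the concordance lines side by side is what lets an analyst see, for instance, whether a frequent word is directed at an activity or at a person. Identifying collocates in the statistical sense would additionally require an association measure, which this sketch does not attempt.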
Results
The 346 participants viewed 1.2 million webpages, including approximately 28,000 from MySpace. In all, 1,309 unique profile sections were identified (177,584 words), and 48,392 unique message board comments were identified (969,848 words). Our entire MySpace corpus consisted of 1,147,432 individual words. A word frequency list, displaying the top 30 most frequently used words, was generated (Table 1). The most common content word (as opposed to connector words like “and” and “you”) was the term “love.” Initial analysis focused on general language use in the profile and message board sections, and subsequent analysis focused on the use of the keyword “love” in context: how it clusters with other words, and the context in which it appears.
Table 1. Top 30 Keywords Appearing in MySpace Corpus.
General Language Analysis
The majority of language used in the profile sections reflected the profile owner describing both personal demographic information and the types of people they would like to meet. Profile creators would offer personal demographic information, such as name, nicknames, age, and hometown, as well as interests and hobbies. Positive language was used considerably more often than negative language in these descriptions, and, when talking about the type of people they would like to meet, profile owners described positive attributes and behaviors rather than characteristics that they wanted to avoid. Concordance analysis for the most common connector word, “you,” showed that the context of this word was often an attempt to engage the profile reader in a conversation. After presenting information about themselves, a typical profile might close with the following: “if you wanna know more, send me a message.” The profile sections contained language used to present information about the content creator to an audience. In contrast, the language used in the message board sections focused more on short conversations with the content recipient. Among the 10 most frequent words in the message board sections, the high frequency of the second person pronoun “you” (as well as its slang usage “u”), an individual person’s “name,” and the interjection “hey” reflects language use that is intended to grab the attention of the recipient for a one-on-one conversation. Short declarative statements were often used in the following ways: to touch base (“hey how’s it going? Haven’t heard from you in a while so I’m just dropping by to say hi.”); to make plans (“We should hang out sometime, preferably when the ice goes away . . . <city>’s not that far, right?”); to comment on photos (“aw, look at that face. you just make me smile.”); or to continue an extended conversation (“hey oh its cool my dad was running late and we didn’t leave until 6:45. so i am sorry also. yea we can meet up sometime for sure any time just tell me. have a good one.”).
Information detailing high-risk behavior was rare in the total MySpace corpus. No words identifying sexual acts, drinking alcohol, and/or smoking cigarettes or marijuana featured prominently in the top 240 words. When words referencing sexual acts (e.g., “sex,” “f-ck,” or “f-cking”), alcohol use (“drink,” “drinking,” “drunk,” “beer,” “booze”), or other drug use (“smoking,” “cigs,” “pot”) were identified, their use was infrequent and often focused within individual profiles rather than diffusely used throughout the whole MySpace corpus. Words identifying tobacco and/or marijuana use like “smoke” or “smoking,” for example, appeared only 66 times in 3.7% of the 1,309 profiles. The above-mentioned words were also infrequently found in the message board sections.
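The distinction drawn above, between how often a term occurs in total and how many profiles it occurs in, can be sketched in Python as follows. This is an illustrative assumption about how such counts might be computed, not the procedure used in the study; the function name and the four toy profiles are invented for the example.

```python
import re

def term_dispersion(profiles, terms):
    """Return (total occurrences of any term, share of profiles containing a term)."""
    terms = set(terms)
    total = 0
    profiles_with_term = 0
    for text in profiles:
        tokens = re.findall(r"[a-z']+", text.lower())
        hits = sum(1 for tok in tokens if tok in terms)
        total += hits
        if hits:
            profiles_with_term += 1
    share = profiles_with_term / len(profiles) if profiles else 0.0
    return total, share

# Invented toy profiles (not study data): all hits sit in a single profile.
toy_profiles = [
    "i love to dance and hang out with friends",
    "smoke smoke smoking all day",
    "love to write poetry",
    "hey send me a message",
]

total, share = term_dispersion(toy_profiles, {"smoke", "smoking"})
```

In the toy data the three occurrences all come from one of four profiles, which mirrors the pattern reported above: a term can accumulate a visible raw count while remaining confined to a small share of profiles.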
Keyword “Love” Analysis
Table 2 describes the most common expressions with the use of “love.” Profile owners would most often profess their love for activities, such as dancing, partying, or shopping, followed by their love for family, friends, and significant others. The most common phrase using love was “love to” (18% of its uses), describing the creator’s love for some activity. A typical expression was as follows: “im just a really outgoing love to have fun kinda girl.” In the message board comments, the use of the word love was more focused, most frequently centered on the message recipient. The word love was followed by some form of the second person pronoun (“you,” “ya,” or “u”) 48% of the time, a typical example of which is as follows: “i love you <name> ill talk to you later.” Table 3 illustrates the things that participants loved to do. These activities might involve connecting and socializing with other people, implied by “meet,” “dance,” “party,” “play,” “hang out,” and “chat.” Other activities might be undertaken by individuals, such as “write poetry” and “learn.” Table 4 presents a random selection of expressions using the keyword love that illustrate these types of usage in more detail.
Table 2. Most Common Word Clusters Using “Love.”
Table 3. Clusters Prefaced by “Love to” (N = 456).
Table 4. Twenty Random Selections With the Use of “Love.”
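The cluster percentages reported in this section (for example, the share of occurrences of love that are followed by “to” or by a second person pronoun) could be computed along the lines of the Python sketch below. The function names, the tokenizer, and the choice of denominator (every occurrence of the word, including those at the end of a text) are assumptions made for illustration; the toy corpus is two quotations from the paper.

```python
import re
from collections import Counter

SECOND_PERSON = {"you", "ya", "u"}

def tokenize(text):
    """Lowercase the text and split it into tokens of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def next_word_counts(texts, node="love"):
    """Count the tokens that immediately follow each occurrence of `node`."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        for current, following in zip(tokens, tokens[1:]):
            if current == node:
                counts[following] += 1
    return counts

def share_followed_by(texts, followers, node="love"):
    """Share of all occurrences of `node` immediately followed by a token in `followers`."""
    total = 0
    matched = 0
    for text in texts:
        tokens = tokenize(text)
        for i, tok in enumerate(tokens):
            if tok == node:
                total += 1
                if i + 1 < len(tokens) and tokens[i + 1] in followers:
                    matched += 1
    return matched / total if total else 0.0

toy_corpus = [
    "im just a really outgoing love to have fun kinda girl",
    "i love you <name> ill talk to you later",
]

clusters = next_word_counts(toy_corpus)
share_to = share_followed_by(toy_corpus, {"to"})
share_pronoun = share_followed_by(toy_corpus, SECOND_PERSON)
```

Grouping “you,” “ya,” and “u” into one set reflects the way the paper treats them as forms of the same pronoun; a different grouping would change the resulting share.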
Discussion
The prevailing common view of adolescence as a time of risk and deficit, to be managed for survival and harm minimization, has been contrasted here with a more positive view of youth as an asset, and adolescence as a time for exploration and increasing fulfillment through self-expression in a trusted community. In our random probability sample of adolescents’ actual Internet usage, as compared with self-report or sampling of available public content through networking sites or Internet search engines, adolescents expressed their “love” and positive feelings for activities and those around them, rather than describing high-risk, negative behaviors. These findings were consistent with the assertions of others that SNS offer adolescents a chance to share and expand on positive feelings, forming meaningful relationships with those around them.
In their description of the asset model of public health, Morgan and Ziglio (2007) emphasized the usefulness of health workers mapping the assets of the community with whom they wish to engage. Taliaferro and Borowsky (2012) similarly encouraged primary care providers to intentionally assess and reinforce adolescents’ “competencies, passions, and talents” within a youth development approach during clinical encounters. This study helps us better understand the assets of U.S. teens, in a naturalistic context and in their own words. They liked to socialize and physically connect, indicated by activities such as hanging out with friends, meeting people, chatting/talking, or cuddling. Some engaged in sports such as skating and snowboarding (the study period was over the winter months). They noted a range of creative activities, such as writing poetry, dancing, listening to music, and singing. These shared messages reflected a group of young people looking for like minds to share the activities they favored.
Previous work has shown that the concept of “health” espoused by young people is wider than that of adults, and includes issues relating to appearance and relationships (Head, 1987). In a study about children and young people’s sense of well-being, U.K. children and teens aged 7 to 16 years identified a range of factors, including family, friends, activities, being safe, and enjoying school (Counterpoint Research, corp creators, 2008). The personal profiles and message boards of adolescents in our study provide an understanding of the assets that bring friendship groups together and the activities that sustain young people. SNS are places where young people reflect the reality and aspirations within their lives. Professionals working with young people can engage them in reflection about the activities that they and their friends “love” in their SNS profiles and postings, to foster connectedness with that community and promote resilience.
Limitations
Our study has several limitations. First, the completion rate of the project was lower than hoped, at approximately 33%: The resulting study population may have been particularly motivated, and happy to let us see their Internet use as it was not likely to be risky. This effect might have been mitigated by the unobtrusive monitoring and the length of the monitoring period; for example, many participants viewed pornographic content, as noted in a previous analysis of this data set (Jenssen et al., 2009). Second, the study population was skewed toward adolescents from predominantly White and higher socioeconomic backgrounds and may not be representative of the larger U.S. adolescent population. Third, the webpage sample might not be representative of all of the adolescents’ Internet experiences. Many adolescents access the Internet from their homes and also at schools, libraries, and elsewhere (Lenhart & Madden, 2007). In addition, recent studies suggest adolescents are becoming more selective and discreet about the personal information they choose to display online (Patchin & Hinduja, 2010). That research, however, was focused on public information only, whereas our data included every page that the teen looked at, whether public or otherwise. Fourth, some words can be expressed in non-text formats: “love,” for example, can be expressed through the use of emoticons and heart graphics, so our corpus may underestimate its presence (Walker et al., 2009). Finally, the dominant SNS has changed over the last 5 years, with Facebook surpassing MySpace in terms of adolescent use. At the time of data collection, approximately 85% of adolescent profile owners reported that MySpace was the social network profile they used most often. As of July 2011, just one quarter of such adolescents reported having a MySpace profile at all, with over half of teen social media users having an account on Facebook (Lenhart et al., 2011). Conclusions drawn from content analyses of MySpace may not translate to Facebook. Nevertheless, this study reflects interesting findings from the early years of SNS use.
Conclusion
MySpace profile pages viewed by adolescents infrequently contained references to high-risk behaviors. Instead, these pages offered first-person narrative accounts about the profile owners’ demographics, personal interests, and everyday experiences. The most common content word was love, connected to a range of activities and social contacts within their online and offline environments. The personal information and messages shared by these teens reflected positive factors that could sustain them and enhance their well-being.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors received financial support from the AAP Julius B. Richmond Center of Excellence, funded by the Flight Attendant Medical Research Institute, and grant R01-CA140676 from the National Cancer Institute.
