Abstract
Purpose
This study aims to assess the prevalence of weight training content across various demographics as well as public attitudes toward weight training using social media as a proxy for organic sentiment.
Methods
A cross-sectional analysis was conducted using the hashtags #pregnancystrengthtraining (#PST) and #pregnancyweightlifting (#PWL). A new Instagram account was created, and two independent reviewers analyzed the first 100 posts under each hashtag. One hundred sixty-nine posts met inclusion criteria and were analyzed. Each post was evaluated for the demographics of the content creator (skin color, influencer type, professional background, and geographic region) and the sentiment of associated comments.
Findings
Sentiment analysis was performed using two natural language processors (NLPs): ChatGPT-3.5 and MonkeyLearn. For both #PST and #PWL, both NLPs identified majority positive sentiment under analyzed posts. Posts from Europe and from influencers with lighter skin tones dominated top results, but posts were relatively well-distributed across the follower size and profession of the influencer. In general, analysis of #PWL yielded a lower percentage of positive-sentiment comments when compared to #PST. This trend held regardless of skin tone, occupation of influencer, follower count, or global region, though small sample sizes within each category limited statistical significance for these subgroups.
Discussion
Sentiment toward pregnancy-related weight training content on Instagram is largely positive across both hashtags. Given the demographic consistency shown in our results (despite lower sample sizes in some skin tones), social media attitudes may reflect larger societal beliefs regarding strength training during pregnancy as a positive endeavor. Because top posts did not have a significant correlation to follower count, providing hashtag-based health guidance may be an achievable way for the healthcare community to interact with pregnant women regarding healthy lifestyle behaviors.
Introduction
Pregnancy is a unique physiological state characterized by alterations in cardiovascular, musculoskeletal, and metabolic systems. While many of these changes accompany symptoms that promote sedentary behavior, such as nausea, fatigue, and poor sleep quality, attitudes around the safety of exercise during pregnancy for mother and baby have been contested.1–3 Modern attitudes around exercise in pregnancy find their roots in 19th-century recommendations, which advocated for an entirely sedentary “confinement” inside one's home. Many exercises were believed to be harmful even for nonpregnant women, due to a mistaken belief that exercise would displace the uterus and affect fertility. 3 Studies have consistently found that the general public views exercise with weights as unsafe and unnecessary when pregnant. 4 More recent guidance from the 20th century largely recommends simple daily activities accompanied by light exercise, though this was similar to the exercise advised for women overall. 2
Today, multiple studies have shown the maternal benefits of exercise during pregnancy. A study of 843 pregnant women divided into groups based on exercise level found that the high exercise group was associated with a lower incidence of cesarean section. The rate of cesarean section was 6.7% in the high-exercise group, compared with 19.0% in the medium group, 23.1% in the low-exercise group, and 28.1% in the control group. 5 These benefits may be specific to certain maternal health outcomes and may not be fully generalizable. For example, a metareview of 15 randomized controlled trials assessing the impact of aerobic exercise on multiple metrics of maternal health found that, while aerobic exercise decreased gestational weight gain, data surrounding gestational diabetes mellitus was limited and other maternal outcomes lacked sufficient evidence. 6
Additionally, numerous studies have confirmed that exercise during pregnancy is safe for the developing fetus. For example, a study of 320 healthy Caucasian women found that 60 min of exercise 3x/week throughout the second and third trimester of pregnancy did not increase the risk of preterm delivery in healthy women. 7 Caution regarding exercise is typically reserved for activities that increase the core body temperature more than 1.5°C within 30 min, due to potential teratogenic effects. 8 Additionally, the American College of Obstetricians and Gynecologists recommends that pregnant women avoid exercise that could cause dehydration, which has been associated with a small increase in uterine contractions. 8
However, updated guidance has largely focused on aerobic exercise during pregnancy, with weight lifting remaining largely unexplored. A study on resistance training in pregnant women with gestational diabetes found that, among women who were overweight (prepregnancy body mass index > 25 kg/m2), those who engaged in resistance training plus a healthy diet used less insulin and had a longer delay from diabetes diagnosis to initiation of insulin therapy than the diet-only group. 9 This is one of the few studies assessing resistance training exclusively as a form of exercise during pregnancy. Metareview analysis of exercise in pregnancy has confirmed the lack of high-quality randomized controlled trial data regarding exercise with weights, such as resistance training. 6 Without evidence-based guidance, many women may turn to resources outside of their obstetrician's office if they have questions or concerns.
More recently, pregnant women have turned to online communities as a place to connect and find recommendations for how to manage their pregnancy. 4 A study by Zhu et al. found that 95% of participants reported a positive mental impact from using social media during pregnancy, indicating that these platforms can enhance mental well-being by providing a sense of community and support. 10 Additionally, a review of 15 randomized controlled trials found that targeted interventions on social media platforms can have a positive impact on weight management, controlling gestational diabetes, and improving maternal mental health. 11 That said, most of the information available on social media platforms is not screened for accuracy, leading to high rates of misinformation. 4 Even in posts that have medically accurate information, such as those described in Hayman et al., user attitudes in the comments section often promoted outdated historical guidelines such as those we discussed above. 4 As such, analyzing user attitudes in social media posts and comments can give insight to the healthcare community about what type of information (both accurate and inaccurate) their patients are likely to encounter.
Our study uses Instagram as an example of a social media platform, due to its mixture of professional and amateur content creators, high user engagement, and use of the targeted hashtag system to allow users to search for specific topics of interest. Instagram, founded in 2010, currently has 2 billion monthly active users and over 500 million daily active users; it is the third most popular social media platform worldwide. 12 The platform allows users to share photos and reels (short videos 15–90 s in length) on their profiles, complete with captions, hashtags (#) for categorizing, and user tags (@). The Instagram search function allows users to search by keywords or specific hashtags.13,14 People can follow as many accounts as they like and enjoy a continuous feed of posts from those they follow, where they can like or comment on content. Instagram also recommends accounts based on previous user interaction. While the focus of Instagram is primarily on photographic and video content, users can express their opinion about the content they are viewing via the comment section.13,14 To analyze the public response to Instagram photos and reels, we elected to use natural language processors (NLPs) to assign sentiment to each comment under a post in an effort to reduce reviewer bias.15–17 Both tools used (ChatGPT-3.5 and MonkeyLearn) allow for the classification of text sentiment as positive, negative, or neutral, even for long comments underneath a post or a reel. 17 Analyzing sentiment on social media, particularly Instagram, around prenatal weightlifting offers valuable insight into public perception of this topic and could guide preemptive prenatal counseling about weight training in the future.
Additionally, new and emerging research indicates that social media feeds vary widely between demographics, and that certain images are recommended to individuals of a certain demographic over others. 18 As such, we determined that assessing sentiment regarding weight training in pregnancy would provide valuable insight across the following demographics: creator size, creator profession, skin tone, and continent of residence associated with the post. Firstly, creator size is useful in assessing whether weight training in pregnancy is associated with different sentiments between career influencers who reach millions with their content and smaller influencers across a variety of professions who may create pregnancy content as part of a specific niche. Skin tone is useful in assessing whether attitudes around weight training differ between creators of various skin tones. Existing literature shows that women with darker skin tones experience disproportionately poorer maternal and fetal health outcomes, influenced by systemic inequities and cultural perceptions. 19 While much research has explored clinical contributors, few have considered whether digital representation and public perception, particularly around exercise, could play a role. Investigating how social media users respond to prenatal weightlifting content based on the perceived skin tone of the influencer could offer a novel perspective on how digital bias may reinforce or challenge real-world health disparities. Lastly we wanted to assess sentiment based on nation of residence, further broadened into continents during data analysis, because pregnancy recommendations can vary between healthcare systems.
Our study uses Instagram posts as an example of organic self-directed social media engagement and weight training as an example of a health-promoting behavior that exists without standardized prenatal guidance. We aim to assess pregnant women's engagement with weight-training content on Instagram in two ways. First, we will assess the prevalence of weight training content in various demographic communities. Second, we will look at attitudes expressed around weight training in pregnancy, both in the original content itself and in the response reflected in the comments section.
Methods
A cross-sectional analysis of Instagram comments via hashtags #pregnancystrengthtraining (#PST) and #pregnancyweightlifting (#PWL) was conducted on August 3rd, 2023. A hashtag is an alphanumeric string preceded by a hash (#) symbol; on social media posts, hashtags are typically used to denote wider context or target the post to a specific audience. 20 Hashtags have been found to increase popularity and engagement.20,21 Using a new Instagram account, two independent reviewers analyzed the first 100 posts for each hashtag: the first 50 photos and the first 50 reels, for a total of 200 posts. Inclusion criteria were Instagram posts or reels with one or both hashtags that had English-language comments. Posts or reels were excluded if they did not include either hashtag, had no comments, or had comments only in languages other than English. Of the 200 Instagram posts identified, 37 were excluded and 169 were analyzed. A selection of Instagram posts is shown in Figure 1. Content was analyzed across two metrics: the demographics of the Instagram influencer and the sentiment of comments under each post.

Figure 1. Identification of Instagram posts.
Comment-related text was extracted from each post and entered into two tools. First, the text was entered into the MonkeyLearn Sentiment Analyzer (2023), and the confidence percentage of the sentiment was recorded. A chi-square goodness-of-fit test was performed to compare the observed proportions of sentiments with the expected distribution. Second, the text was entered into ChatGPT-3.5 (July 20 version) using the prompt “Given this text, what is the sentiment conveyed? Is it positive or negative? Text: {transcript}.” The most commonly occurring sentiment was then selected as the overall sentiment of the transcript. Chi-square goodness-of-fit and independence tests were performed to compare the observed proportions of sentiments with the expected distribution.
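The steps described above (filling the prompt template, taking the most common label, and running a goodness-of-fit test) can be illustrated with a minimal sketch. It is not the authors' code: the function names, the example counts, and the choice of a uniform expected distribution are assumptions made only for illustration, since the expected distribution used in the study is not stated. The p-value helper relies on the closed form for a chi-square distribution with 2 degrees of freedom, which applies when there are three sentiment categories.

```python
import math
from collections import Counter

# Prompt wording taken from the Methods text; {transcript} is the placeholder.
PROMPT_TEMPLATE = (
    "Given this text, what is the sentiment conveyed? "
    "Is it positive or negative? Text: {transcript}"
)


def build_prompt(transcript: str) -> str:
    """Fill the prompt template with the comment text of one post."""
    return PROMPT_TEMPLATE.format(transcript=transcript)


def majority_sentiment(labels):
    """Return the most frequent label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def chi_square_gof(observed, expected_props=None):
    """Chi-square goodness-of-fit statistic and degrees of freedom.

    expected_props defaults to a uniform distribution, which is an
    assumption of this sketch rather than a detail from the study.
    """
    total = sum(observed)
    k = len(observed)
    if expected_props is None:
        expected_props = [1 / k] * k
    expected = [total * p for p in expected_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, k - 1


def p_value_df2(stat: float) -> float:
    """Upper-tail p-value for a chi-square statistic with exactly 2 df."""
    return math.exp(-stat / 2)


if __name__ == "__main__":
    # Hypothetical counts of positive, neutral, and negative posts.
    counts = [80, 15, 5]
    stat, df = chi_square_gof(counts)
    print(majority_sentiment(["positive", "neutral", "positive"]))
    print(stat, df, p_value_df2(stat))
```

In practice a statistics library would normally be used for the test; the hand-written version here only makes the arithmetic visible.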
Demographics
Content characteristics, including profile name, post type, influencer type, creator profession, age, influencer size, region/continent, and skin tone, were obtained from each analyzed post, which included 200 total posts and reels. Influencer size categories were created based on the number of followers each user had. Five categories were created: nanoinfluencer for fewer than 10,000 followers, microinfluencer for 10–50 k followers, mid influencer for 50–500 k followers, macroinfluencer for 500 k–1 million followers, and mega influencer for more than 1 million followers. An influencer is defined as an individual with a large or highly engaged social media following who holds considerable sway in specific industries. Size categorizations for influencers vary widely, especially in academic publications, as the industry is so new. The delineations we created are drawn from an average of “creator size” brackets used by a variety of influencer marketing agencies, since their categories tend to be internally consistent and correspond to similar perceived social media reach between agencies.22–24
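As a rough illustration of the size categories, the sketch below maps a follower count to a tier. It is not the authors' code. The function name is hypothetical, the micro tier is assumed to run from 10,000 up to 50,000 so that the tiers are contiguous, and the handling of counts that fall exactly on a boundary is an assumption, since the text does not say which tier a boundary value belongs to.

```python
def influencer_size(followers: int) -> str:
    """Map a follower count to one of the five size tiers.

    Boundary handling is an assumption of this sketch: lower bounds are
    inclusive, and exactly 1,000,000 followers counts as "macro" because
    the mega tier is described as "more than 1 million".
    """
    if followers < 0:
        raise ValueError("follower count cannot be negative")
    if followers < 10_000:
        return "nano"
    if followers < 50_000:  # assumed upper bound of the micro tier
        return "micro"
    if followers < 500_000:
        return "mid"
    if followers <= 1_000_000:
        return "macro"
    return "mega"


if __name__ == "__main__":
    for count in (950, 12_000, 75_000, 800_000, 2_500_000):
        print(count, influencer_size(count))
```

Keeping the thresholds in one function makes it easy to re-run the categorization if a different set of brackets is preferred.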
For creator profession, five categories were made: professional athlete, athletic trainer, personal trainer, nonprofessional/lifestyle creator, and other. Post type was divided into reels or posts. Region/continent was sorted into North America, Europe, Australia, Asia, Africa, South America, and unspecified. Age categories included 18–34, 35–54, 55+, and unable to say. Channel type was categorized into personal, professional, public, athletic, and parenting based on the overall profile and its content. Lastly, skin tone was determined using the Massey and Martin Scale (Figure 2).25,26 This scale, which includes a range of tones from light to dark, was employed to systematically assess the perceived skin tones of content creators. Each user's skin tone was assigned a corresponding value on the scale, which was then analyzed for demographic and engagement patterns within the study. A table of content subgroups is shown in supplemental Table A in the appendix.

Figure 2. Massey and Martin Skin Color Scale for Instagram users.
For creator profession, a professional athlete account refers to someone who is a professional athlete, such as an Olympian or paid sports player. An athletic trainer is someone who has a master's degree from an accredited athletic training program. They must pass a comprehensive exam administered by the Board of Certification for the Athletic Trainer, and their scope of practice includes prevention, examination, diagnosis, treatment, and rehabilitation of injuries and medical conditions. A personal trainer is someone who has a high school diploma and has received certification from one of many organizations, such as National Academy of Sports Medicine. Personal trainers focus on designing exercise programs to improve fitness levels and do not typically manage acute injuries. Nonprofessional/lifestyle creators are persons with no formal degree or certification in exercise physiology but share their active lifestyle with others on social media platforms. “Other” refers to any additional creator profession that does not fall in the above categories that was encountered.
Channel type was sorted into personal, professional, public, athletic, and parenting. A personal channel is an account that shares the user's personal life. A professional channel includes sharing a user's life, but in the scope of their profession only. A public channel is focused on a broad range of topics and subjects. An athletic channel shares an athlete's life exclusively through the scope of their sport, and a parenting channel shares content exclusively related to the raising of children. Regions/continents were sorted as described above. Place of residence was identified via a publicly available location tag found on the creator's profile. Both flags and names of countries of residence were used for our study. If there was no indication of the user's location, they were categorized as “unspecified.”
Sentiment analysis
Comment-related text was analyzed with two language processing tools for its associated sentiment. An NLP is a software tool designed to enable computers to understand, interpret, and generate human language. It bridges the gap between human communication and computer processing. These tools use techniques from linguistics, machine learning, and deep learning to process and analyze large amounts of language data.27,28 Their applications are widespread, from chatbots and virtual assistants to sentiment analysis in social media or customer reviews. ChatGPT is an NLP powered by advanced machine learning, specifically a model called a generative pretrained transformer (GPT). 28 ChatGPT is designed to engage in human-like conversations, answer questions, provide information, generate creative text, and perform various language-based tasks including code generation. It can understand context: by processing the conversation's flow, it generates coherent responses. 28 MonkeyLearn is a machine learning platform specifically built for text analysis and language processing. It focuses on simplifying language processing tasks for nontechnical users, allowing businesses and individuals to extract valuable insights from text data. Some key features of MonkeyLearn include text classification, which allows for categorizing text (e.g. sentiment analysis, topic categorization), and entity extraction, which helps identify key information like people, places, and dates. MonkeyLearn also offers accessible text clustering, where similar pieces of text are grouped together based on topics or themes; this process previously required coding knowledge. 29 Comment-related text was inputted into both NLPs and assessed using the process described above.
Results
Results were categorized for each hashtag, first by sentiment alone and then by sentiment across several metrics, including skin tone, influencer type, and region of the world. Two hundred posts were identified and are classified in supplemental Table B.
After applying the inclusion and exclusion criteria, 86 content creators were evaluated for #PST, while 81 content creators were evaluated for #PWL. For #PST, overall sentiment was highly positive regardless of the language model used: ChatGPT categorized 83% of comments as positive, while MonkeyLearn categorized 80% of comments as positive (p < .0001). For #PWL, overall sentiment was still positive, though less so. ChatGPT categorized 62% of comments as positive, while MonkeyLearn categorized 75% of comments as positive. Overall, ChatGPT labeled a higher percentage of statements as neutral when compared to MonkeyLearn. Sentiment analysis for each hashtag using both AI models is shown in Figures 3 and 4.

Figure 3. Comparison of sentiment analysis of #pregnancystrengthtraining using MonkeyLearn and ChatGPT.

Figure 4. Comparison of sentiment analysis of #pregnancyweightlifting using MonkeyLearn and ChatGPT.
Regarding sentiment and skin tone, sample sizes were largest for skin tones 1–3 on the Massey and Martin Skin Color Scale. Tones 1–2 predominated for #PST (n = 37, 43%; n = 25, 29%, respectively), and tones 2–3 predominated for #PWL (n = 34, 41%; n = 26, 31%, respectively). Notably, while all results were majority positive regardless of skin tone or language model used, very few results were statistically significant. For #PST, no statistically significant association was found between skin tone and the sentiment of associated comments. For #PWL, only MonkeyLearn identified a statistically significant association between skin tone and sentiment of comments; ChatGPT found no association. Within the MonkeyLearn assessment of #PWL and skin tone, skin tones 1, 3, 4, 6, and 7 had 100% positive comments. Sentiment was still largely positive for skin tone 2 (66.67% positive, 29.17% neutral, and 4.17% negative) and skin tone 5 (71.43% positive, 0% neutral, and 28.57% negative). There were no entries for skin tones 8–10. Results are shown in Table 1 for ChatGPT and Table 2 for MonkeyLearn.
Table 1. Sentiment analysis of skin tone for each hashtag using ChatGPT.
Table 2. Sentiment analysis of skin tone for each hashtag using MonkeyLearn.
Similar to skin tone, sentiment analysis for influencer size was largely positive but with limited significance. ChatGPT sentiment analysis did not identify any significant association between influencer size and sentiment for either #PST or #PWL. MonkeyLearn did not identify any significant association between influencer size and sentiment for #PST, but results were significant for #PWL. When assessed by MonkeyLearn, content produced by nanoinfluencers received 94.67% positive sentiment, 1.33% neutral sentiment, and 4% negative sentiment across 75 posts. Microinfluencers had more mixed results, with only 66.7% positive and 33.3% negative sentiment. There was no activity in #PST or #PWL from mid, macro, or mega influencers. We did find that the majority of content creators for both of these hashtags were athletic trainers, with athletic accounts being responsible for 62% of content under #PST and 54% of content under #PWL. Results are shown in Table 3 for ChatGPT and Table 4 for MonkeyLearn.
Table 3. Sentiment analysis of influencer size for each hashtag using ChatGPT.
Table 4. Sentiment analysis of influencer size for each hashtag using MonkeyLearn.
Regarding sentiment and region, results were largely not statistically significant, likely due to low regional identification among users. While some users indicate their primary country of residence, almost half of the posts under #PST (47.7%) and the majority of the posts under #PWL (79.0%) did not have an identified region. Among posts where regions could be identified, Europe had the highest representation, followed by North America. Under both language models and both hashtags, all North American posts received comments with 100% positive sentiment. Interestingly, ChatGPT identified a much higher percentage of positive sentiments for posts identified with European creators for #PST (96.43%) when compared to #PWL (61.54%). By contrast, MonkeyLearn identified almost exclusively positive sentiment with both hashtags: 92.86% for #PST and 92.31% for #PWL. Given the low sample size and lack of statistical significance of sentiment when separated out by region, we do not find the lower positive sentiment for ChatGPT analysis of #PWL in Europe-identified posts to be noteworthy. Results are shown in supplementary Table C for ChatGPT and supplementary Table D for MonkeyLearn for all regions.
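The associations between a creator attribute (skin tone, influencer size, or region) and comment sentiment reported above rest on chi-square tests of independence. The sketch below shows how such a test works on a small contingency table. It is not the authors' code, and the counts are hypothetical rather than taken from the study tables. The p-value uses a closed form that holds only for even degrees of freedom, which is always the case when there are three sentiment columns.

```python
import math


def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for an r x c table.

    Rows are groups (e.g. influencer size tiers); columns are sentiment
    categories (positive, neutral, negative).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected == 0:
                raise ValueError("empty row or column; drop it before testing")
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi_square_p_even_df(stat, df):
    """Upper-tail p-value for a chi-square statistic with even df.

    Uses exp(-x/2) * sum_{i < df/2} (x/2)^i / i!, valid for even df only.
    """
    if df <= 0 or df % 2:
        raise ValueError("closed form requires positive, even df")
    half = stat / 2
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    # Hypothetical counts: rows are two size tiers, columns are
    # positive, neutral, and negative posts.
    counts = [[30, 5, 5], [10, 5, 5]]
    stat, df = chi_square_independence(counts)
    print(stat, df, chi_square_p_even_df(stat, df))
```

When many cells have small expected counts, as in several of the subgroups here, the chi-square approximation becomes unreliable, which is one reason the subgroup results should be read cautiously.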
Discussion
Regarding overall sentiment, we found that both language models identified majority positive sentiment associated with both hashtags. ChatGPT identified 83% of comments as positive for #PST and 62% of comments as positive for #PWL. Similarly, MonkeyLearn identified 80% of comments as positive for #PST and 75% of comments as positive for #PWL. The difference between the two models indicates that ChatGPT is less likely to assign positive sentiment relative to MonkeyLearn, instead classifying those sentiments in the neutral category. Additionally, both models identified that fewer positive comments were associated with weight lifting, as opposed to strength training. Lastly, it is important to note that the high percentage of positive comments may not be organic. Instagram allows content creators to delete comments under their posts. While a viral video may have comments that appear so rapidly that editing is not possible, posts with smaller numbers of total comments, like those in our study, could have had their comment section edited.
Sentiment analysis data reveals a strong skew toward positive sentiment across all skin tones, and chi-square tests of independence between skin tone and #PWL were significant for the MonkeyLearn model. Additionally, #PWL did have a slightly darker skin tone predominance (tones 2–3) when compared with #PST (tones 1–2). Attitudes around weight lifting and skin tone may be a subject of future study, because our findings here are limited due to sample size. Within a specific skin tone, our results regarding sentiment lose their significance. For example, all negative comments under #PST were confined to lighter skin tones (tones 1–2), but sample sizes dropped precipitously after skin tone 2. Skin tones 1–2 make up 72% of the dataset for #PST, and there are only 10 posts total for skin tones within the medium and dark ranges (tones 3–7). The lack of representation for darker-skinned users in top posts is a limitation of our study. The predominance in negative comments associated with posts that feature lighter-skinned creators may be due to a genuine difference in sentiments around strength training and weight lifting in various communities, due to differences in beauty standards. However, this skew in results could also be a combination of small sample size and randomness.
Of note, the lack of posts featuring darker-skinned creators could point to a limitation of the Instagram algorithm itself. We are unsure at this time whether Instagram specifically recommends posts from lighter-skinned users over others when searching for hashtags, or how its algorithm incorporates skin tone into post sorting, if at all. Other possibilities include a lower engagement with the hashtags in certain communities of color. A study by Paradkar et al. found that darker skin tones were represented less frequently than lighter skin tones at the top of search results for dermatology influencer posts in 2019. 30 While findings from a single study have limited applicability, they are consistent with some of the disparities we observed in our research. Future studies should explore the Instagram algorithm across a broader range of hashtags with larger sample sizes and controlled follower counts to rule out bias in post sorting. Additionally, future studies could query Instagram users of various skin tones to track their engagement with the platform and assess concordance with creator skin tone and type of content. This would assess whether the comments predominating under posts with light-skinned creators are due to a variation in levels of engagement from their predominant user base.
Our analysis for both influencer size and type of account yielded two large trends. First, we found that for both #PST and #PWL, nanoinfluencers had the highest volume of posts. Second, we found that athletic accounts were responsible for 62% of content under #PST and 54% of content under #PWL. Regarding sentiment, the MonkeyLearn model identified a significant association between influencer size and sentiment under #PWL, with nanoinfluencers receiving a higher share of positive comments than creators with larger follower counts. Other hashtags, account type, and the ChatGPT language model did not yield statistically significant results. We also did not find any statistically significant trends for sentiment in either language model when plotted against the country of residence for the content creator.
Taken in conjunction, these trends suggest that disseminating health information would be relatively easy for a health organization to accomplish, so long as specific hashtags are used. A large number of followers is not required to appear in the top 50 posts under a hashtag, and while athletic trainers were the predominant type of content creator, ultimately sentiment under those posts was not significantly different from sentiment under posts created by laypeople. A physician or research group could create informational content, use specific hashtags, and instruct obstetrician-gynecologists to direct their patients to search those hashtags if they have questions about a particular prenatal topic, with relative confidence that those posts would appear near the top of the search results. Given that the United States is facing an increasing healthcare shortage, providing pregnancy recommendations in a way that is professionally produced but accessed organically may be a way to widely disseminate medically accurate information.
Conclusion
Our study aimed to determine popular sentiment around pregnancy and strength training using popular social media platforms as a proxy for general sentiment among pregnant women. Results indicate that, when searching by #PST and #PWL, sentiment in the comments section was majority positive, regardless of the creator's skin tone, professional occupation, follower size, or the region of the world where they reside. The overwhelmingly positive sentiment suggests that pregnant women are interested in and engaged with weight training in pregnancy. That said, results lacked significance regarding multiple influencer demographics, including skin tone and country of residence associated with the post. This could be due to algorithmic amplification of certain voices, particularly of lighter-skinned European creators, or due to lower engagement in other communities. More research is needed to determine whether sentiment around weight training in pregnancy is positive regardless of one's demographic background or if there are differences that were not revealed within our dataset. Regardless, the high preponderance of positive responses indicates that women are positively engaged with pregnancy weightlifting content and that physicians and health providers could potentially leverage that interest towards positive outcomes with targeted interventions on social media platforms.
Supplemental Material
Supplemental material, sj-docx-1-dhj-10.1177_20552076251383603 for Pregnancy, power, and perception: An AI analysis of Instagram sentiment on prenatal weightlifting by Grace Basralian, Claire Wolford, Kaitlyn Voity, Sharon Galperin and Antonia F Oladipo in DIGITAL HEALTH
Footnotes
Ethical considerations
There are no human participants in this article and informed consent is not required.
Author contributorship
All authors on the manuscript contributed to the design and implementation of the research, to the analysis of the results and to the writing of the manuscript.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Guarantor
Antonia F. Oladipo MD, MSCI accepts full responsibility for the work and/or the conduct of the study, had access to the data, and controlled the decision to publish.
Supplemental material
Supplemental material for this article is available online.
References
