Abstract
Social media platforms offer people a variety of options to engage with politics, from directly following elected officials to discussing politics with social peers. Despite major advances in recent research on online political exposure through the lens of selective exposure, filter bubbles, and ideological echo chambers, little is known about the fundamental questions of what types of political actors people are exposed to on social media, and how these distinctive types vary across sociodemographic groups. We address this gap in the literature by analyzing unique panel data on more than 600,000 registered U.S. voters on Twitter during the 2020 U.S. Presidential campaign. We analyze this dataset to identify distinct types of political consumers and how they vary in terms of sociodemographics. Our findings suggest that the bulk of the population has a meaningful share of political content available from social peers, that the majority of this content originates from traditional sources of political information (media organizations, journalists, and politicians), and that media organizations are the dominant and direct source of political information on Twitter for nearly 20% of the sample population. These results advance our understanding of the way citizens learn about politics in new media and pave the way for next-step research to identify the causal effect of exposure to distinct curators of political content on individuals’ political attitudes and political behavior.
Introduction
The tectonic shifts in the media environment and the rise of social media platforms over the past two decades significantly changed the ways in which people are exposed to news and political information worldwide (Fletcher and Nielsen 2018). This trend has been particularly swift in the United States, as Americans are now exposed to news more often on social media than in print, and for younger generations, social media has become the dominant channel for news (Shearer 2018).
In this increasingly networked media environment, the information that populates one's feed is an amalgamation of curation decisions taken by others, including social peers, journalists, politicians, advertisers, and proprietary ranking algorithms (Thorson and Wells 2016). For example, Bakshy et al. (2015) showed how selective exposure on Facebook is partially determined by Facebook’s Newsfeed ranking algorithm and determined more dominantly by the individual’s choice of whom to follow. Of course, the effects of social media and digital media use writ large extend beyond the online world, and a growing body of research shows mobilization effects, where digital media use is associated with more traditional forms of political participation offline (Oser and Boulianne 2020; Vaccari et al. 2015; Vaccari and Valeriani 2021). Therefore, it is no surprise that issues of power and control (Barzilai-Nahon 2008), limits of free speech (Morrow et al. 2021), and individual choice (Bakshy et al. 2015; Robertson et al. 2023) in political exposure on social media are some of the most contested topics of our time.
Despite the clear importance of advancing scholarly and real-world knowledge about political exposure on social media, we know relatively little about two key parameters of political exposure, namely: the prevalence of different types of actors in the stream of political content people get from their ego-networks, and how political exposure varies across sociodemographic groups. Currently, no social media platform provides precise individual-level or comprehensive aggregate-level information about exposure, which poses a key obstacle to the field for advancing research on these crucial topics. The “Social Science One” initiative (King and Persily 2020) does provide aggregate information about viewership, but this information is limited in several important ways, namely: it is currently limited to Facebook data, includes only URLs and not all political content, does not distinguish eligible from noneligible voters, and does not provide information about the person who posted the content. In lieu of more precise measurement, researchers have made recent contributions on these topics by relying on self-reported measures of political consumption and general-purpose web tracking data, while acknowledging the serious selection bias challenges that are inherent to this approach (Guess 2021; Wojcieszak et al. 2022b).
In this study, we build on the contributions of this extant literature on political exposure by using a research design that leverages a large panel of U.S. registered voters and their activity on Twitter. The combination of these two data sources—that is, Twitter data and registered voter data—creates the opportunity to ask and answer research questions that correspond to the two key parameters of political exposure noted above as requiring scholarly attention, namely: What are the types of political exposure on social media from different types of actors (RQ1)? And how do these types of political exposure vary across sociodemographic groups (RQ2)? To answer these research questions, we build on the curated flows theoretical framework developed by Thorson and Wells (2016) to identify the political content that is available to registered voters on Twitter and curated by different actors, including media organizations, journalists, politicians, opinion leaders (OLs), and social peers. We do so by using clustering methods that identify the prototypical modes of political exposure in terms of the breakdown of the actors responsible for this content exposure, and by identifying the sociodemographic covariates of each distinctive cluster.
Using this novel methodological approach, our contributions are twofold. First, we provide new empirical evidence about the prototypical modes of political exposure—both in terms of quantity of political content and composition of different actors who curate this content—by a large and representative sample of registered U.S. voters on Twitter. Second, we present findings on the varying levels and compositions of political exposure by different sociodemographic groups of registered U.S. voters on Twitter. Taken together, our contributions begin to address some of the most basic, yet unanswered, questions at the heart of the curated flow framework and social media communications: who are the most significant curators in political communication and for whom.
The Importance of Political Exposure Online and on Social Media
Numerous studies show that online political exposure and information consumption on social media are related to political attitudes and behaviors, both online and offline. For example, Valeriani and Vaccari (2016) found that accidental exposure to political information on social media is positively associated with online political participation in multiple national contexts. A recent meta-analysis concluded that incidental exposure, an unintended form of exposure that is common on social media, is positively associated with a variety of pro-democratic attitudes and behaviors including news use, political knowledge, political participation, expressive engagement, and political discussion (Nanz and Matthes 2022). In contrast, overreliance on the expectation that news will “find you” on social networks is negatively associated with important sociopolitical indicators of political knowledge, political interest, and voting (Zúñiga and Diehl 2019). Weeks et al. (2017) further find that counterattitudinal incidental exposure on social media drives processes of selective exposure among stronger partisans, which subsequently leads to greater political information sharing.
A stream of recent studies informed by Thorson and Wells’s curated flows framework has shown that the impact of political messaging also depends on the type of actor who is delivering it, as the same political message received from different types of sources may have a different impact on attitudes and behavior. For example, recent research indicates that statements by celebrities and online influencers seem to affect the public’s real-world beliefs more strongly than similar statements by noncelebrities (Alatas et al. 2019; Alrababa’h et al. 2021; Suuronen et al. 2021). In the realm of media sources, research has shown that high levels of exposure to media outlets with high levels of political content shape political knowledge and behavior, including the propensity to vote (de Vreese and Boomgaarden 2006). Turning to the domain of peer networks, research by Graham et al. (2015) showed that over half of the political discussions in online forums in the U.K. led to at least one political action. The importance of the clear identification of actors is evident in Taylor et al.’s (2022) large-scale longitudinal field experiment, which showed that content provided by anonymous sources is less impactful on viewers’ opinions and behaviors compared to content shared by identified individuals with known reputations. Taken together, this emerging research indicates that the messenger’s identity may be as important as the message itself.
Who Is Heard and by Whom in Political Communication
A central element in democratic theory is the expression of ideas to allow public information-sharing and deliberation (Habermas 1984). While research on the ways citizens construct their information diets certainly precedes the digital era (Katz et al. 1974; Sears and Freedman 1967), the shift to online media—accompanied by the weakening of traditional gatekeepers, and the context collapse that is common on social media (Davis and Jurgenson 2014)—calls for renewed attention to the fundamental question of who is being heard in modern political communication. Addressing this question is important for advancing our understanding of the extent to which social media, and information systems more broadly, fulfill their egalitarian potential (Allen 2015) or reinforce old political structure and power as the weapon of the strong (Hindman 2009).
As noted, the theoretical and empirical importance of examining who is being heard is highlighted by Thorson and Wells’ (2016) discussion of the role of individual-level “curation” for understanding media exposure and its effects. While individuals choose whom to follow, the notion of curation emphasizes the agency of external actors over the composition of one’s social media feed. In particular, the curated flows framework lists a number of key actors including social peers, journalists, politicians, advertisers, and proprietary ranking algorithms. Merten (2021) explored the decisions (e.g., follow, block, or hide) users report taking in response to news curation by others. However, there is little empirical work that shows the relative prevalence of different actors in individuals’ political exposure (Wells and Thorson 2017). Two notable exceptions are the recent work by Wojcieszak et al. (2022b), which sheds new light on the channels (search engines, social media, aggregators, etc.) that lead people to news, and the work by Jürgens and Stark (2022), which measured the diversity of news accessed through different channels. Nevertheless, we need to follow once more Prior’s (2009) call for better measurement of news exposure to advance our understanding of the media effects of social media and gain a better understanding of the ways political learning takes place on such social platforms (Bode 2016). Currently, little is known about the amount of political content people are exposed to on social media, and about the different kinds of actors involved in conveying this information. Therefore, our first research question is the following:
RQ1: What are the types of political exposure on social media from different types of actors?
A key element in the composition of political exposure is political ideology and the range of ideas being represented. Some recent work indicates that exposure to political content through online social networks may serve to increase political polarization (Bail et al. 2018; Garrett et al. 2014; Shmargad and Klar 2020). Yet, other studies indicate that social media exposure through weak ties and the visibility of social endorsements reduce polarization by offering diversity of exposure (Barberá 2015; Messing and Westwood 2014). People more frequently follow media and politician accounts that align with their ideology (Eady et al. 2019; Wojcieszak et al. 2022a), but there is still substantial overlap in people’s news diets (Guess 2021). Particularly because there is no consensus about the polarizing effects of media or social media (Prior 2013; Zhuravskaya et al. 2020), it is important that we refine our understanding of political exposure on social media and consider it jointly with political ideology.
Sociodemographic characteristics are also linked to political consumption. Consistent with Vaccari and Valeriani’s (2021) call to move beyond the “one-effect-fits-all” fallacy, we draw on prior literature to assess how key sociodemographic characteristics relate to distinctive types of political exposure. First, there is a well-documented age gradient observed in the level of interest in politics and the self-efficacy of individuals (Verba et al. 1995). As younger generations increasingly get their news on social media (Shearer 2018), it is important to study the types of political content they are getting. In general, research shows that those with traditionally advantaged sociodemographic backgrounds (e.g., male, older) are more active politically, including efforts to seek out political content (Schlozman et al. 2018). Yet, research suggests that social media and online participation may have differential mobilization effects that recruit younger groups and women more actively into politics (Oser et al. 2013; Oser and Boulianne 2020). One possible reason is that publics and counterpublics pay attention to different issues on social media (Jackson and Foucault Welles 2015; Shugars et al. 2021). Therefore, our second research question is the following:
RQ2: How do these types of political exposure vary across sociodemographic groups?
We now turn to the methodological challenges and opportunities for making robust inferences about the political exposure of citizens on social media.
Measuring Political Exposure in the Digital Age: Challenges and Opportunities
Although survey data have long been a leading source of information about habits of political consumption, researchers are actively looking for ways to improve their accuracy (Berinsky 2017; Guess 2015). In the context of social media, prior work showed that there could be large discrepancies between actual and reported frequency of posting about politics (Guess et al. 2019; Henderson et al. 2021).
Digital trace data provide new and complementary ways to measure individuals’ behavior directly, often collected through dedicated software installed by participants (Flaxman et al. 2016). For example, Guess (2021) uses web browsing data combined with survey responses to characterize Americans’ media consumption habits and examine whether internet use indeed facilitates selective exposure to like-minded views. While this approach provides the most comprehensive picture of both objective and subjective measures of political engagement, it is often limited to a few thousand participants who are willing to volunteer their data. In addition to selection issues, the sample quickly becomes statistically underpowered for obtaining accurate descriptions of subgroups and heterogeneity of activity (Hughes et al. 2021). This challenge of directly measuring political exposure for the field as a whole is clearly identified in Amsalem and Zoizner’s (2022) observation in their comprehensive meta-analysis of learning about politics on social media that most relevant studies do not include any direct measure of political exposure, and also lack sufficient sample size to estimate heterogeneous effects.
A recently developed alternative approach for directly gathering data on individuals’ behavior is to use publicly available social media data. Despite meaningful changes in Twitter’s leadership and policies beginning in 2022 (Anderson 2023), it has been a uniquely important social media platform for investigating exposure to political content of a large sample of users due to the active engagement of media outlets and political figures on the platform up through and including the observation period of the current study (Bail et al. 2018; Barberá 2015; Eady et al. 2019; Guess 2021). In 2021, around one in five Americans (23%) reported using Twitter (Odabaş 2022), and almost seven in ten of them said they receive their news regularly through the platform (Mitchell et al. 2021). While Twitter users in the U.S. were found to be younger and more likely to be Democrats in comparison to the general public (Wojcik and Hughes 2019), prior work has shown that differences between Twitter users and nonusers are due mostly to the demographic composition of social media users, which can be addressed by controlling for a few demographic variables (Mellon and Prosser 2017). Importantly, rigorous empirical work on the representativeness of Twitter users shows some modest demographic differences between Twitter users and the general population that can be accounted for analytically (Hughes et al. 2021).
This recently developed methodological approach of analyzing publicly available social media data is an important contribution to extant literature, as no social media platform currently offers public access to data about individuals’ exposure to distinctive types of political content, and as a result hardly any research has directly measured it. An increasingly prominent approach for approximating exposure involves the collection of content posted by accounts followed by the focal user on social media (Eady et al. 2019; Grinberg et al. 2019). As described in Grinberg et al. (2019), this approach does not guarantee exposure, that is, that an individual actually saw a particular post, but it does speak directly to the content available to people in their social feeds from their ego-network.
Building on this discussion of the current state of the art in research on political exposure, the following section details how the current study applies this recently developed novel methodological approach to measuring potential political exposure.
Data and Methods
Twitter Panel and Political Exposure
The foundation of this research is a sampling frame of over 1.5 million Twitter users that were successfully matched to public U.S. voter registration records. Following the same approach described in prior work (see Grinberg et al. 2019 and Shugars et al. 2021 for more details), the matching used the Twitter Decahose, a 10% random sample of all tweets, to identify 290 million profiles that posted content between January 2014 and March 2017. The profiles were then matched against voter records provided by TargetSmart in October 2017 for all 50 U.S. states and the District of Columbia. A Twitter account was matched to a voter record if the full names matched exactly and the person was the only one with that name in either the city- or state-level geographic area specified in both datasets. While the reliance on full names and disclosed locations eliminates many fake, automated (bot), and organizational accounts, it does raise concerns about potential selection bias. However, rigorous comparison of this panel with a gold-standard survey conducted by Pew Research Center showed that only small demographic and ideological differences exist between the two samples of registered U.S. voters (Hughes et al. 2021). Importantly, this matched dataset provides comprehensive data on individuals’ social media behavior through Twitter, as well as basic sociodemographic information. Age and gender are retrieved directly from public voter registration records, while race/ethnicity and party affiliation are based on TargetSmart inferences (see validation in Shugars et al. 2021, Supplemental Appendix B).
The primary dataset used in this work is a set of 606,112 panel members for whom we have at least one indication of activity on Twitter during the 2020 presidential election (August to November, inclusive). This set includes all individuals who posted or liked at least one tweet during the four months of the study period, which is important for capturing not only highly active users but also users who rarely or never post on Twitter but still consume content. Our target population is, therefore, restricted to registered U.S. voters who were minimally active on Twitter during the 2020 presidential election, and we make no claims about the important, yet omitted, populations of eligible nonregistered voters or inactive Twitter users. Appendix A in the Supplemental Information file provides sociodemographic information about these active panel members. By focusing on a period of a presidential election, we examine potential political exposure at its peak (Grinberg et al. 2019; Peterson et al. 2021), when it matters, and where most politically relevant actors are likely to be active.
To model potential exposure, we follow the approach used in prior work to approximate individuals’ social feed using the content available from the accounts they follow (Eady et al. 2019; Grinberg et al. 2019). Due to Twitter rate limits, we could not collect all tweets posted by the 51 million users followed by our sample during the observation period, so we instead base our estimates on the 10% random sample of the Twitter Decahose, similar to the approach used in Grinberg et al. (2019). It is important to note that this approach only captures potential exposure and not exposure per se, partially due to the sampling of followees’ content, but more importantly because measuring exposure requires knowledge about user activity and the outcomes of personalized ranking algorithms, two ingredients that are unavailable to the research community. Nevertheless, in lieu of more precise information from social media platforms about exposure, this approach reflects the most accurate and reproducible estimate currently available to the public about the composition of people's social feeds from their ego-network.
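The idea of potential exposure can be illustrated with a minimal sketch. It assumes a follow graph and a list of sampled tweets that have already been labeled as political or not; the function name `potential_exposure` and the data layout are hypothetical and are not taken from the study's code.

```python
from collections import defaultdict


def potential_exposure(follows, sampled_tweets):
    """Count sampled tweets available to each user from the accounts they follow.

    follows: dict mapping user id -> set of followee ids (hypothetical layout).
    sampled_tweets: iterable of (author_id, is_political) pairs drawn from a
        sample of all tweets, such as a 10% sample.
    """
    # Tally total and political tweets per author in the sample.
    per_author = defaultdict(lambda: [0, 0])
    for author, is_political in sampled_tweets:
        per_author[author][0] += 1
        per_author[author][1] += int(is_political)

    exposure = {}
    for user, followees in follows.items():
        seen = [per_author[a] for a in followees if a in per_author]
        total = sum(t for t, _ in seen)
        political = sum(p for _, p in seen)
        exposure[user] = {
            "total": total,
            "political": political,
            # Share of political content among all available content.
            "share": political / total if total else 0.0,
        }
    return exposure


follows = {"u1": {"a", "b"}, "u2": {"c"}}
tweets = [("a", True), ("a", False), ("b", True), ("d", True)]
result = potential_exposure(follows, tweets)
```

Because the counts come from a sample, they understate the full volume of content; any rescaling to full-volume estimates would be a further assumption that this sketch does not make.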
To identify political tweets, we train a machine learning classifier and validate its accuracy against human coders, similar to the approach used in prior work (Bakshy et al. 2015; Eady et al. 2019; Grinberg et al. 2019). The classifier resulted in a precision of 88.8 percent, a recall of 80.0 percent for tweets about U.S. politics, and a recall of 96.4% in the subcategory of election-related tweets. More details about the classifier and its validation are in Appendix B in the Supplemental Information file.
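For readers less familiar with these validation metrics, the following sketch shows how precision and recall are computed when classifier output is compared against human-coded labels. The toy labels are invented for illustration and do not reproduce the study's validation data.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels (1 = political)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)        # correctly flagged political
    fp = sum(1 for t, p in pairs if not t and p)    # flagged, but not political
    fn = sum(1 for t, p in pairs if t and not p)    # political, but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical human codes and classifier predictions for ten tweets.
human = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
precision, recall = precision_recall(human, predicted)
```

Precision is the share of tweets flagged as political that human coders also judged political, while recall is the share of human-coded political tweets that the classifier recovered.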
Identifying Different Actors in Political Exposure
Following the curated flows framework, we examine different types of actors that curate political content for individuals in our panel. We focus on four types of actors directly mentioned in Thorson and Wells’ (2016) framework (media organizations, journalists, politicians, and social peers) and include a fifth category of “opinion leaders” who have been identified as important in recent research. Specifically, OLs have large followings on social media, with nonpolitical OLs attracting even larger followings than political ones (Mukerjee et al. 2022), and a demonstrated ability to influence public opinion (Alatas et al. 2019). To date, however, the share of political content originating from OLs’ accounts in day-to-day political exposure has not been directly quantified.
To identify accounts of different actors, we rely on manually curated lists of accounts by recent academic works, develop methods to identify additional accounts, and validate the accuracy of our inferences and the robustness of results. We identify media organizations by using the list of media organizations in McCabe et al. (2022), which started with a seed list of known media organizations and used snowball sampling to expand it iteratively. We supplement this list with the media organizations listed in Wojcieszak et al. (2022a). We also rely on Wojcieszak et al.’s (2022a) extensive list to identify 1,951 journalists’ accounts.
Politician accounts are identified through an original list we compiled by linking an official list of the names of Members of the 116th Congress (MoC) to a list of MoC accounts on Twitter (Wrubel and Kerchner 2020). Our list of 927 accounts includes both the accounts of MoC and their election campaigning accounts, which is important for capturing all messages originating from politicians during an election cycle. We supplemented this list with 51 additional politician accounts found in Wojcieszak et al. (2022a). Appendix C in the Supplemental Information file details our identification strategy for politician accounts.
To identify OLs, we rely on the manually labeled list of accounts of nonpolitical OLs (e.g., public figures, popular brands, celebrities) by Mukerjee et al. (2022), and extend it using Bail et al.’s (2018) approach of considering as an OL any account followed by fifteen or more active MoC. Since the accounts followed by multiple MoC may themselves belong to media organizations, journalists, or politicians, we use a combination of automatic and manual annotation of accounts using profile information to distinguish OLs from other actor types. Validating this approach using a held-out random sample of accounts showed an accuracy of 80.0 percent, which is considerably higher than random assignment with four categories. Appendix D in the Supplemental Information file details our identification strategy for OL accounts.
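The threshold rule described above can be sketched as follows. This is an illustration only: the function name and inputs are hypothetical, and the simple set exclusion stands in for the combination of automatic and manual annotation that the study uses to separate OLs from other actor types.

```python
from collections import Counter


def opinion_leader_candidates(moc_follows, known_other_actors, min_moc_followers=15):
    """Return accounts followed by at least `min_moc_followers` Members of Congress.

    moc_follows: dict mapping each MoC account -> set of accounts it follows.
    known_other_actors: accounts already listed as media organizations,
        journalists, or politicians (assumed to be excluded by lookup here).
    """
    counts = Counter()
    for followed in moc_follows.values():
        counts.update(set(followed))  # each MoC counts at most once per account
    return {
        account
        for account, n in counts.items()
        if n >= min_moc_followers and account not in known_other_actors
    }


# Toy data: 15 MoC follow "celebrity" and "news_org"; only 14 follow "near_miss".
moc_follows = {
    f"moc{i}": {"celebrity", "news_org"} | ({"near_miss"} if i < 14 else set())
    for i in range(15)
}
candidates = opinion_leader_candidates(moc_follows, known_other_actors={"news_org"})
```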
We consider as a social peer any account that does not appear on any of the aforementioned lists of media organizations, journalists, politicians, or OLs. Importantly, however, the same content can be attributed to multiple people due to the complex nature of retweets, quotes, replies, or mentions. To support different attributions of content and interpretations of the results, we distinguish between direct and indirect exposure. Direct exposure comes from directly following the accounts of media organizations, journalists, politicians, or OLs. Indirect exposure is mediated through social peers who retweet, quote, mention, or reply to a tweet by these actors. To complete our documentation of the analytical work we conducted to identify distinctive actors in political exposure, Appendix E in the Supplemental Information file validates account inferences and robustness, and Appendix F provides summary information about all curating actors analyzed in this study.
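The distinction between direct and indirect exposure can be made concrete with a small sketch. The attribution rules below are one plausible reading of the description above, not the study's implementation: a tweet authored by a listed actor counts as direct exposure to that actor type, and a peer's tweet that retweets, quotes, mentions, or replies to listed actors counts as indirect exposure to each referenced actor type. The label names and the function are hypothetical.

```python
def classify_exposure(author, referenced_accounts, actor_type):
    """Return a list of (actor_type, mode) attributions for one tweet in a feed.

    author: account that posted the tweet into the focal user's feed.
    referenced_accounts: accounts retweeted, quoted, mentioned, or replied to.
    actor_type: dict mapping listed accounts -> "media", "journalist",
        "politician", or "OL"; any account not listed is treated as a peer.
    """
    if author in actor_type:
        # The focal user follows the listed actor directly.
        return [(actor_type[author], "direct")]
    referenced_types = sorted(
        {actor_type[a] for a in referenced_accounts if a in actor_type}
    )
    if referenced_types:
        # A social peer relays or references content from listed actors.
        return [(t, "indirect") for t in referenced_types]
    # Original content from a social peer with no listed actor involved.
    return [("social_peer", "original")]


actor_type = {"@news_outlet": "media", "@senator_x": "politician"}
direct = classify_exposure("@news_outlet", [], actor_type)
indirect = classify_exposure("@friend", ["@senator_x"], actor_type)
peer_only = classify_exposure("@friend", ["@other_friend"], actor_type)
```

Returning a list rather than a single label reflects the point that the same content can be attributed to more than one actor.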
Measuring Political Alignment
Modeling the ideological leaning of news content and politicians is fundamental to assessing people’s online media diet, and different methods have been proposed for this purpose. Our analysis of political alignment focuses on three aspects of citizens’ political exposure: first, exposure to left- and right-leaning MoC; second, exposure to left- and right-leaning OLs; and third, exposure to political news sites based on the ideological alignment of people who share links to this site.
For Members of Congress, we consider their party affiliation to be representative of their political leaning, excluding four Independents and one Libertarian. For OLs, we infer their political leaning based on the composition of MoC that follow them. For news sites, we follow Bakshy et al.’s (2015) approach by estimating the political leaning based on the political alignment of people sharing links to the website. Appendix G in the Supplemental Information file provides further detail on our estimation of the political alignment of news sites.
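A minimal sketch of the two inferred alignment measures follows. It assumes that each sharing user has an alignment score on a scale from -1 (left) to +1 (right), that a site's alignment is the mean score of the users who share links to it, and that an OL's leaning is the normalized difference between Republican and Democratic MoC followers. The scale, the formulas, and the function names are illustrative assumptions; the exact estimation is described in Appendix G.

```python
from statistics import mean


def site_alignment(shares, user_alignment):
    """Average the alignment of users who shared links to each domain.

    shares: iterable of (user, domain) pairs.
    user_alignment: dict mapping user -> score in [-1, 1] (assumed scale).
    """
    by_domain = {}
    for user, domain in shares:
        if user in user_alignment:  # skip sharers with unknown alignment
            by_domain.setdefault(domain, []).append(user_alignment[user])
    return {domain: mean(scores) for domain, scores in by_domain.items()}


def ol_leaning(following_moc_parties):
    """Leaning from the parties of MoC who follow an account.

    Returns -1 if only Democrats ("D") follow, +1 if only Republicans ("R"),
    and None if no Democrat or Republican follows; other parties are ignored.
    """
    dem = following_moc_parties.count("D")
    rep = following_moc_parties.count("R")
    if dem + rep == 0:
        return None
    return (rep - dem) / (rep + dem)


shares = [("u1", "a.example"), ("u2", "a.example"), ("u3", "b.example")]
alignment = site_alignment(shares, {"u1": -1.0, "u2": -0.5, "u3": 1.0})
leaning = ol_leaning(["D"] * 12 + ["R"] * 4)
```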
Clustering Methodology to Infer Prototypical Types of Political Exposure
We use state-of-the-art clustering methods to identify prototypical types of political exposure on Twitter. To this end, we take into account three categories of information about panel members’ potential political exposure, namely: (i) the overall magnitude of political exposure and its share out of all content from peers, (ii) the curating sources (partitioned by direct and indirect exposure), and (iii) ideological leaning of news sites in the feed. The first two categories directly speak to RQ1, while the third category enables the clusters to capture political alignment of content, due to the importance of this dimension. Appendix H in the Supplemental Information file provides the full list and description of the fifteen features that we analyze that provide measures of these three key categories of political exposure.
To meaningfully identify clusters in this high-dimensional data, we follow the standard practice in machine learning of first reducing the dimensionality of the data (Allaoui et al. 2020; Grootendorst 2022), and only then applying the clustering algorithm. Specifically, we use Uniform Manifold Approximation and Projection (UMAP) to reduce dimensionality (McInnes et al. 2020), and then apply the clustering algorithm of HDBSCAN (McInnes et al. 2017). HDBSCAN is well-suited for our task because it can identify clusters of any geometry (in contrast to algorithms like K-means, which implicitly favor compact, roughly spherical clusters), and because clusters are formed based on density, starting with the densest areas first and then splitting or pruning sparser sub-areas (see McInnes et al. 2017 for more details). This approach results in clusters that capture the more common types of political exposure in our sample and is robust to outliers.
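To illustrate what density-based clustering with outlier detection does, the sketch below implements plain DBSCAN in pure Python. This is a deliberately simpler relative of HDBSCAN, not the algorithm used in the study: it uses a single fixed radius (`eps`) instead of a hierarchy of densities, and it omits the UMAP step entirely. The parameter values and toy points are assumptions chosen for illustration.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_samples):
    """Plain DBSCAN: return a cluster id per point, or NOISE (-1) for outliers."""
    labels = [None] * len(points)

    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not dense
                continue
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_samples:
                queue.extend(nbrs)  # dense point: keep expanding the cluster
    return labels


# Two tight groups and one isolated point in a toy two-dimensional space.
points = [(0, 0), (0, 0.1), (0.1, 0), (0.1, 0.1),
          (5, 5), (5, 5.1), (5.1, 5), (5.1, 5.1),
          (20, 20)]
labels = dbscan(points, eps=0.3, min_samples=3)
```

As in the approach described above, points in sparse regions are labeled as outliers rather than forced into a cluster, and the number of clusters is not specified in advance.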
Results
In this section, we report results regarding our two research questions: (i) What are the prototypical types of political exposure on Twitter? and (ii) How does the distribution of these exposure types vary across distinctive sociodemographic groups?
In order to identify robust patterns of political exposure, users that did not meet a minimum threshold of political exposure were assigned to a separate cluster of “nonpolitical” users. Consistent with prior research, we set this threshold at one observed political tweet a day on average in the Decahose, that is, a total of 122 observations throughout the entire election period (Grinberg et al. 2019). Based on this criterion, 8.9 percent of the population was directly assigned to the nonpolitical cluster.
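The threshold of 122 observations follows from the length of the study period, as the short sketch below shows. It assumes the period runs from August 1 through November 30, 2020, matching the description "August to November, inclusive", and that a user with fewer than 122 observed political tweets counts as not meeting the threshold; the function name is hypothetical.

```python
from datetime import date

# Assumed study period: August 1 to November 30, 2020, inclusive.
start, end = date(2020, 8, 1), date(2020, 11, 30)
days_in_period = (end - start).days + 1  # one observed political tweet per day


def is_nonpolitical(observed_political_tweets, threshold=days_in_period):
    """True if a user falls below the minimum level of political exposure."""
    return observed_political_tweets < threshold
```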
Clustering the political exposure of users with a minimal amount of exposure to politics resulted in seven clusters that cover 99.1 percent of the population, 0.4 percent of accounts that the algorithm identified as outliers, and three small clusters with several hundred people that together amount to 0.5 percent of the population. Hereafter, we omit these outlier and small-cluster accounts from further analysis and focus on the core exposure patterns identified in 99.1 percent of the population.
Figure 1 presents the prototypical types identified by clustering the political exposure of panel members. Each point in Figure 1A represents an individual and their political consumption in the reduced two-dimensional space computed by the UMAP algorithm, with its color designating its cluster assignment. Points that are closer together represent individuals with similar properties of political exposure. Figure 1B shows the median amount of political exposure available in people’s ego-networks and its share out of all content available to people on Twitter, for each cluster separately. For example, the cluster referred to as “media superconsumers” consists of 4.7% of the population, and this cluster’s median user has nearly 6,000 political tweets available to them each day, which comprises 52 percent of their total daily tweet exposure. Cluster labels reflect our assessment of the distinctive features of each cluster in terms of size in the population, the amount of political exposure it represents, and the composition of political curators and sources in the cluster. In particular, we labeled the clusters as follows: one cluster as nonpolitical due to low level of exposure to politics (not meeting our minimal threshold), one cluster as OL Oriented due to an elevated level of exposure to opinion leaders, one cluster as average consumers based on its large share in the population (50.5%), two clusters as partisan due to the political alignment of content, and three clusters based on elevated levels of media consumption of increasing degrees.

Figure 1. Prototypical types of individual political exposure.
Figure 1 provides two key observations. First, we observe that the bulk of the population has a meaningful share of politics in their Twitter feeds. This finding is consistent with prior work that shows that Twitter users are above average in their political engagement (McClain et al. 2021). Except for the two clusters with the lowest share of politics (Nonpolitical and OL Oriented), all other clusters, which account for nearly 90 percent of the population, have 8 percent or more of politics in the feed. Even if this finding only applies to registered U.S. voters on Twitter and to a lesser extent to other social media platforms, it presents a picture of an engaged public during an election cycle. Second, we observe that the Partisan Left and Partisan Right clusters exhibit very similar levels of political consumption to one another, and that the media-oriented clusters, which combine to 15–24 percent of the population, have a larger share of politics in the content available from social peers than the partisan clusters. Next-step research can use these findings to investigate the causal relationship between the overall level of direct and indirect media exposure and subsequent attitudes and behaviors such as ideological polarization.
In addition to the overall level of exposure to politics, the clusters we identified vary in the composition of political exposure from distinctive curating actors. Figure 2 shows the breakdown of political exposure by different actor types including media organizations, journalists, politicians, OLs, and social peers. Lighter-colored bars indicate indirect exposure, where the focal user received content from a peer that referred to a media organization, journalist, politician, or OL (i.e., through a “retweet”). For example, the group of average consumers receives more than four times as much indirect exposure (22.8%) as direct exposure (5.2%) to politicians. In stark contrast, the Media Superconsumers group receives nearly 90 percent of its political exposure directly from media organizations, with hardly any indirect exposure. A similar pattern appears for the OL Oriented cluster, which gets more than 70 percent of its political exposure directly from OLs. Figure E2 in Appendix E in the Supplemental Information file provides an alternative ordering of the bars, first by direct and indirect exposure, then by actor type.
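The direct and indirect shares described above can be illustrated with a small sketch. It assumes each political tweet available to a user is tagged with the type of actor it originates from and a flag indicating whether it arrived through a peer (indirect) or from the actor's own account (direct). The tuple layout, actor names, and toy counts are hypothetical, chosen only to show the arithmetic.

```python
from collections import Counter


def exposure_breakdown(tweets):
    """tweets: iterable of (actor_type, via_peer) pairs, where via_peer is
    True when the content reached the user through a peer (indirect).
    Returns {(actor_type, "direct" | "indirect"): share of all tweets}."""
    counts = Counter(
        (actor, "indirect" if via_peer else "direct")
        for actor, via_peer in tweets
    )
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


# Toy feed (hypothetical): politicians reach this user mostly through peers.
toy_feed = (
    [("politician", True)] * 4
    + [("politician", False)] * 1
    + [("media", False)] * 5
)
shares = exposure_breakdown(toy_feed)

# Ratio of indirect to direct exposure to politicians.
ratio = shares[("politician", "indirect")] / shares[("politician", "direct")]
```

Applied to the reported figures for average consumers, the same ratio is 22.8 / 5.2, or roughly 4.4, which is the "more than four times" comparison in the text.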

Figure 2. The composition of political exposure across clusters.
Focusing again on the six clusters representing the bulk of the population (average, partisans, and media consumers; nearly 90%), Figure 2 provides three key observations. First, it shows that more than half of political exposure for these clusters comes from traditional sources of political information (media organizations, journalists, and politicians) and that this share increases with the share of politics in the feed (reflected in cluster ordering from left to right as shown in Figure 1). Second, the clusters also vary considerably in terms of direct and indirect exposure. Nonpolitical consumers are only indirectly exposed to traditional sources, while average consumers get most political exposure indirectly, through social peers and not directly from traditional sources. Partisans have the largest share of political exposure directly from politicians and journalists, and relatively little direct exposure to media organizations compared to the media-oriented clusters. Finally, leaving aside the more extreme superconsumers, we see that the Media Oriented and Media Oriented++ clusters, which combine to nearly 20 percent of the sample population, get about half of their political exposure directly from media organizations. Taken together, these findings highlight the importance of considering both direct and indirect exposure to traditional sources as well as OLs, particularly for people who receive a smaller share of political content.
We now turn to our second research question, which focuses on how different sociodemographic groups engage with different types of political consumption. Figure 3 shows how age, gender, race/ethnicity, and party affiliation (y-axes) are distributed across the different exposure types (x-axis; following the same order of increasing share of political exposure from left to right). Specifically, the figure shows the average age estimate for each cluster, and the percentage of women, Caucasians, and registered Democrats in each cluster (see Appendix A in the Supplemental Information file for further detail on sociodemographic characteristics). The dashed horizontal line in each panel designates the sample average as a baseline for comparison.
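As a rough illustration of the comparison shown in Figure 3, the sketch below computes a cluster-level profile (average age and percentage of women) alongside the sample-wide baseline that the dashed line represents. The record fields, function names, and toy values are assumptions made for this example; the study's actual sociodemographic estimates are described in Appendix A.

```python
from collections import defaultdict
from statistics import mean


def profile(rows):
    """Average age and percentage of women for a list of member records."""
    return {
        "mean_age": mean(r["age"] for r in rows),
        "pct_women": 100 * mean(1 if r["is_woman"] else 0 for r in rows),
    }


def cluster_profiles(members):
    """Return (per-cluster profiles, sample-wide baseline profile)."""
    by_cluster = defaultdict(list)
    for record in members:
        by_cluster[record["cluster"]].append(record)
    profiles = {name: profile(rows) for name, rows in by_cluster.items()}
    return profiles, profile(members)


# Toy panel (hypothetical values, two clusters of two members each).
toy_members = [
    {"cluster": "A", "age": 30, "is_woman": False},
    {"cluster": "A", "age": 40, "is_woman": True},
    {"cluster": "B", "age": 60, "is_woman": True},
    {"cluster": "B", "age": 70, "is_woman": True},
]
profiles, baseline = cluster_profiles(toy_members)
```

Comparing each entry of `profiles` with `baseline` corresponds to reading a cluster's value against the dashed sample-average line in each panel.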

Figure 3. Sociodemographic characteristics among different political exposure types.
Figure 3 provides several key observations. First, there is a clear positive association between the average age and the share of political exposure of the clusters, as the literature generally predicts at the individual level (e.g., Verba et al. 1995). Figure I1 in Appendix I in the Supplemental Information file further supports this using the full age distributions. If our cluster-level findings persist at the individual level, this would suggest that age is linked with different overall amounts of political exposure and a different composition of curating actors. It should be noted that cluster- and individual-level results may not be aligned due to within-cluster heterogeneity. Second, the figure shows meaningful gender and race/ethnicity differences between Partisan Right and Partisan Left. The fact that the two partisan clusters have different demographic characteristics, yet a similar breakdown of actors in their political feeds, suggests that there may be some commonalities in the polarization processes across political ideology. Third, the OL Oriented cluster is distinctively young, male, and non-Caucasian. Together with its small size in the sample (1.7%) and overrepresentation of OLs, this seems like a niche cluster that gets exposed to politics incidentally through nonpolitical OLs.
Finally, we find that the Media Oriented and Media Oriented++ clusters, which together account for nearly 20 percent of the sample population, have significantly higher percentages of women, registered Democrats, and older adults. Prior work has documented a partisan gender gap in American politics (Doherty et al. 2018), with women more likely to identify as Democrats. However, to the best of our knowledge, no prior work has shown such large gender differences in political consumption directly from media organizations.
Discussion
Much of the discussion about societal factors that may be contributing to democratic backsliding in advanced democracies—including rising populism, decreasing trust in media and political establishment, increased polarization, and misinformation—has been linked to the increased prevalence of digital media, and in particular, to social media. Social platforms are, indeed, widely adopted as a source of political information and a primary source for many young adults. These trends in political content exposure call for a better theoretical understanding of political exposure on these platforms, including next-step causal examination of the impact of different types of political exposure on subsequent political attitudes and political behaviors. Robust analysis of these phenomena requires new computational methods for making valid inferences based on digital trace data that complement traditional methods.
Grounded in the curated flows theoretical framework, this work contributes to the conceptualization and measurement of actors responsible for this curation. The empirical findings describe the types of actors that are responsible for political content distribution to registered U.S. voters on Twitter, and the demographic characteristics of distinctive types of political consumers. We found that the bulk of the population on the platform was exposed to non-negligible amounts of political content during the 2020 U.S. Presidential election, ranging on an average day from eighty-seven political tweets (8% of the overall feed) to a few thousand political tweets (52% of the overall feed). Notably, more than half of political tweets originated from traditional sources of political information—media organizations, journalists, and politicians. The observational findings of the current study pave the way to investigate the causal impact of political content curation by distinctive actors on people’s subsequent attitudes and behaviors, such as left–right ideological polarization and affective polarization.
Another key finding is that media organizations are an important source of political information for a large proportion of the sample, with much of this exposure taking place directly and without any mediation by peers. These findings contribute to the debate about the erosion of traditional gatekeepers, as most media organizations on our lists have, fundamentally, the same editorial processes that Kurt Lewin (1943) wrote about when he first introduced Gatekeeping theory. Our results show that a substantial number of modern consumers of political content on Twitter choose to replicate traditional gatekeeping in new media. Future research could investigate the curation roles and impacts that these media-oriented individuals have on their local network and examine the role of media organizations in influencing subsequent political attitudes and behaviors of specific sociodemographic groups.
Along with these contributions, this research has several important limitations previewed earlier in the study. First, while the findings are likely to capture political exposure of American adults on Twitter in 2020, who made up about a fifth of American adults (Odabaş 2022), without direct measurement on other platforms and populations it is unclear how these findings will generalize. On the one hand, previous research has found some similar media effects for Twitter and the more widely used Facebook (e.g., Valenzuela et al. 2018). On the other hand, numerous studies have emphasized the importance of considering specific contextual features in the relationship between social media use and political behavior (e.g., Vaccari and Valeriani 2021). Additional comparative research is needed to fully contextualize these findings. A second key limitation is the empirical focus of our analysis on potential political exposure, meaning content that is available to people and not necessarily the content that is actually seen by them. Although this is a limitation that affects all scholarship on these topics, it is important to note that the difference between these two sets of content may be systematically biased by factors such as the time when individuals visit their feeds, the duration of their visits, and the algorithmic content ranking conducted by social platforms. A third key limitation is that since we relied on manually curated lists and verification for identifying distinctive curation actors (e.g., media organizations and OLs), we cannot guarantee the comprehensiveness of the lists. For example, the list of politicians does not include state and local politicians, who may have different levels of exposure and audiences.
There are also several avenues for future work to expand this research. In terms of theory, the curated flows framework puts much of its emphasis on the actor who is doing the curation. Our study shows that there is room to expand the theory to consider the producer of the content in addition to the person who curates it as it propagates through the network. Content attribution is also a major challenge that calls for methodological contributions. Furthermore, future research can examine how the different types of political content exposure are related to pro-democratic attitudinal measures known to be crucial for robust democratic functioning, such as political knowledge and political efficacy. Future research can also explore how exposure varies along other sociodemographic dimensions such as educational attainment or socioeconomic status. In addition, the current study paves the way for next-step experimental research that can causally identify how different types of political consumers engage in and mobilize to political action both online and offline.
Supplemental Material
sj-docx-1-hij-10.1177_19401612231213291 – Supplemental material for Who Is Curating My Political Feed? Characterizing Political Exposure of Registered U.S. Voters on Twitter
Supplemental material, sj-docx-1-hij-10.1177_19401612231213291 for Who Is Curating My Political Feed? Characterizing Political Exposure of Registered U.S. Voters on Twitter by Assaf Shamir, Jennifer Oser and Nir Grinberg in The International Journal of Press/Politics
Footnotes
Acknowledgements
We would like to acknowledge the helpful comments received from Diyi Liu, Eran Amsalem, Patrícia Rossini and participants of the 8th IJPP conference, the 2022 Zurich Digital Democracy workshop, and the 2022 American Political Science Association, Political Communication pre-conference. Other valuable colleagues contributed insightful comments including Alon Zoizner and Oren Tsur. Finally, we would like to thank the anonymous reviewers who helped crystalize and articulate more clearly the core contributions of this work.
Authors’ Note
The research was approved by BGU’s Departmental Ethics Review Board (approval #SISE-2022-32).
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Funded by the European Union (to Oser: ERC, PRD, project number 101077659); by Ben-Gurion University Data Science Research Center’s Young Faculty Interdisciplinary Grant 2021–2022 (to Grinberg and Oser as co-PIs); by Ben-Gurion University’s Interdisciplinary Fellowship for Outstanding Graduate Students 2021–2023 (to Shamir); and by the Israel Science Foundation (to Oser, grant number 1246/20).
Data and Materials Availability
All data and code necessary to evaluate the final conclusions in the paper are available in a public GitHub repository at https://github.com/Socially-Embedded-Lab/pol-exp-curators and in Harvard Dataverse at https://doi.org/10.7910/DVN/OP9JQT. Although the Twitter panel was originally constructed by linking two public resources (public voter record and public Twitter activity), we do not release any personally identifying information as this may expose individuals to additional unanticipated risks, violate people’s privacy expectations, and breach Twitter’s Terms of Service.