Abstract
Status is central to understanding collaborative behavior, yet it is often difficult to measure in cultural fields where perceived standings are only partially observable. This study develops a scalable supervised machine learning approach to infer directed deference in collaboration networks using a partially observed status hierarchy derived from a ritualized site of status conferral (a televised competition series). Drawing on a longitudinal “featuring” network of more than 3,000 South Korean hip-hop artists, we train a classifier to learn how differences in status-relevant characteristics map onto observed deference patterns and then use it to estimate preferential attachment across all collaboration dyads. The resulting measure aligns closely with external expert assessments of artists’ relative standing. Applying this metric to streaming performance data, we show that collaboration improves listener engagement and that its effect varies nonlinearly with status distance: artists benefit both from partnering with higher-status collaborators and from featuring emerging talents.
Introduction
Ever since Max Weber conceptualized status as originating from social evaluations, a fundamental sociological insight has been that actors are differently positioned in the social structure and that rewards are largely a function of status position (Gould 2002; Simmel 1950; Weber 1922). Status accumulates through acts of deference among market participants and is typically measured by an actor's affiliation with credited sources or third-party standings—a perceived social construct correlated with but distinct from intrinsic quality (Podolny 1993; Sauder, Lynn, and Podolny 2012). Empirical research has established the unique role of status in shaping perceptions of the actor and thus the flow of payments, resources, and opportunities available to them (Benjamin and Podolny 1999; Ertug and Castellucci 2013; Kovács and Sharkey 2014; Podolny 1993; Podolny and Phillips 1996; Rossman, Esparza, and Bonacich 2010; Rossman and Schilke 2014; Stuart 1998). The basic argument is that status functions as a heuristic for audiences to assess quickly the quality of products and behaviors and as a warranty for actors to reduce coordination costs in forming exchange relations with other partners.
Yet, status remains difficult to measure, particularly in fields without established formal rankings, affiliations, or award systems. In such contexts, researchers often lack a full sampling frame for perceived relative standings, making it difficult to observe status hierarchies and identify deference. This article addresses a core methodological challenge: when status is partially observed, how can researchers quantify directed deference among collaborators? Work in the sociology of science, for example, has shown that junior scholars benefit from the prestige of senior collaborators or advisors through co-authorship and academic lineage (Azoulay, Graff Zivin, and Wang 2010; Li and Agha 2015; Maliniak, Powers, and Walter 2013; McDowell and Smith 1992; Merton 1968; Zuckerman 1977), demonstrating that status can be relationally transmitted and strategically leveraged. However, these studies typically rely on well-documented institutional hierarchies and formalized career paths. In contrast, cultural producers operate in informal networks where collaboration itself may signal deference, endorsement, or aspiration, but where status positions are harder to observe.
We propose a supervised machine learning (ML) strategy for recovering status hierarchies and measuring directed deference in a collaboration network. Our empirical setting is the South Korean hip-hop scene, where collaboration, typically in the form of “featuring” other artists on tracks, is widespread, and artists’ performance is clearly indexed as collaborative or individual. These featuring ties often reflect implicit status judgments, as artists selectively invite collaborators whose perceived standing may enhance or complement their own. To capture these dynamics, we leverage a partially observed status hierarchy derived from a long-running televised hip-hop competition, a ritualized site of status conferral that has played a central role in shaping public recognition and career trajectories in the field. Even with a partial status hierarchy, ML can learn how actors’ quality and credential covariates, which are strong correlates of status position, differ across perceived (labeled) status levels. Once trained on balanced data, the model can be used to predict preferential attachment in featuring dyads whose characteristics were likewise observed but whose status levels were not perceived.
Our research design, combined with fine-grained two-year longitudinal data from the country's largest music streaming platform, allows us to theorize preferential attachment in collaborative cultural production and test its performance consequences. We find that our preferential-attachment measure, scaled from status conferrals qualitatively observed in a competitive media format, closely corresponds to external expert ratings from rappers, critics, and fans collected through a status position survey. Using this novel metric, we test whether collaboration enhances artists’ ability to attract user likes and how this effect varies with the degree of preferential attachment. Contrary to the usual expectation that artists benefit from piggybacking on a higher-status collaborator's fame, we find a U-shaped effect of preferential attachment on performance: not only featuring a higher-status artist but also “picking up” emerging talents elicits positive listener ratings. Our scalable measurement approach, which combines digital trace data, inference from influential media-based status conferrals, and collaboration networks, offers a framework for revisiting organizational and cultural questions about strategic collaboration that have been constrained by traditional data limitations.
Theoretical Background
How Status Distinctions Have Been Identified in Research
Status is broadly understood as the position in a social hierarchy resulting from accumulated acts of deference (Goode 1978; Podolny and Phillips 1996; Sauder et al. 2012; Whyte 1943). In this respect, status can be understood as the stock corresponding to the flow of deference. Joel Podolny's status-based model of market competition argues that a producer's status shapes its competitive advantages and constraints (Podolny 1993). The model conceptualizes status as a market signal. That is, status reflects how market participants perceive a producer's offerings, with varying levels of deference attached to competitors, rather than an objective measure of product quality.
What are the acts of deference that give rise to status distinctions? Actors rely on affiliation and arbiters. First, actors acquire status by affiliating with credentials of status-valued resources. Associations and affiliations are observable indicators of attachment or “gestures of approval.” For instance, Benjamin and Podolny (1999) measure status by investigating how California wineries affiliate with appellations. By deferring to a particular appellation, wineries signal that their wine is of a quality consistent with the wine of other producers who also affiliate with that same appellation. A winery that continually affiliates with high-status regions will have higher status than a winery that persistently affiliates with low-status regions. They measure status ordering by Bonacich's centrality of cross-regional citations, which indicates an appellation's status position relative to that of other appellations. This cross-regional citation behavior is claimed to be correlated with, yet distinct from, wineries’ quality measured by expert ratings from magazines (Benjamin and Podolny 1999). Likewise, scholars have focused on affiliations manifested in exchange relations, including patent citations in the semiconductor industry (Podolny and Stuart 1995; Stuart 1998), painters’ reputations inferred from the centrality of art galleries in the co-exhibition network (Fraiberger et al. 2018), and PhD exchanges among academic departments (Burris 2004). Researchers also often leverage status-ordered documents, such as the placement of investment banks in the tombstone advertisements of security offerings (Podolny 1993, 1994; Podolny and Phillips 1996), or measure star power by centrality in the network of screen credits (Rossman et al. 2010).
Second, deference is often granted through third-party judgments. Arbiters making critiques, reviews, or quality evaluations are usually outsiders to the status hierarchy of focal actors; for example, restaurants are rated by Michelin and universities by the QS ranking. An influential study by Rao (1994) examines how certification contests shape organizational status, demonstrating that third-party assessments (such as Moody's ratings for insurance firms, Michelin and AAA rankings for restaurants, and J.D. Power's evaluations of automobiles) play a pivotal role in shaping public perceptions. These external evaluations influence how organizations are positioned in status hierarchies, ultimately affecting their likelihood of survival in competitive markets. External arbiters of this kind often materialize deference through the conferral of awards. Scholars have captured the effect of status distinctions by examining prestigious book prizes (Kovács and Sharkey 2014), the Oscar awards in the film industry (Perretti and Negro 2006; Rossman et al. 2010; Rossman and Schilke 2014), Grammy awards and nominations (McMillan 2022; Negro, Kovács, and Carroll 2022), and awards of museums and galleries in the contemporary art field (Ertug et al. 2016).
We extend this line of research by addressing two lacunae. The first concerns empirical scope. How can we identify acts of deference (and thereby the status hierarchy) when sources of status are not fully available for all actors? When status is measured by affiliations, the empirical scope has been limited to special cases where a “sampling frame” is known: typically, “written” directories or credentials from a population of organizations. But this is often not the case, particularly when we scale down the analytic level from organizations to individuals. Entrepreneurs do not necessarily register themselves on a list, and the behaviors through which they show deference are more implicit and harder to observe. When status is measured by deference granted by arbiters, the external evaluation or award systems are inherently left-censored, that is, selective towards successful actors, especially in a winner-take-all market. Further, award winners in popular culture are too few: the discrete classification of awardees and non-awardees might mask the gradation of status differences between actors, not to mention that the standards for awards are sometimes decoupled from the standards of public fame (e.g., aesthetics vs. popularity).
The second research gap pertains to actors’ status-building processes through direct collaboration. While prior studies have documented how actors cite or affiliate with status-valued resources (Stuart 1998), and how team composition affects recognition (Rossman et al. 2010), these approaches often treat collaboration as a proxy for deference rather than a mechanism of status transfer in its own right. Yet collaboration can itself be a strategic act of deference, where the joint product signals endorsement, competence, or prestige. Co-authorship in academia exemplifies this: junior scholars gain visibility and credibility by working with elite mentors (Merton 1968; Zuckerman 1977), and their scholarly impact often depends on the status of their collaborators (Maliniak et al. 2013; McDowell and Smith 1992). The death of a prominent scientist has been shown to reduce the productivity and citation impact of junior collaborators (Azoulay et al. 2010), while co-authorship with high-status figures increases the likelihood of grant success and publication (Li and Agha 2015). These studies demonstrate that status can be relationally transmitted and strategically leveraged in institutionalized academic settings. However, less attention has been paid to how similar processes unfold in cultural fields where status is informal, fluid, and negotiated through collaborative outcomes. Our study extends this literature by examining how status hierarchies are enacted and reproduced through collaboration in contexts where formal markers are absent and deference must be inferred from relational patterns.
An ML Application for Dyadic Deference in Collaboration
We propose applying supervised ML to address the problems of limited observability and collaboration as acts of deference (Figure 1). This research design presumes that researchers have partially observed a status hierarchy from a representative arbiter and that an actor, i, invites another, j, to co-perform, moving across the ordered levels of status (Figure 1A). The key empirical focus rests on the direction of the deference dyad created by the invitation to a co-product: upward or downward (Yi). Yi is straightforward to determine for the actors, iL, whose status is labeled by the external arbiter, but not for the other actors, iU, whose status is not labeled that way. Without the ML research design, one might simply rely on observable proxies, such as quality indicators and differences between actors, and label iU accordingly. However, this approach would overlook the essence of status as a social evaluation, which is somewhat distinct from quality and requires a certain degree of consensus within the field.

Figure 1. Research design applying supervised machine learning to classify the direction of dyadic deference in collaboration from a partially observed status hierarchy.
A core innovation is to leverage supervised machine learning to recover the direction of deference of a collaboration tie for iU by learning how the direction of deference was determined as a function of the difference in quality and credential covariates.
One can use ML algorithms such as random forests to train and obtain the most accurate model.
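One way to write this design down is sketched below, borrowing the notation that appears later in the article (Yi for the deference direction, Ci,j for the dyadic covariate difference, Dl for the labeled data, and pi,k for predictions on unlabeled pairs). The covariate vector $X_i$, the function $f$, the loss $\ell$, and the function class $\mathcal{F}$ are notational assumptions introduced here for illustration; they are not the authors' own formulas.

```latex
\begin{align*}
C_{i,j} &= X_i - X_j
  && \text{(difference in quality and credential covariates)} \\
Y_i &= f(C_{i,j}), \quad Y_i \in \{\text{upward}, \text{downward}\}
  && \text{(observed for labeled actors } i_L\text{)} \\
\hat{f} &= \arg\min_{f \in \mathcal{F}} \sum_{(i,j) \in D_l} \ell\bigl(Y_i, f(C_{i,j})\bigr)
  && \text{(training, e.g., with a random forest)} \\
p_{i,k} &= \hat{f}(C_{i,k})
  && \text{(prediction for unlabeled actors } i_U\text{)}
\end{align*}
```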
The predicted deference in collaboration captures a new dimension of the behavioral effects of status. We use the term “preferential attachment” to conceptualize how status shapes deference in collaboration. Preferential attachment in network science suggests that the more connected a node is, the more likely it is to receive new links (Barabási and Albert 1999). For instance, when choosing between two webpages, one with twice as many links as the other, about twice as many people link to the more connected page. Likewise, actors are more likely to invite and collaborate with those positioned higher in the status system because of their visibility. High-status actors would often be chosen as referents with more connections with peers and are subsequently preferred by these peers to collaborate again—a relational process that leads to accumulated advantage or the Matthew effect (DiPrete and Eirich 2006; Merton 1968).
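The webpage example corresponds to the linear attachment rule of the Barabási and Albert model, in which the probability that a new link attaches to node $i$ is proportional to its current degree $k_i$. A short worked version, with $\Pi$ used here as shorthand for that probability:

```latex
\[
\Pi(k_i) = \frac{k_i}{\sum_j k_j},
\qquad
\frac{\Pi(2k)}{\Pi(k)} = \frac{2k / \sum_j k_j}{k / \sum_j k_j} = 2 .
\]
```

A page with twice as many links is therefore expected to attract new links at twice the rate, whatever the total number of links in the network.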
Our claim is that preferential attachment in collaboration creates advantages that are distinct from the advantages rooted in one's own status position. While the status effect means that the individual position occupied in the status hierarchy leads to differential returns on performance (Baum and Oliver 1992; Ertug and Castellucci 2013; Podolny 1993; Podolny, Stuart, and Hannan 1996; Rossman et al. 2010; Stuart 1998; Stuart and Ding 2006), the preferential attachment effect suggests that relative status within a collaboration tie leads to differential returns on performance. Even with the same status position and comparable opportunities to collaborate, the direction of deference (the collaborator's relative status) can have different consequences for actors’ performance because audiences may perceive such deference patterns as signals that enhance or undermine the actor's status position. Ties can provide informational cues to audiences about the underlying quality of one or both of the market actors (Podolny 2001).
With regard to the preferential attachment effect, we test two contrasting hypotheses. The first hypothesis is that preferential attachment will positively affect the actor's rewards in the market (H1). A primary mechanism is signaling: ties to high-status actors can serve as signals of quality or identity, especially when such attributes are difficult to assess (Spence 1973). A lower-status actor who engages a more established one on his or her product may signal to audiences that the actor's quality is good enough for the (relatively) higher-status actor to collaborate willingly and thereby endorse it. Such an “upward” tie may also attract the invited actor's existing market base, leading to greater returns for the focal actor. Thus, the farther an actor climbs up the status ladder by collaborating with a higher-status actor, the greater the advantages in performance. Conversely, downward-status ties can potentially harm the reputation of the focal actor. Firm-level studies have reported a fear of associating with less prominent firms (Podolny 2008), and disadvantages have been found among university sports teams associated with lower-status teams (Washington and Zajac 2005).
The second hypothesis is that preferential attachment will manifest as a U-shaped relationship with the actor's rewards in the market (H2). A primary mechanism is visibility, which status distinctions potentially create. Social network research has long observed that similar actors tend to interact (Blau 1977; Homans 1950; McPherson, Smith-Lovin, and Cook 2001). Status homophily likely drives who collaborates with whom because it helps reduce risks and costs, which makes the combination of high- and low-status actors unusual. How would this unusual combination be perceived by audiences? Note that, in H1, we characterize downward status ties as harmful to one's own status. However, it is also possible that audiences interpret the unusual collaboration as actors “picking up” newly entered collaborators whose quality is comparable but who have not yet been sufficiently exposed to audiences. This can occur when parts of the field's status hierarchy are fluid rather than fixed, which is partly why new stars can sometimes emerge: audiences are, to a certain degree, open to them. Thus, H2 differs from H1 in that there will be advantages at both ends of the scale of preferential attachment. In other words, the collaboration will help the least when the direction of deference is ambiguous.
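One conventional way to tell the two hypotheses apart empirically is to add a squared term to a regression of rewards on preferential attachment. The specification below is an illustrative assumption, not necessarily the model the authors estimate; $R_i$ (reward), $p_i$ (preferential attachment), and the $\beta$ coefficients are symbols introduced here.

```latex
\begin{align*}
R_i &= \beta_0 + \beta_1 p_i + \beta_2 p_i^2 + \varepsilon_i \\
\text{H1 (monotone gain):} \quad & \beta_1 > 0,\ \beta_2 \approx 0 \\
\text{H2 (U-shape):} \quad & \beta_2 > 0,\ \text{with the turning point }
  p^{*} = -\frac{\beta_1}{2\beta_2} \text{ inside the observed range of } p_i
\end{align*}
```

Under H2, rewards are lowest near $p^{*}$, which matches the statement that collaboration helps least when the direction of deference is ambiguous.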
Empirical Context
We implement the ML deference classifier and test the preferential attachment effect using a hip-hop collaboration network in South Korea. The nature of hip-hop collaboration provides a unique opportunity to assess the deference relationship in a co-product, wherein actors invite or are invited by alters, and its effect on audiences’ responses. Hip-hop artists do not write or perform music in social isolation. Instead, the genre's proliferation has come with both informal and formal networks of support, rivalry, and collaboration (Gienapp, Kruckenberg, and Burghardt 2021; Halgin, Borgatti, and Huang 2020; McMillan 2022; Smith 2006). Hip-hop artists often publicly feature others on their songs; it is quite common to see “feat.” in the titles of hip-hop tracks. For instance, Dr. Dre's full-length album “2001,” released in 1999, featured other artists on 21 of its 22 tracks. The variety of featured rappers adds different flavors to songs on the same album, which arguably raises the album's overall quality and helps hold listeners’ attention across all 22 tracks. Still, audiences clearly recognize that the cultural product (“2001”) belongs to an individual artist (Dr. Dre) and that he invited other artists to collaborate. It is also likely that audiences draw inferences about the social relationships between Dr. Dre and the featured artists.
The featuring network of hip-hop music raises interesting sociological questions. For young artists, the opportunity to feature on the famous Dr. Dre's album would mean making a name for themselves, thereby jumpstarting their careers and rising to commercial prominence. Dr. Dre might have involved a more established artist in his tracks as an act of deference and perhaps performed better than he would have alone. Artists with no fame, unlike Dr. Dre, would try to release tracks that feature already-established artists (preferential attachment) and score a “hit.” All these speculations involve an implicit status hierarchy that actors in the field commonly have in mind. Our theoretical hypotheses above now turn into an empirical question: How do artists’ status positions shape who features whom in collaboration, and how does preferential attachment in collaboration shape differences in audience reception?
While it is always challenging for researchers to identify hierarchical status positions, the hip-hop scene of Korea 1 offers an analytical advantage for undertaking that task. We leverage the archive of a widely recognized TV audition program series (“Show Me The Money,” SMTM hereafter). A competition reality show open to the public (on average 25K applicants), SMTM has grown in popularity from its first season in 2012 to its eleventh season in 2022, and it is credited with increasing the South Korean public's interest in hip-hop. The qualifiers of an SMTM season are given opportunities to work with their team producers/judges and release songs, many of which become widely popular and achieve a “chart-in.” Most independent hip-hop artists deem SMTM a primary entry point to the music market. The program's judges in each season (about eight artists) are carefully selected and recruited by the program's staff, based on the fame and deference that fans and fellow artists accord them. The judges hold full authority to qualify applicants and produce individual or collaborative tracks for them. The qualifiers eventually make a name for themselves with the public audience and typically carry that honor throughout their careers.
While the format of SMTM slightly varies from season to season, contestants on the show compete in different challenges through several knock-out rounds until only one rapper is left—the winner is awarded ₩100,000,000 (≈ $72,487). A season of SMTM begins by announcing (typically) four teams of multiple “producers” cast by the MNET broadcasting team (Figure 2A). The producers are experienced, popular rappers who already achieved significant success and are thus able to assess and offer advice to participants. They play an “examiner” role in screening out thousands of participants performing a cappella in the first round held in a large indoor arena. They then exercise utter authority to advance or eliminate contestants based on solo rap performances or rap battles in the second and third rounds.

Figure 2. Competition settings of Show Me The Money (SMTM).
To help understand the competition structure, we provide some scenes of the solo-rap contest in Figure 2B and C. A contestant (the one wearing a white shirt in the middle of the circled stage) raps with a self-chosen beat in a one-minute span. The screen below the producers’ seating initially displays “PASS” but can turn to “FAIL” if they decide to reject the contestant. If the contestant ends up finishing it with at least one PASS, then he or she advances to the next round. It partly resembles a journal review process in which the paper under review is rejected if, in this case, all four reviewers give a “fail.” As soon as the four FAIL signs pop up, the beat stops immediately, and the unsuccessful contestant will be surrounded by flames (while rapping) and pulled down to the bottom floor (Figure 2C).
The setting of the program is sociologically interesting as it appears to be a spatial (vertical) representation of status-generating or status-reinforcing processes. Indeed, contestants often express respect and honor to producers while answering questions before performing, although such talks might not factor in for the judgment of producers (they want to be objective and opt out when the contestant belongs to the same crew/label as them). We contend that actors and audiences witnessing all these processes may perceive and embody a status hierarchy between producers (judges), qualifiers, and the unsuccessful.
Team selection is a central feature of the show. Producers recruit team members from participants who have passed multiple preliminary competition rounds, including the aforementioned one-minute solo rapping contest. This stage of team selection often serves as a ritual where judges give deference to participants—sometimes even competing with one another and pleading to secure them for their teams. In this way, selected participants are clearly distinguished from other unqualified ones. Selected team members enjoy several privileges, such as access to catchy beats, top-tier production for their SMTM tracks on their competition stages onward, and performance opportunities with extensive media coverage throughout the competition. Consequently, we define the Qualifier as the subgroup of participants chosen to join a producer's team.
The Qualifiers then start producing songs under the guidance of or in collaboration with their producers. The quality of songs and artists’ live performances, assessed by a diverse set of judges including an “external” group of rappers, university hip-hop associations, or general public audiences, is a major factor in advancing through further stages to the final. These songs are released on the music streaming platforms “on the fly” as a season of SMTM progresses. 2 The popular show attracts a much larger audience than the usual fan base of hip-hop music; therefore, the Qualifiers can make a name for themselves with the public and enjoy (otherwise impossible) high returns from songs that become widely popular and achieve a “chart-in.” That is the main incentive for many “underground” rappers to treat the show as the field's standard (recognition from rappers with authority) and participate in it.
We conceptualize SMTM's competition structure as a site of ritualized status conferral, where symbolic acts of deference, judgment, and selection publicly enact and reinforce hierarchical distinctions in the Korean hip-hop field. Drawing on Goffman's notion of interaction rituals (Goffman 1967) and Collins’ theory of ritual chains (Collins 2004), we treat these performances not as isolated gestures but as components of a patterned social process through which recognition, symbolic capital, and emotional energy accumulate. As Collins writes, “successful rituals generate emotional energy in the form of confidence, enthusiasm, and initiative” (Collins 2004:48)—repeated symbolic interactions such as public praise, deference, and performance validation contribute to the visible concentration of status among contestants and producers. The flaming descent of rejected contestants, the vertical spatial arrangement of producers and performers, and the emotionally charged team selection scenes all serve as ritual mechanisms that encode and reproduce status hierarchies. These field-native status signals offer a culturally grounded alternative to conventional status proxies, enabling scalable inference while remaining embedded in the symbolic logic of the field.
The decade-long history of SMTM is then a partially observed but credible arbiter for us—the Judges, Qualifiers, and Unsuccessful participants of SMTM together comprise the three levels of a status hierarchy. With its advantages over other short-lived spin-off competition programs (see Table S1 in Supplemental Material 1.1), the SMTM archive will serve as a main status source for us to implement the ML workflow for deference in collaboration.
Recovering a stratified deference structure with the use of the field's major competition allows us to consider both status and quality at scale. There are a handful of sociological studies that examined status and collaboration in the hip-hop industry. But their status measurements rely on past demonstrations of quality, such as historical album sales (Halgin et al. 2020) or rare achievements like a Grammy (McMillan 2022). These studies are also limited in the coverage of samples, focusing on a subset of rappers who have released “diss songs” (Halgin et al. 2020) or elite rappers who made Billboard's year-end “Hot 100” R&B/hip hop charts (McMillan 2022). This article advances this body of research by leveraging digital trace data (a music streaming platform) to collect longitudinally the performance of songs produced by a population of hip-hop artists and thus examine status dynamics at positions and career stages as general as possible.
Methods
This study devises a novel research design that recovers deference relationships in a collaboration network of individual cultural producers. Achieving this aim is relevant for many popular culture fields since status determinants conventionally used (organizational affiliation and awards) are typically either unobserved or established only for elites in the winner-take-all market (thereby, most of the actors are “unlabeled”).
Data Sources for Artists’ Quality and Performance
The main data source is Korean hip-hop artists’ longitudinal performance and the metadata of their songs from the largest music platform in Korea, Melon (https://www.melon.com). A South Korean online music store and music streaming service introduced in 2004, Melon has outcompeted Apple Music, Spotify, and YouTube Music to become the most popular music platform in Korea, with over 28 million users. Melon offers a tab in which users can navigate new domestic releases by the album's genre (ballad, dance, rap/hip-hop, R&B/soul, indie, rock/metal, and trot). The genre is chosen by the artist. To include the population of active, current hip-hop artists as completely as possible, beyond a small set of elites, we first created a list of Korean hip-hop artists who had released any song in the rap/hip-hop tab in the previous two years (5,862 artists). Over the following two-year period, from mid-2021 to mid-2023, we conducted weekly web-scraping of these hip-hop artists’ Melon pages, which include past and updated information about their released songs. Our focal dataset includes 3,694 artists who released at least one new song after the beginning of our data collection and thus are eligible for analysis.
The main empirical interest of our study lies in examining how artists of varying quality perform in terms of audience recognition on the Melon streaming platform. The platform's core indicator of audience recognition is how often listeners liked an artist's song(s): the “like” counts 3 displayed for each song on Melon. A logged-in user (who has verified their identity) can click the like button only once, or cancel the like. 4 Our weekly scraping tracked the like counts of both “old” and “new” songs, released before and after the beginning of our study, respectively. We use this indicator to measure artists’ performance and quality. Our main outcome, artist-level performance, is constructed from the like counts of the new songs (released during our two-year study period) using different time windows since release. Fixing time windows enables us to evaluate the recognition of songs produced by artists of varying quality within the same period. This strategy also helps us address the irregular release schedules of artists during the two-year study period (e.g., the as-of-now performance of a song released just a week ago is not comparable to that of another song released a year ago) and account for the fact that listeners’ attention to released songs decays over time.
While performance corresponds to the immediate success of an artist's current cultural product (i.e., the current flow of recognition), we treat an artist's quality as the stock of past recognition, assuming that people cumulatively recall an artist's track record when evaluating their overall caliber. We measure quality as the cumulative like counts that the artist received for all old songs released before the study period, with no lower bound on release date. Benjamin and Podolny (1999) similarly measured a winery's quality using the past performance of its wines (ratings). In the “Streaming” era, platform metrics have become key indicators of not just an artist's reach but also their creative quality, signifying how deeply an artist's work resonates with audiences. For example, magazines and news articles often highlight Spotify streaming counts to emphasize an artist's success or cultural impact. A notable instance is the coverage of Billie Eilish's rise to fame, which frequently cited her Spotify milestones (of course, “cumulative” ones), such as surpassing billions of streams for her hit “bad guy.” 5
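The two measures described above, windowed performance on new songs and cumulative quality on old songs, can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout, the `STUDY_START` date (the text says only "mid-2021"), the 28-day default window, summing across songs, and reading an old song's likes at its first scraped snapshot are all hypothetical choices, not details confirmed by the article.

```python
from datetime import date, timedelta

STUDY_START = date(2021, 7, 1)  # assumed start date; the article says only "mid-2021"

def likes_within_window(snapshots, release_date, window_days):
    """Like count at the latest weekly snapshot within `window_days` of release.

    `snapshots` is a list of (snapshot_date, cumulative_like_count) pairs.
    Returns None when no snapshot falls inside the window.
    """
    cutoff = release_date + timedelta(days=window_days)
    in_window = [(d, n) for d, n in snapshots if release_date <= d <= cutoff]
    if not in_window:
        return None
    return max(in_window)[1]  # tuples sort by date first, so this is the latest

def artist_measures(songs, window_days=28):
    """Return (quality, performance) for one artist's list of song records."""
    quality = 0
    performance = 0
    for song in songs:
        if song["release"] < STUDY_START:
            # Old song: past recognition stock, read at the first snapshot (assumption).
            quality += min(song["snapshots"])[1]
        else:
            # New song: likes accrued within a fixed window since release.
            likes = likes_within_window(song["snapshots"], song["release"], window_days)
            if likes is not None:
                performance += likes
    return quality, performance

# Hypothetical records for a single artist.
example_songs = [
    {"release": date(2019, 3, 1),
     "snapshots": [(date(2021, 7, 5), 1200), (date(2021, 7, 12), 1210)]},
    {"release": date(2021, 8, 2),
     "snapshots": [(date(2021, 8, 9), 40), (date(2021, 8, 30), 90),
                   (date(2021, 9, 27), 150)]},
]
quality, performance = artist_measures(example_songs, window_days=28)
```

Fixing `window_days` is what makes a song released a week ago comparable with one released a year ago: both are read at the same age.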
We also collected the metadata of more than 40,000 albums and more than 200,000 songs from Melon and K Hip Hop Wiki (https://khiphop.fandom.com/). The metadata include active years, the total number of released songs and albums, crew membership, label information, award history, and SMTM participation. The K Hip Hop Wiki provides rich information regarding the history of SMTM participation. We supplement this information by hand-coding the information extensively recorded on the SMTM pages of a Korean wiki source (Namuwiki 6 ). In Melon, artists and songs are indexed by unique identifiers that Melon assigns, and those identifiers are publicly accessible to researchers. This indexing allows us to complete the key task of identifying collaborative songs and, more precisely, the connections among musicians created by “featuring” behavior.
An ML Classifier for Dyadic Deference in Collaboration
A key methodological challenge is to tell the direction of deference among artists who collaborate on a song. As illustrated above, we devise an ML classifier that learns from an observed arbiter of the field's status hierarchies and predicts individuals’ status differences in the population's collaboration dyads. That is, given a collaboration dyad of artists whose deference direction cutting across perceived status levels of the competition show is known (labeled), one can train a model to learn the extent to which differences in quality and credential characteristics in the dyad should exist to classify it as upward or downward. This model can then be used to predict deference direction for collaboration dyads whose quality and credential characteristics were likewise observed, but status levels were not perceived (unlabeled).
Figure 3 presents the overall architecture of deference predictions between labeled and unlabeled data. The supervised ML design begins with the construction of pairs (i, j) of artists from collaborative songs, alongside their individual quality covariates, as the primary input. Deference direction (the outcome label, or target output, Yi for Dl) is labeled based on the ternary status levels from a decade-long archive of the annual competition show (SMTM). Model training uses a random forest algorithm to learn the labeled deference direction, Yi, as a function of the dyadic covariate difference, ci,j. The primary output is the predicted deference for the unlabeled artist pairs, pi,k. Below, we describe these procedures in detail.

Figure 3. The supervised machine learning (ML) architecture.
Given a featured artist j in Dl on i's song k (songi,k ∈ Songi), the combination of the status levels statusi and statusj assigns a discrete indicator of i's direction of deference: upward when j's status level is higher than i's, and downward when it is lower.
We label the former as preferential attachment (PA) and the latter as pick-up (PU). Collaborative songs across equivalent status categories, namely (judge, judge), (qualifier, qualifier), and (unsuccessful, unsuccessful), are not treated as “status homophily” and are excluded from the training set because whether statusi and statusj are equivalent was not part of the pronounced judgments of the external arbiter (SMTM). This results in a total of 777 songs (npa = 457, npu = 320) in Dl. Labeling equivalent deference would reduce accuracy and external validity (see additional reasoning and performance analysis in Supplemental Material 1.5). The unlabeled data Du is the complement set of Dl: 24,009 collaborative songs in which at least one of the constituent artists was not recognizable in terms of the field's status standing (i.e., non-SMTM participants). We offer a concrete example of how we label collaborative songs according to status levels identified in SMTM archives in Figure S1 in Supplemental Material 1.2.
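The labeling rule described here can be sketched in a few lines. This is an illustration under stated assumptions: the numeric ranks in `STATUS_RANK` and the function name `label_dyad` are hypothetical, and the sketch assumes that PA denotes a main artist featuring a higher-status artist while PU denotes featuring a lower-status one.

```python
# Ordinal coding of the three SMTM status levels (numeric values are illustrative).
STATUS_RANK = {"unsuccessful": 0, "qualifier": 1, "judge": 2}

def label_dyad(status_i, status_j):
    """Label main artist i's deference toward featured artist j.

    Returns "PA" (upward), "PU" (downward), or None when the dyad does not
    enter the labeled set: equal status levels, or at least one artist
    without an SMTM status (such dyads belong to the unlabeled data).
    """
    if status_i not in STATUS_RANK or status_j not in STATUS_RANK:
        return None
    rank_i, rank_j = STATUS_RANK[status_i], STATUS_RANK[status_j]
    if rank_j > rank_i:
        return "PA"  # featured artist holds the higher status level
    if rank_j < rank_i:
        return "PU"  # featured artist holds the lower status level
    return None      # equivalent status: excluded from training

labels = [
    label_dyad("qualifier", "judge"),
    label_dyad("judge", "unsuccessful"),
    label_dyad("judge", "judge"),
    label_dyad(None, "judge"),
]
```

Dyads that return `None` either drop out of training (equal status) or are scored later by the trained classifier (non-SMTM participants).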
We do not further differentiate status levels within the qualifiers (recruited SMTM team members). Given the program's structure, competition rounds after team formation often rely on non-individual assessments (typically team versus team) driven by entertainment needs. As a result, evaluating individual distinctions based on team performance may not be entirely accurate. While ultimate season winners across eleven seasons could be considered distinct from the rest of the qualifiers, they form a very small subset of six; some winners later served as judges in subsequent seasons and are already categorized as such in our data. Consequently, even if an additional “Winner” level were introduced, only a small number of collaboration dyads would fall into that category, with minimal impact on the results. More substantively, winners and the rest of the qualifiers show only marginal differences in productivity, quality, and overall performance. This may reflect the fact that, while winners enjoy material and symbolic rewards in the moment, their overall airtime in the program is not significantly greater, given that the competition features many participants throughout its entirety. In an expert interview, a former SMTM star (Hanhae) explained that surviving until the team selection stage and releasing a track is the goal for many rappers, and that there is not much difference between achieving this and going on to become the ultimate winner of a season (see Supplemental Material 1.4).
Quality (past cumulative like counts): The cumulative like counts that the artist received for all old songs, measured one week before the release of the artist's focal new song. Again, our assumption is that, in popular culture, a musician's quality is determined by the degree to which their songs were recognized and accepted by listeners.
Active years: Career years since the release of the artist's first song.
Productivity: The number of new songs and albums released in the study period.
Award history: A discrete covariate indicating whether the artist has won an award.
Recognized crew membership: A discrete covariate indicating whether the artist has ever belonged to one or more crews recognized in K Hip Hop Wiki.
Recognized label/distributor company membership: A discrete covariate indicating whether the artist has ever been part of a label/distributor company recognized in K Hip Hop Wiki.
Top 100 songs: The number of songs that have appeared in the weekly Melon Top 100 chart since 2015, retrieved from a digital archive (https://guyso.me/).
We want to ensure that the hierarchical representation we derive from the SMTM series indeed reflects differences in artists’ characteristics as indicated by these covariates. In Figure S2 in Supplemental Material 1.3, we present the distribution of the covariates for 405 artists who were judges, qualifiers, and unsuccessful participants in SMTM. Generally, artists in higher status positions have longer careers in the field, are better received by listeners, are more productive, and are more likely to have won awards.
Supervised learning assumes that the labeled data are drawn from the same distribution as the unlabeled data. As such, an important assumption of our ML application is that the characteristics of collaborations between artists who produced the collaboration songs in the labeled data do not differ from those in the unlabeled data. The core quantity of our study is the dyadic covariate difference ci,j = f(Xi, Xj) between artists i and j. Similarly to prior work (Huang et al. 2007; Pan et al. 2011), our feature engineering focused on seeking balanced scales among 24 candidate features that compute absolute and relative differences and other necessary factors. In the end, we selected the ten most balanced features that helped us achieve the highest predictive accuracy (see formulas and scaling details in Supplemental Material 1.6). We discuss covariate balance later in the Results section.
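To make the idea of a dyadic covariate difference concrete, the sketch below computes a raw and a relative difference for numeric covariates and a joint category for binary ones. The formulas, the covariate names, and the function `dyadic_features` are illustrative assumptions only; the features actually used are those documented in Supplemental Material 1.6.

```python
def dyadic_features(x_i, x_j, numeric_keys, binary_keys):
    """Build illustrative difference features for main artist i and featured artist j.

    x_i and x_j are dicts of covariates (hypothetical names). Differences are
    signed so that positive values mean the featured artist j scores higher.
    """
    features = {}
    for key in numeric_keys:
        diff = x_j[key] - x_i[key]
        total = x_i[key] + x_j[key]
        features[f"{key}_diff"] = diff
        # Relative difference is bounded in [-1, 1] for non-negative covariates.
        features[f"{key}_rel_diff"] = diff / total if total else 0.0
    for key in binary_keys:
        # Joint category such as "01": i lacks the credential, j holds it.
        features[f"{key}_pair"] = f"{int(x_i[key])}{int(x_j[key])}"
    return features

main_artist = {"quality": 100, "active_years": 2, "award": False}
featured_artist = {"quality": 300, "active_years": 10, "award": True}

c_ij = dyadic_features(
    main_artist, featured_artist,
    numeric_keys=["quality", "active_years"],
    binary_keys=["award"],
)
```

Bounded relative differences are one way to keep features on comparable scales across labeled and unlabeled dyads, which is the balance concern raised above.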
When more than one artist was invited to feature (e.g., Main Artist (i) – Song Title [feat. Artist A, Artist B, Artist C]), we create multiple dyads between i and artists A, B, and C. Then, we label Yi (SMTM collaboration dyads) or predict the degree of preferential attachment,
Outcome Distributions for Supervised Learning.
Abbreviations: PA = preferential attachment; PU = pick-up.
This supervised learning procedure is implemented with a random forest algorithm.
We apply one-hot encoding to represent the dyadic differences in covariates when the focal covariate is discrete. These covariates include award history, recognized crew membership, and recognized label/distributor company membership. To prevent overfitting, we tuned the maximum depth of the random forest classifier using a grid search over depths 5–8. Depth 7 yielded the highest accuracy and F1 score on the test set, suggesting the best generalization among the depths considered. We summarize the parameters used in our random forest algorithm in Table 2.
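A depth search of this kind can be sketched with scikit-learn. The sketch runs on synthetic data and makes several assumptions that are not taken from the study: the number of trees, the use of five-fold cross-validation on the training split to choose the depth (rather than comparing depths on the test set), and all variable names.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for labeled dyads: 10 difference features, binary PA/PU label.
n_labeled = 777
X = rng.normal(size=(n_labeled, 10))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n_labeled) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Grid search over the maximum tree depth (5 to 8), scored by F1.
search = GridSearchCV(
    RandomForestClassifier(n_estimators=50, random_state=0),  # tree count is assumed
    param_grid={"max_depth": [5, 6, 7, 8]},
    scoring="f1",
    cv=5,
)
search.fit(X_train, y_train)

best_depth = search.best_params_["max_depth"]
test_f1 = search.score(X_test, y_test)  # uses the F1 scorer set above

# Predicted probability of upward deference for synthetic "unlabeled" dyads.
X_unlabeled = rng.normal(size=(20, 10))
pa_scores = search.predict_proba(X_unlabeled)[:, 1]
```

The class-1 probability plays the role of the dyad-level preferential attachment score in this sketch.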
Table 2. Parameters Used in a Random Forest Algorithm.
To help understand the resulting outputs, we provide a concrete example of an artist's
An Example of Preferential Attachment (PA) Scores Estimated for an Artist.
Our weekly web-scraping produced time-stamped data, which helped us use the most updated information about artists when estimating
ML Performance
Prediction Performance From Naïve Classification, Supervised, and Unsupervised Machine Learning (ML) Algorithms.
Note: The naïve method refers to a crude classification by quality (past cumulative like counts) between artists i and j on a collaboration song dyad. For the K-means method, we cluster the training dataset into two groups by their standardized features and then assign each test sample to the group whose center is nearer. For the support vector machine (SVM) classifier, we used a radial basis function (RBF) kernel, which is well-suited for non-linear classification tasks and tends to generalize better than polynomial or sigmoid kernels in high-dimensional settings.
Covariate Importance in the Prediction of Deference on Collaboration Song Dyads and Covariate Balance Test Results of Dyadic Difference ci,j Between Labeled and Unlabeled Collaboration Pairs.
t-test for numeric features and a chi-squared independence test for categorical features. Difference = Mean or proportion (labeled) – Mean or proportion (unlabeled). For the sake of simplicity of the proportion test, we use the proportion of concordant pairs (e.g., the main artist and featuring artist being both or neither award winners).
b Beyond p-values indicating whether a difference exists, we evaluate the magnitude of the difference, using Hedges' g (a small-sample bias-corrected version of Cohen's d that pools the two samples’ variances by their sizes) for numeric features and Cramér's V for categorical features. Magnitude is judged qualitatively against commonly used thresholds (Hedges' g: .2 [small], .5 [medium], .8 [large]; Cramér's V: .1 [small], .3 [medium], .5 [large]).
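The two effect-size measures named in this note can be computed as in the following sketch, which uses their standard textbook definitions. The function names and the example inputs are illustrative and do not reproduce the study's data.

```python
import math
from statistics import mean, variance

def hedges_g(a, b):
    """Hedges' g: Cohen's d with a pooled SD and a small-sample correction."""
    n1, n2 = len(a), len(b)
    pooled_sd = math.sqrt(
        ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    )
    d = (mean(a) - mean(b)) / pooled_sd
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * correction

def cramers_v(table):
    """Cramér's V from a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

# Illustrative inputs: a numeric feature in two groups, and a 2x2 table
# of (labeled vs. unlabeled) by (concordant vs. discordant pairs).
g = hedges_g([2, 4, 6], [1, 3, 5])
v_perfect = cramers_v([[10, 0], [0, 10]])
v_none = cramers_v([[5, 5], [5, 5]])
```

Values of `g` and `V` near zero indicate that labeled and unlabeled dyads are balanced on a feature even when a large sample makes the p-value small.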
Results
External Validation of the Preferential Attachment Measurement
Our ML model rates the extent of deference between a pair of artists that produced an unlabeled collaboration song (pi,k for songi,k, of Du), which then leads to our artist-level preferential attachment score, PAi. One should ask: How close is the ML-rated pi,k to the status differences that would have been rated by humans? We validate our key measurement by having field experts rate a sample of collaborator pairs and assessing the similarity between expert responses and pi,k.
We used a stratified random sampling scheme encompassing four levels of quality strata to select 100 balanced and recognizable collaboration dyads in Du (Table S4 and detailed description in Supplemental Material 2.1). The resulting PA scores of the sampled collaboration pairs were evenly distributed, as shown in Figure S4, Supplemental Material, allowing external raters to assess a wide variety of deference cases. Then, we developed a “status position survey” (Figure S5, Supplemental Material). This survey presented the names of two artists (actual collaborators on a song) and posed the question: “Which of the two artists do you think is more popular than the other?” Respondents could choose one of the following answers: (i) Artist A is much more popular (coded as 1), (ii) Artist A is a bit more popular than Artist B (coded as 0.5), (iii) they are similar in popularity (coded as 0), (iv) Artist B is a bit more popular than Artist A (coded as −0.5), (v) Artist B is much more popular (coded as −1), or (vi) I don’t know either rapper (coded as missing). This external validation survey was completed by experts, including two established rappers (Deepflow and Hanhae) and a music columnist (Bong Hyun Kim), as well as three avid fans representing different generations and a large language model (GPT-4o). On one occasion, we restricted GPT-4o's ability to search for information online by using an AI tool called Monica. We offer rater information in Supplemental Material 2.1.
We observe a high level of reliability in field expert ratings, alongside a strong correlation between the outputs of our ML model and expert responses (see Table 6). As the response scale is numeric and ordinal in nature (but with meaningful intervals), we employed the intraclass correlation coefficient (ICC) to assess inter-rater reliability. The overall ICC for “human” ratings was .798 [.723, .864], while ICC values among field professionals and fans were .820 [.748, .877] and .754 [.646, .839], respectively, all exceeding the conventional threshold of .750 (95% confidence intervals in brackets). This suggests that there exists a status hierarchy among the sampled artists that expert raters agree upon. Turning to the correlation between our ML model output and expert responses, we identified a striking level of agreement in the evaluation of artist status comparisons. With the exception of the ratings generated by network-disabled GPT-4o, all correlation values exceeded .600. These findings demonstrate that our ML measure of preferential attachment between collaborators closely mirrors the perceptions of rappers, critics, and fans, thereby providing robust evidence for its external validity.
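The comparison between survey responses and model output can be sketched as follows. The response keys, the helper names, and the example numbers are hypothetical; the sketch covers only the numeric coding of answers and a Pearson correlation, and it does not reproduce the ICC computation.

```python
import math

# Numeric coding of the six survey answers; None marks "I don't know either rapper".
CODES = {
    "A_much": 1.0, "A_bit": 0.5, "similar": 0.0,
    "B_bit": -0.5, "B_much": -1.0, "dont_know": None,
}

def mean_rating(responses):
    """Average the coded answers of several raters for one dyad, skipping missing."""
    values = [CODES[r] for r in responses if CODES[r] is not None]
    return sum(values) / len(values) if values else None

def pearson_r(x, y):
    """Pearson correlation over pairs where both values are observed."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy)

# Illustrative data: three raters per dyad and a model score per dyad.
survey = [
    ["A_much", "A_bit", "dont_know"],
    ["similar", "A_bit", "similar"],
    ["B_much", "B_bit", "B_much"],
    ["dont_know", "dont_know", "dont_know"],
]
model_scores = [0.9, 0.6, 0.1, 0.5]

expert_means = [mean_rating(r) for r in survey]
agreement = pearson_r(expert_means, model_scores)
```

Dyads that no rater recognizes drop out of the correlation instead of being treated as ties.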
Table 6. External Validation Results: Inter-Rater Reliability (ICC) and Correlation Between Expert Ratings on Dyadic Deference and ML-Rated Preferential Attachment.
Featuring Behavior and Status Advantage in Hip-Hop Music Collaboration
Descriptive Statistics for Focal Artists.
Abbreviations: SMTM = Show Me The Money; PA = preferential attachment; PU = pick-up.
Who features whom on a hip-hop collaboration song? Figure 4 provides a descriptive comparison of characteristics between the main artist (under whose name a song was released) and a collaborating artist (“featured” in the title of a song) on a song dyad. For active years (the first panel), we find that collaboration is more common among younger artists: a young artist featuring another young artist (1–3 years of career) seems to be the most prevalent dyad type. We found slightly heterogeneous patterns for quality and productivity. Unlike the plot of active years, we observed roughly four distinct “peaks” in the contour plots. This means that assortative mixing is not always the case for the featuring dyads: a hip-hop artist also invites another artist whose quality and productivity differ from their own (off-diagonal peaks).

Figure 4. The distribution of the characteristics of main artists and their featured artists on songs.
Our goal here is to examine whether collaboration helps increase artists’ performance in receiving users’ likes on new songs and, if so, how this collaboration effect on performance varies by the extent of preferential attachment in collaboration. We begin by modeling the descriptive quantity, Pi, using conventional ordinary least squares (OLS) regression in a blockwise manner. This approach helps understand the functioning of control variables and establishes a baseline for how collaborative contexts explain additional variation in artists’ performance.
There are five models (Table 8). Model 1 includes control variables, including artists’ individual quality (past cumulative recognition: logged sum of like counts received from users on their old songs before the study period), affiliation, and productivity patterns. As expected, artists’ acceptance expressed by listeners’ likes exhibits a strong consistency with listeners’ prior ratings. In line with extant research in other contexts, hip-hop artists with previous awards or membership in crews and labels that are known enough to be indexed at K Hip Hop Wiki secured more likes than those without such affiliations or third-party evaluations. One slightly counterintuitive finding is that artists’ tenure (active years) is negatively associated with performance. This may reflect the fact that the music-streaming market is largely driven by young consumers who are generally in favor of young musicians. These baseline findings stand while we account for the individual quantity of products (the number of released songs and albums)—a sort of “denominator” of our dependent variable.
Table 8. OLS Regression Models Predicting Artists’ Performance by Collaboration and Preferential Attachment in Featuring Behavior.
Note: The dependent variable is the logged 8-week user engagement (Melon likes) for newly released songs in the study period. For observations with valid PA values (i.e., those who had a collaborative song), the variance inflation factor values for all independent variables, when mean-centered, are all under 10. The condition number, when scaled, is 21.36, which is below the rule-of-thumb threshold of 30.
Abbreviations: SMTM = Show Me The Money; PA = preferential attachment; PU = pick-up.
*** p < .001; ** p < .01; * p < .05.
In Model 2, we investigate the association between collaboration and artists’ performance. Releasing songs that feature one or more other artists (the coefficient for the dichotomous variable indicating the release of collaborative songs) is associated with significantly higher performance in receiving users’ likes on new songs. Yet the factual outcome Pi does not provide information about what an artist's performance would have been in the absence of collaboration. To ask the causal question of whether collaboration helps increase artists’ performance, we define a unit-specific causal quantity (Lundberg, Johnson, and Stewart 2021) in terms of the potential outcome Pi(t): the difference between the like counts each artist would realize if they released a collaborative new song in the study period (treatment value t = 1, denoted Pi(1)) and if they did not (t = 0, denoted Pi(0)), averaged over the n Korean hip-hop artists:
\[
\frac{1}{n}\sum_{i=1}^{n}\left[P_i(1) - P_i(0)\right]
\]
In this counterfactual framework (Imbens and Rubin 2015), we define two potential outcomes for each artist: one if they have a collaboration song (treatment) and one if they don't (control) in the study period. These outcomes express the average treatment effect (ATE) of collaboration on performance (like counts), which is the difference in expected outcomes between the treatment and control groups. This theoretical estimand indicates the average difference in the potential outcome each artist i would realize if i had collaborated versus not. However, since we only observe one of these outcomes for each musician, the unobserved outcome needs to be estimated.
Our empirical estimand, Pi(t), is derived by an imputation estimator (Rubin 1974) that calculates the average difference in the OLS regression prediction if we recode i as a collaborator versus as a non-collaborator in the study period, while holding other covariates constant. This strategy helps estimate the causal effect of collaboration on performance while controlling for observed confounders. The results suggest that the ATE of collaboration remains significantly greater than zero, as displayed in Figure 5 (left panel). More substantively, the magnitude of the ATE estimate (.754) indicates that, holding all else equal, artists who released a collaboration song received approximately exp(.754) ≈ 2.13 times as many likes on their songs as those who did not collaborate during the study window. This reflects a 113 percent increase in positive audience engagement, a substantial uplift in the context of streaming platforms.
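The imputation logic can be illustrated on synthetic data. In this sketch the single control (`quality`), the sample size, and the simulated effect of .754 are assumptions chosen only to mirror the magnitude reported above; the study's regression includes many more covariates.

```python
import math
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Synthetic artists: one confounder (quality) and a binary collaboration indicator.
quality = rng.normal(size=n)
collab = (rng.random(n) < 0.5).astype(float)
log_likes = 1.0 + 0.754 * collab + 0.8 * quality + rng.normal(scale=0.3, size=n)

def design(collab_values, quality_values):
    """Design matrix with intercept, treatment indicator, and control."""
    return np.column_stack([np.ones_like(quality_values), collab_values, quality_values])

# Fit OLS on the observed data.
beta, *_ = np.linalg.lstsq(design(collab, quality), log_likes, rcond=None)

# Impute both potential outcomes for every artist by recoding the treatment.
pred_treated = design(np.ones(n), quality) @ beta
pred_control = design(np.zeros(n), quality) @ beta
ate = float(np.mean(pred_treated - pred_control))

# On the log scale, an effect of .754 corresponds to a multiplicative ratio.
reported_ratio = math.exp(0.754)
percent_increase = (reported_ratio - 1) * 100
```

With a linear model and no interaction terms, the imputed ATE equals the treatment coefficient; the imputation view becomes informative once the effect is allowed to vary with covariates.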

Figure 5. Average treatment effects of releasing collaborated song(s) on performance.
Does the treatment effect of releasing a collaborative song vary across artists of different levels of quality (proxied by past recognition stock)? Within the potential outcomes framework, we employ a non-parametric stratification approach (i.e., without any assumptions about the functional form) to obtain Pi(t). This approach involves dividing the data into strata based on quality levels, estimating the treatment effect separately within each stratum, and aggregating these estimates weighted by stratum sizes to derive the overall effect. By eschewing parametric assumptions, the stratification estimator provides a more robust estimate of the causal effect.
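A stratification estimator of this kind can be sketched in plain Python. The equal-sized strata, the choice of four strata, and the record layout `(quality, treated, outcome)` are assumptions made for illustration.

```python
from statistics import mean

def stratified_ate(records, n_strata=4):
    """Stratify by quality, difference the group means, and pool by stratum size.

    `records` is a list of (quality, treated, outcome) tuples (hypothetical layout).
    Returns the size-weighted overall effect and the per-stratum effects.
    """
    ordered = sorted(records, key=lambda r: r[0])
    size = len(ordered)
    weighted = []
    for s in range(n_strata):
        stratum = ordered[s * size // n_strata:(s + 1) * size // n_strata]
        treated = [y for _, t, y in stratum if t]
        control = [y for _, t, y in stratum if not t]
        if treated and control:  # skip strata without both groups
            weighted.append((len(stratum), mean(treated) - mean(control)))
    total = sum(w for w, _ in weighted)
    overall = sum(w * e for w, e in weighted) / total
    return overall, [e for _, e in weighted]

# Deterministic toy data: the outcome rises with quality, plus 2.0 if treated.
records = [(q, q % 2 == 1, 0.1 * q + 2.0 * (q % 2)) for q in range(40)]
overall_effect, stratum_effects = stratified_ate(records, n_strata=4)
```

Comparing the entries of `stratum_effects` is what reveals whether the return to collaboration differs across quality levels.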
The stratification estimates reveal that the performance return from releasing collaborative songs does not substantially differ across artists with varying levels of past quality stock (Figure 5, right panel). The conditional means of the collaboration effect are positive and consistent across quality strata. Interestingly, the variability of the treatment effect tends to be greater among high-quality artists: they exhibit more extreme deviations, achieving either substantially higher or lower returns from collaboration relative to other strata. We speculate that this variability may reflect the influence of a stronger individual reputation.
Having descriptively and causally established that featuring behavior yields positive performance returns for Korean hip-hop artists, we turn to the question: Does the status difference between collaborators matter? In Models 3 to 5 of the main regression (Table 8), we explore our preferential attachment hypothesis by adding variables capturing artists’ average tendency (average PAi) and its squared term, as well as the number of songs classified as either PAi (PA ≥ .75) or PUi (PA ≤ .25) by our random forest ML algorithm; these are included separately (Models 3 and 4) and jointly (Model 5). Both the linear and squared terms of PAi are significantly different from zero, indicating a non-linear effect of the average tendency of preferential attachment on performance. 11 To capture this non-linear pattern more accurately, we also employ a more flexible machine-learning-based generalized additive model (GAM), which fits a smooth term for preferential attachment using cubic splines (edf = 7.63; F-statistic = 16.45; p < 2 × 10^−16).
We visualize the effect of preferential attachment through predicted curves from the OLS (red) and GAM (blue) estimates (see Figure 6). Both models suggest that higher like counts occur at either end of the PAi scale, indicating a U-shaped relationship between preferential attachment and an artist's performance. This finding supports our H2: status differences are helpful even when one co-produces a song with a relatively lower-status artist, whereas collaboration may yield the least benefit when the status difference between collaborators is ambiguous. This result is robust to the removal of a subset of high-quality artists who might have had higher baseline chances of working with a lower-status artist and happened to contribute higher returns on performance (see Figure S7 and a discussion in Supplemental Material 2.2). The GAM adds further nuance by revealing an asymmetric effect: a steeper uptick in performance for downward ties (the low average PA observations) relative to upward ties. This suggests that audience curiosity sparked by downward deference may generate stronger engagement than status signaling from upward deference.
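A quadratic specification implies a U shape when the squared term is positive and the turning point lies inside the observed range of the predictor. The sketch below checks both conditions on synthetic data; the simulated coefficients and variable names are assumptions, and the GAM fit is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic average PA scores in [0, 1] with a U-shaped outcome centered at 0.5.
pa = rng.random(400)
log_likes = 5.0 + 3.0 * (pa - 0.5) ** 2 + rng.normal(scale=0.1, size=400)

# np.polyfit returns coefficients from the highest degree down.
b2, b1, b0 = np.polyfit(pa, log_likes, deg=2)

# The fitted parabola reaches its minimum where the derivative b1 + 2*b2*x is zero.
turning_point = -b1 / (2 * b2)
is_u_shaped = bool(b2 > 0 and 0 < turning_point < 1)
```

A turning point near the middle of the scale corresponds to the weakest returns falling where status differences are ambiguous; an asymmetric curve, as the flexible fit suggests, would shift it away from the midpoint.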

Figure 6. U-shaped effect of preferential attachment on performance.
Population-Level Preferential Attachment in the Featuring Behavior
Our descriptive and explanatory model findings thus far suggest that (i) co-releases of songs among hip-hop artists are common; (ii) preferential attachment is prevalent in collaboration; and (iii) featuring behavior and status differences in collaboration create advantages in receiving more likes from listeners on a streaming platform. If so, do artists increasingly engage higher-status artists on their songs over time, leading to a stratification of featuring invites?
From a different angle, preferential attachment in operationalization denotes that the more connected a node is, the more likely it is to receive new ties (Barabási and Albert 1999). Receiving ties in a music collaboration network would mean an artist being invited to feature in a song. We therefore count the artist's featuring frequency to measure “featuring in-degrees” (Lee and Li 2025).
We first examine the total featuring indegree of our focal 3,694 artists. Being featured in another artist's song seems to be highly selective. More than half of these hip-hop artists (2,040 [55.2%]) have never been featured in another artist's song. Further, even among the rest (1,654 artists), who were featured at least once on a fellow artist's tune, the distribution of featuring indegrees is highly skewed. Figure 7A presents the cumulative distribution function of artists’ featuring indegree. It shows that featuring behavior exhibits a scaling pattern typical of complex networks. We fit log-normal (blue) and power-law (red) curves and investigated the goodness of fit of each. The log-normal fits the featuring indegree data better, as the bootstrap hypothesis test (Clauset, Shalizi, and Newman 2009) cannot reject the hypothesis that the data follow this distribution (p-value: .361). The total featuring indegree thus appears to be distributed similarly to other quantities that research has found to follow a log-normal distribution, such as the income of 97 percent–99 percent of the population (Clementi and Gallegati 2005) and the citation counts of journal articles and patents (Sheridan and Onodera 2018). Therefore, who gets invited to co-produce songs is neither random nor uniform; instead, artists seem to associate preferentially with a few “star” artists whose quality, fame, or expertise may help promote their own songs.
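Counting featuring indegree from a list of collaboration dyads can be sketched as follows. The edge-list layout `(main_artist, featured_artist)` and the function names are assumptions, and the sketch reports the complementary distribution P(X ≥ x), the form usually drawn on log-log axes, without reproducing the distribution fitting.

```python
from collections import Counter

def featuring_indegree(artists, features):
    """Number of times each artist is featured on another artist's song.

    `features` is a list of (main_artist, featured_artist) dyads (hypothetical layout).
    Artists who were never featured receive an indegree of zero.
    """
    counts = Counter(featured for _, featured in features)
    return {artist: counts.get(artist, 0) for artist in artists}

def ccdf(values):
    """Empirical P(X >= x) over the positive values, keyed by distinct value."""
    positive = sorted(v for v in values if v > 0)
    n = len(positive)
    return {x: sum(1 for v in positive if v >= x) / n for x in sorted(set(positive))}

# Toy network: artist "b" is featured twice, "c" once, "a" and "d" never.
artists = ["a", "b", "c", "d"]
features = [("a", "b"), ("c", "b"), ("a", "c")]

indegree = featuring_indegree(artists, features)
share_never_featured = sum(1 for v in indegree.values() if v == 0) / len(artists)
tail = ccdf(indegree.values())
```

The share of zero-indegree artists and the shape of the tail correspond to the two descriptive quantities discussed above.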

Figure 7. The distribution and temporal pattern of featuring indegree (the number of times an artist features in a song released by a different artist).
Having examined the static picture of preferential attachment, we further investigate whether the inequality of featuring invites between individuals grows over time. We focus on subsets of artists who debuted at similar time points and count the featuring indegree in one-year intervals. This allows us to zoom in on the degree to which the popular get more popular and the unpopular get more unpopular over the years. Figure 7B illustrates the dynamic pattern of the release of featured songs by artists who debuted 3–4 years ago (left column) and 5–8 years ago (right column).
The upper panel in Figure 7B shows the propensity to have ever been featured (i.e., a discrete indicator of whether an artist had at least one featured song over a one-year period) in a focal year T, and we find that it depends heavily on the previous year's featuring frequency. Early-career artists active for 3–4 years who had a featuring indegree of 5 or more two years ago have an almost 100 percent chance of being featured in one or more songs in the most recent year. For those with a featuring indegree of 1, the propensity decreases to below 50 percent in the next year. The same tendency is observed for the other set of artists, who have survived longer in the field and have thus been relatively more active collaborators (5–8 years since debut): the stratification of the featuring indegree at t–4 materializes through the differential likelihood of getting featured again at t–3, t–2, and t–1.
The lower panel in Figure 7B shows the same year-by-year analysis for the count of featuring indegree, focusing on the degree to which non-isolates 12 with varying featuring indegree in the previous year become more or less popular in getting invited the next year. As expected, the Gini coefficients of featuring indegree for both subsets of artists increase over time (see Table S7, Supplemental Material 2.5). We can infer from the line charts that a two-sided process drives the increased inequality: (i) those who were relatively unpopular (i.e., low featuring indegree) “fade out” the next year; and (ii) those who were highly popular tend not to increase the frequency of being featured but maintain a certain level of activity.
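The Gini coefficient used to summarize inequality in featuring indegree can be computed as in this sketch, which applies the standard rank-based formula. The yearly counts are invented for illustration and are not taken from Table S7.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = equal, near 1 = concentrated)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum with ranks starting at 1 on the ascending values.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

# Illustrative cohort: featuring indegree per artist in two consecutive years.
indegree_by_year = {
    "t-2": [1, 1, 1, 1],   # invitations spread evenly
    "t-1": [0, 0, 0, 4],   # invitations concentrated on one artist
}
gini_by_year = {year: gini(counts) for year, counts in indegree_by_year.items()}
```

A rising value across years indicates that invitations are concentrating on fewer artists, which can happen either because low-indegree artists drop to zero or because a few artists pull ahead.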
Qualitative Evidence: How Rappers Make Collaboration Decisions in Practice
Our empirical focus has been on status distinctions between hip-hop artists manifested in their ties connected by the featured songs. To validate our findings, we should ask about the data-generating process: Did hip-hop artists indeed coordinate a song by themselves, caring about status positions? With this question in mind, we conducted in-depth expert interviews with several “veteran” hip-hop artists. Deepflow, a 41-year-old rapper who served as an SMTM judge and ranks among the highest in total featuring indegree in our data, described the communication channel through which artists reach out for featuring requests as follows:
Usually, it's social media and personal contact. There are sometimes contacts through the label, but I don't think it's common among us. The reason is … at the end of the day, it is the rappers cookin’ it. But if companies’ talks took place before and we meet thereafter in the studio, which feels pretty unnatural and awkward, we can't advance to the next steps. Plus, it's easier to say no to the requests received by the label as that doesn’t hurt it personally.
We also asked him how he makes decisions about the featuring invitations he personally receives:
Q: We assume you receive lots of invites, particularly from junior rappers. How do you handle them, and by what criteria do you accept the invites?
A: It's quite common to receive DMs from non-acquaintances. It's like a mailbox full of unread messages. But I can’t do every song. I use the “preview” function for screening. If that's too amateur level, I just pass by them. But I open and try listening to the tune if I remember names on Soundcloud or something. It's like “hmm, I know this guy.” Then I reply and consider doing a feature.
Yet it is worth noting that, while status is crucial for initial screening, once shortlisted, a variety of factors appear to affect the decision to accept the featuring invitation. For example, the interviewee tended to view featuring as an opportunity to expand his style of rapping:
I prefer to do styles that I haven't done before. That is, I usually have a certain role that people want me to rap in a song, just like film actors get cast in certain roles … like, I usually get cast in a lot of tracks where they want me to spit some serious hip-hop stuff over a really hard boom-bap beat. At a certain point, I kind of say no to that, like, I’ve done this too many times and I’m done talking about this. So, I tend to be more excited when it feels like something I haven't done before. When there's a friend that I’m so close to and I want to support … and they're planning an album anyway and they need my spot; that's their grand plan. I don't want to ruin that. You know, it's not my discography; it's their music anyway. Then it goes like I’ll be used as a session; so even if the beat isn't my favorite, I'll just give it a shot.
Discussion
Status remains difficult to measure because it depends on perceived relative standings among actors. This article developed a measurement strategy for capturing deference in collaborative settings where cultural producers operate within a status hierarchy. The main methodological challenge we identified in the literature was: how can researchers quantify status distances when no complete sampling of perceived standings exists? Our approach shows that supervised machine learning can help learn and recover actors’ differences in quality and credential characteristics—strong correlates of status positions—across partially perceived (labelled) status levels.
We applied this strategy to South Korean hip-hop collaborations by using a decade-long archive of an influential annual TV competition series as a partially observed status arbiter. This approach draws on qualitative insight into the ritualized displays of deference embedded in the show's competition-based hierarchy and translates that understanding into a scalable analytic framework for status inference. For collaboration pairs whose deference direction (PA or PU) across the series’ pronounced status levels was known, we trained a model to learn how differences in quality and credential characteristics map onto upward or downward deference. This model was then used to predict PA or PU for collaboration pairs whose characteristics were observed but whose status levels were not. The resulting probability measure aligned closely with external expert ratings collected through a status position survey.
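The train-then-predict workflow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's pipeline: logistic regression stands in for whatever classifier was actually used, the two difference features (`d_quality`, `d_credential`) and the synthetic labeling rule are hypothetical, and coding PA as 1 and PU as 0 is an arbitrary choice made for the example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.5, epochs=300):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, k = len(X), len(X[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(k):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Predicted probability that a dyad belongs to the class coded 1."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Synthetic labeled dyads (assumption): each row holds between-artist
# differences in two hypothetical characteristics; the label is 1 ("PA")
# when the noisy sum of differences is positive, else 0 ("PU").
rng = random.Random(0)
labeled_X, labeled_y = [], []
for _ in range(200):
    d_quality, d_credential = rng.gauss(0, 1), rng.gauss(0, 1)
    labeled_X.append([d_quality, d_credential])
    labeled_y.append(1 if d_quality + d_credential + rng.gauss(0, 0.3) > 0 else 0)

# Step 1: learn the mapping from characteristic differences to labels.
w, b = train_logistic(labeled_X, labeled_y)

# Step 2: score dyads whose characteristics are observed but whose
# status levels are not.
unlabeled_X = [[2.0, 2.0], [-2.0, -2.0], [0.1, -0.1]]
scores = [predict_proba(w, b, x) for x in unlabeled_X]
```

The resulting `scores` play the role of the probability measure: values near 1 or 0 indicate a clear direction, while values near 0.5 correspond to dyads with ambiguous differences.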
We used this ML-based preferential attachment measure to test how artists’ collaboration and preferential attachment shape user engagement on South Korea's largest streaming platform. Using imputation and non-parametric stratification estimators in a potential outcomes framework, we estimated the average treatment effect of collaboration and found clear causal evidence that featuring behavior boosts performance. Preferential attachment produced a U-shaped effect: artists benefited both from featuring higher-status partners and from engaging emerging talents, while collaborations with ambiguous status differences yielded the weakest returns. Finally, our population-level analysis showed increasing stratification of featuring invitations over time.
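To make the logic of a stratification estimator concrete, the sketch below computes an average treatment effect by comparing treated and control outcomes within strata and weighting each stratum by its share of units. It is a generic textbook version under assumed toy data, not the estimator specification used in the analysis; the function name `stratified_ate`, the stratum labels, and the outcome values are all hypothetical.

```python
from collections import defaultdict


def stratified_ate(records):
    """Stratification estimate of the ATE.

    records: iterable of (stratum, treated, outcome) tuples, where treated
    is 0 or 1. Strata lacking either treated or control units are dropped.
    """
    by_stratum = defaultdict(lambda: {0: [], 1: []})
    for stratum, treated, outcome in records:
        by_stratum[stratum][int(treated)].append(outcome)

    usable = [g for g in by_stratum.values() if g[0] and g[1]]
    n_total = sum(len(g[0]) + len(g[1]) for g in usable)
    if n_total == 0:
        raise ValueError("no stratum contains both treated and control units")

    ate = 0.0
    for g in usable:
        # Within-stratum difference in mean outcomes.
        diff = sum(g[1]) / len(g[1]) - sum(g[0]) / len(g[0])
        # Weight by the stratum's share of all usable units.
        ate += diff * (len(g[0]) + len(g[1])) / n_total
    return ate


# Toy data (assumption): treated = 1 for tracks with a featured artist,
# strata defined by a hypothetical prior-popularity band.
records = [
    ("low", 1, 12.0), ("low", 1, 14.0), ("low", 0, 10.0), ("low", 0, 10.0),
    ("high", 1, 25.0), ("high", 0, 20.0), ("high", 0, 22.0), ("high", 0, 21.0),
]

ate = stratified_ate(records)

# Naive comparison that ignores strata, for contrast.
treated = [o for _, t, o in records if t == 1]
control = [o for _, t, o in records if t == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)
```

In this toy data the naive difference in means is much smaller than the stratified estimate because treated units are concentrated in the low-outcome stratum, which is the kind of compositional imbalance that stratification is meant to correct.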
These findings shed light on a new dimension of status processes in cultural fields: preferential attachment in collaboration. This contrasts with prior research that mostly focused on actors' own status positions (Baum and Oliver 1992; Ertug and Castellucci 2013; Podolny 1993; Podolny et al. 1996; Rossman et al. 2010; Stuart 1998; Stuart and Ding 2006). The measurement strategies in that research typically targeted how actors affiliate with status credentials and/or rely on standings from external arbiters. Our approach instead describes the processes in which actors directly climb up the status ladder by inviting other actors and producing a collaborative performance. Rossman and colleagues, among very few exceptions, examined status-based peer effects in a collaborative context, linking team spillover (Hollywood actors/actresses having teamed up with a prior Oscar winner) to the likelihood of winning an Oscar award (Rossman et al. 2010). While the concept of team spillover captures only “accidental” upward ties with a high-status actor, our work illuminates actors’ directed deference, a more “purposive” social engagement that can go in both upward and downward directions along the continuum of preferential attachment. Interestingly, our results add that downward ties can indeed be beneficial for musicians’ performance. This contrasts with prior firm-level studies reporting disadvantages associated with downward between-firm associations (Podolny 2008; Washington and Zajac 2005). We speculate that the U-shaped effect reflects the impact that an unusual combination of status positions in a featuring makes on audiences.
We contribute to status research by conceptualizing ritualized competitions as culturally embedded sources of status signals. We demonstrate that symbolic acts of deference—performed, witnessed, and socially recognized within a competitive media format—can serve as field-native indicators of hierarchical positioning. Our case of SMTM illustrates how structured interactions among producers, qualifiers, and unsuccessful contestants enact a publicly legible status order grounded in the cultural logic of Korean hip-hop. Integrating this qualitative insight with a scalable analytic strategy, we infer status hierarchies from ritualized performances collected in digital trace data. Strikingly, the resultant measures of status and preferential attachment closely align with external expert surveys, validating the interpretive power of our approach. This conceptualization expands the scope of status research by showing how ritual operates not merely as a symbolic expression, but as a patterned social process through which status is actively produced, negotiated, and recognized in collaborative cultural production (Collins 2004; Goffman 1967).
While our study focuses on the South Korean hip-hop scene, our ML-based status measurement strategy could be broadly extensible to other national contexts. In settings where full-field sampling is not feasible, researchers often focus on relatively elite or publicly visible producers, and status hierarchies within these samples remain analytically meaningful. This strategy can be adapted to such cases by leveraging domain-specific supervision signals—such as competition outcomes, streaming metrics, award recognitions, or performance billing orders—to infer status and model deference. For example, in the domain of hip-hop in the United States, televised competitions like The Rap Game and Rhythm + Flow offer structured rankings that could be used to train models of status ordering. In the Puerto Rican reggaeton scene, status may be inferred from collaborative music videos, venue lineups, and institutional recognition through award shows. More broadly, this approach offers a scalable alternative to studies that rely on sparse or binary status proxies, such as Grammy wins (McMillan 2022) or historical album sales (Halgin et al. 2020), by enabling richer modeling of status hierarchies through partial but interpretable signals embedded in digital trace data.
Beyond national contexts, our strategy can also be applied to other domains where multiple actors purposively co-produce a product labeled in a meaningful order of contributions. Academic co-authorship networks offer a natural parallel: first authors, secondary authors, and corresponding authors with different quality characteristics send different status signals (Jalali, Introne, and Soundarajan 2023; Katz and Martin 1997; Lee and Bozeman 2005; Li, Liao, and Yen 2013; Moody 2004). The sociology of science has long examined how collaboration with high-status scholars shapes visibility, productivity, and career trajectories (Azoulay et al. 2010; Li and Agha 2015; Maliniak et al. 2013; McDowell and Smith 1992; Merton 1968; Zuckerman 1977). Our research can complement this tradition by offering a scalable strategy for inferring status hierarchies in settings where formal markers are incomplete or perceived status data are only partially available.
Likewise, our workflow—which combines digital trace data (digitally recorded performance), supervised learning through media-based status identifiers (publicly mediated status indicators), and a collaboration network (collaborative products)—can be used to revisit the problems of organizational and cultural strategic collaboration. These questions have often been investigated with a limited data scope inherent in traditional research designs. In this respect, our study serves as yet another example of social science benefiting from machine learning that helps researchers amplify coding and inference (Lundberg et al. 2022; Molina and Garip 2019). That said, successful adaptation of supervised learning approaches depends on the availability and quality of labeled data, as well as the representativeness of the unlabeled population. Researchers should carefully assess whether supervision signals reflect meaningful status distinctions in the target context and whether collaboration data are sufficiently structured to support inference. In cases where labeled data are sparse or drawn from a different cultural domain, techniques such as domain adaptation may be necessary to mitigate distributional shifts and ensure model validity. This generalizability, when approached with such considerations in mind, opens new avenues for studying strategic collaboration and cultural stratification across diverse fields.
This study is not without limitations. First, the observational nature of our study limits the ability to establish a fully causal relationship between collaboration, preferential attachment, and performance advantages. Although we estimated a counterfactual intervention (collaboration) in addition to the repeated measurement design that controls for the artists’ characteristics before and after the study period, our findings remain constrained by observable factors. Unobserved potential confounders are not fully captured in our model; these include marketing efforts, non-quality-oriented audience engagement (i.e., diss and controversy-driven attention), fan base overlap among collaborators, and external trends in the music industry. Therefore, the estimated ATE of collaboration and its variation according to preferential attachment should be interpreted within the boundaries of our data collection and methodological approach. Second, future research can benefit from incorporating aesthetic measurements, for example, sound and lyrics characteristics (Berg 2022; Kim and Askin 2024; Negro et al. 2022; Nie 2021). Why does hip-hop music collaboration create advantages? We provided a status-based explanation, but it would also be fruitful to investigate whether collaboration leads to qualitative innovations in sound and narratives. Third, while our study examined hierarchical status signals manifested in a televised competition program, future work could further explore how different forms of visibility (positive, negative, or absent) contribute to status in mediated performance settings. In our field setting, failure to advance to further stages still comes as an advantage in terms of visibility (Andrejevic 2020; Gamson 1994; Marwick and boyd 2011). It would be interesting to examine a setting where status merit from visibility is highly dependent upon its valence (i.e., negative portrayal as status demerit).
Collaboration has long been a general form of cultural production and entrepreneurial innovation. Actors are often incentivized to showcase the engaged parties explicitly to audiences; the featuring behavior of hip-hop entrepreneurs is a case in point. Our case study suggests one empirical research design for investigating status processes embedded in collaborative outcomes. We hope our work paves the way for further research that enhances the understanding of acts of deference manifested in collaboration.
Supplemental Material
Supplemental material, sj-pdf-1-smr-10.1177_00491241261420812 for A Machine Learning Approach to Preferential Attachment and Status Advantage in a Hip-Hop Collaboration Network by Jaemin Lee and Yujie Li in Sociological Methods & Research
Footnotes
Acknowledgments
Authors contributed equally. We thank the SMR Associate Editor, Brandon Stewart, and the three anonymous reviewers for their constructive critique and helpful suggestions. Jeong Woo Ham and Eddy Park provided excellent research assistance. We are also indebted to field professionals—rappers Deepflow and Hanhae, cultural critic Bong-hyun Kim, and television producer Hak-min Kim—whose in-depth interviews, as well as their longstanding commitment to the culture of Korean hip-hop, shaped our understanding of the field and inspired the initial motivation for this study.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
This work was supported by the Chinese University of Hong Kong.
Preregistration Statement
This study was not preregistered. The analyses, models, and validation procedures were developed iteratively during the research process, and no preregistration protocol exists for this project.
