Abstract
Merging 2005 to 2015 Internal Revenue Service, Social Security, and Census records, the authors calculate national average gender pay gaps for various population definitions and then decompose trends in the contribution of firm, occupation, and job segregation to these pay gaps, as well as the size of the average residual “within-job” pay gap. In general, observed segregation tends to explain about half of age, education, and hours of work adjusted gender pay gaps, but the other half remains within occupations in the same firm. Although between-firm pay gaps rose and within-job pay gaps declined through 2009, the authors find little decline in firm- or job-level gender pay gaps after 2009. The results indicate that to reduce gender pay gaps, public policy and employers should target gender disparities in hiring and job assignment as well as potential disparities in pay setting.
Although the U.S. gender pay gap declined rapidly from 1970 to the early 1990s, movement toward pay equality between men and women has since largely stalled (England, Levine, and Mishel 2020). At the current slow rate of pay convergence, policy advocates have estimated that it will take somewhere between 60 and more than 100 years for the United States to reach equal pay for men and women otherwise equivalent in terms of human capital and hours of work (Leisenring 2020; World Economic Forum 2019). Our estimates suggest that if the rate of change remains at the 2009 to 2015 trend level, convergence will never occur.
Current research using individual survey data estimates gender pay gaps net of hours worked, education and experience, occupation, and sometimes industry, treating the gender residual as an indicator of unobserved differences in employer and employee behavior, including employer bias (e.g., Blau and Kahn 2017; Foster et al. 2020). This research identifies gender differences in lifetime labor force participation, current hours worked, and occupational segregation as the most important observed sources of earnings disparities between men and women but also reveals large residual disparities.
The importance of workplace segregation, as well as of within-workplace and within-job pay disparities, is not observable with conventional survey data. Increasingly, social scientists have been able to access government-generated administrative data from tax or social insurance programs and observe firm- and workplace-level earnings dynamics. For example, recent research using Internal Revenue Service (IRS) linked employee-employer data suggests that most of the rapid rise in U.S. earnings inequalities is a between-firm phenomenon (Song et al. 2019). Although there is recent research exploring gender pay gaps within and between workplaces in other countries (Barth and Dale-Olsen 2009; Bassier 2019; Card, Cardoso, and Kline 2016), we lack recent estimates for the United States. There are 1980-era estimates using linked employer-employee data for particular subpopulations (Avent-Holt and Tomaskovic-Devey 2012; Groshen 1991; Petersen and Morgan 1995) and one for 1990 for which shared workplace is imputed rather than observed (Bayard et al. 2003). More recent analyses have focused on career mobility within and between workplaces but have not produced workplace segregation or within-job pay gap estimates (e.g., Barth, Kerr, and Olivetti 2021).
Scholars and regulators have been particularly interested in identifying gender disparities within jobs in the same firm, as this closely matches the definition of pay discrimination in the 1963 U.S. Equal Pay Act, which prohibited gender pay disparities within jobs in the same workplace not associated with seniority, merit, or other reasonable productivity distinctions. Title VII of the 1964 U.S. Civil Rights Act identified segregation as equally prohibited, but the role of segregation produced by employer discrimination has been difficult to observe in the absence of firm- and job-level data, and it remains difficult to isolate even with such data because segregation is often jointly produced by the labor market decisions of both workers and employers.
During the Obama administration, there was a proposal by the U.S. Equal Employment Opportunity Commission to collect firm-level pay data. This initiative was opposed by segments of the business and legal communities, which argued, in part, that the data collection was not necessary because firms were already monitoring and addressing within-job gender pay disparities. This may not be an unreasonable claim, as the available, if quite old (circa 1980), estimates of within-job pay disparities from high-quality linked employer-employee data suggested an average within-job gender pay gap of only 3 percent or less (Groshen 1991; Petersen and Morgan 1995). If that pay gap has in the meantime narrowed further, then within-job gender pay gaps may be trivially small and increased federal data collection and regulatory efforts misplaced.
In this article we provide estimates of the average within-job gender earnings and hourly wage gaps for various definitions of the U.S. employed population and calculate the relative impact of segregation across firms, occupations, and detailed occupations within firms (our proxy for jobs) on these gaps. This exercise also produces estimates of the average pay gap between women and men working in the same “job” (i.e., three-digit occupation within the same firm). Our estimates suggest that half of the adjusted gender pay gap results from segregation at the job level and the remainder from within-job pay disparities. From a policy point of view, these estimates support the need for regulatory attention both to segregation in hiring and job assignment and to within-job gender bias in pay practices.
We build on the work of the Comparative Organizational Inequalities Network, an interdisciplinary group of social scientists in multiple countries developing theory and methods for the analysis of linked employer-employee administrative data. The Comparative Organizational Inequalities Network has recently published a paper decomposing the gender earnings gap in 15 countries (Penner et al. 2023). The current article developed out of the U.S. estimates from that project and deepens those results with new information on multiple additional populations of formal economy workers and trends over time, while providing a much more extensive and dynamic interpretation. These U.S. estimates are based on employer-employee administrative data from the IRS combined with individual-level gender and age information from Social Security Administration records and occupation, education, and hours worked responses to the U.S. Census Bureau’s American Community Survey (ACS). The key limitation of these data is that the ACS is a 1 percent population sample, and when matched to IRS workplaces produces a sample biased toward larger workplaces.
We make three primary scientific contributions. The first is to produce estimates of the aggregate impact of firm, occupation, and “job” segregation on national gender pay disparities. The second is to document that the average degree of within-job pay disparity in the United States is surprisingly high. We observe these processes from 2005 to 2015 and so cannot comment on more recent trends, but our estimates suggest that in the absence of regulatory intervention and changes in employer behavior, the intensity of gender pay disparities will not inevitably diminish. Third, we interrogate the quality of available U.S. linked employer-employee data for examining gender and other employment disparities and engage a recent National Academies of Sciences, Engineering, and Medicine (2022) report advocating improved workplace pay data collection.
We follow Petersen and Morgan (1995) and Smith-Doerr et al. (2019) in conceptualizing pay disparities as a result of gender differences in firm, occupation, and job segregation and within-job processes. The within-job pay gap can be interpreted in this framework as an upper-bound estimate of the average level of gender discrimination as defined under the 1963 Equal Pay Act. This is not, however, a strictly legal notion of discrimination, for which employer bias must be demonstrated, but rather a social science conceptualization of aggregate bias processes, only some of which might be illegal. Segregation components of the gender pay gap also may result from employer bias in hiring, job assignment, promotion, and firing, although the legal basis for establishing discrimination is ambiguous because there is ample room for self-selection into firms, occupations, and jobs.
We first observe national gender pay gaps after adjusting for human capital and hours worked, which we bracket as premarket distinctions, and then use an ordinary least squares estimation strategy in which sequential models add fixed effects for firm, detailed occupation, and their cross-classification, which we treat as a proxy for jobs. This produces estimates of the relative contribution of the three forms of segregation, plus within-job pay disparities, to the total national average gender pay gap. Our data are quite robust for estimating firm and occupation segregation, but job and within-job estimates are more fragile because of the low ACS sampling rate.
We find that women are more likely than men to be employed in low earning firms, occupations, and especially jobs. Occupational and firm gender segregation are roughly equivalent in their contribution to pay disparities, suggesting that focusing on occupation alone, which is common in the scientific literature, misses the important role of employers in producing gender disparities. Empirically, it is jobs—the intersection of detailed occupation and firm in our analyses—that are the most powerful segregation context, consistent with previous estimates (e.g., Petersen and Morgan 1995; Smith-Doerr et al. 2019). Consistent with recent research on rising between-firm earnings inequalities affecting workers generally (Song et al. 2019), the firm segregation component of the gender gap is strengthening over time. Finally, we find that about half of the gender pay gap for both yearly and hourly earnings is found within jobs and that this proportion is relatively stable over our observation period. Because of ACS sampling, these firm and job estimates primarily describe pay gaps in larger firms, a consideration we return to in the discussion.
Data and Methods
Linking Employer-Employee Data
In these analyses we use earnings and employer information for each individual’s employment spell(s) from IRS form W-2 covering tax years 2005 to 2015. As submitted to the IRS, the W-2 form contains both the employee’s Social Security number (SSN) and an employer identification number (EIN). The EIN in most cases identifies a firm or a firm in a state (see discussion in Song et al. 2019). In the file available at the Census Bureau for research, personally identifying information such as SSNs and names is removed, with the Census Bureau assigning a unique, anonymous protected identification key (PIK) that enables linkages of records across data sources. Using PIKs, we match W-2 reports to the Social Security Administration’s 2016 Numerical Identification File, retrieving our measures of age and gender. Again using PIKs, we link individuals to their responses to the ACS, a 1 percent random sample of U.S. households. Importantly, some 99 percent of individuals in IRS records receive PIKs, while the ACS had a 94 percent PIK assignment rate, allowing a very high match rate between W-2 and ACS data.
In the matching process we first unduplicate EIN-PIK-year, taking the most recently dated form available. For individuals who work at multiple firms in a year, we focus on their highest earning W-2 report, selecting one at random in the very rare case of individuals with multiple equally well-compensated W-2s. We link individuals’ highest paid W-2 report to the concurrent ACS year; for example, W-2s from tax year 2015 are linked to respondents to the 2015 ACS. We were able to link 19.6 million total workers, yielding about 1.6 million to 2.0 million individuals per year, averaging very close to 1 percent per year of the W-2 earner population. In all linked data analyses, we use ACS sample weights. The median firm size among (weighted) matched respondents was 1,030, which was nearly identical to the median number of workers per EIN in the analytic administrative data set. The median number of workers per EIN and year linked to the ACS was 10.
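The two selection steps described above (deduplicating EIN-PIK-year, then keeping each worker's highest-earning W-2) can be sketched as follows. This is a minimal illustration, not the authors' code: the record layout and the field names `pik`, `ein`, `year`, `form_date`, and `earnings` are hypothetical, and the toy records are invented for the example.

```python
import random
from collections import defaultdict


def select_main_jobs(w2_records, seed=0):
    """Return one W-2 record per (pik, year): the highest-earning one.

    Field names are hypothetical; the logic follows the two steps in the text.
    """
    rng = random.Random(seed)  # seeded so the random tie-break is reproducible

    # Step 1: deduplicate EIN-PIK-year, keeping the most recently dated form.
    latest = {}
    for rec in w2_records:
        key = (rec["pik"], rec["ein"], rec["year"])
        if key not in latest or rec["form_date"] > latest[key]["form_date"]:
            latest[key] = rec

    # Step 2: group the surviving forms by person-year.
    by_person_year = defaultdict(list)
    for rec in latest.values():
        by_person_year[(rec["pik"], rec["year"])].append(rec)

    # Step 3: keep the highest-earning form; break exact ties at random.
    main = {}
    for key, recs in by_person_year.items():
        top = max(r["earnings"] for r in recs)
        candidates = [r for r in recs if r["earnings"] == top]
        main[key] = rng.choice(candidates)
    return main


# Invented toy records (ISO date strings compare correctly as text).
records = [
    {"pik": "A", "ein": "E1", "year": 2015, "form_date": "2016-01-31", "earnings": 40000.0},
    {"pik": "A", "ein": "E1", "year": 2015, "form_date": "2016-03-15", "earnings": 41000.0},
    {"pik": "A", "ein": "E2", "year": 2015, "form_date": "2016-01-31", "earnings": 5000.0},
    {"pik": "B", "ein": "E2", "year": 2015, "form_date": "2016-01-31", "earnings": 30000.0},
]
main_jobs = select_main_jobs(records)
```

In this toy example worker A keeps the later-dated form from employer E1, and the second job at E2 is dropped from the main-job sample.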
There is some evidence that sample construction through this matching process may influence estimates relative to the universe of reported W-2 earnings. For example, fitting the same basic model to both the W-2-only and the W-2-to-ACS matched samples adjusting only for age, age squared, and indicators for full-time and marginal earnings yields gender gap estimates 4 to 6 percentage points higher in the ACS matched sample (in which women earn 22 percent to 27 percent less than similar men depending on year) than in the full W-2 sample (18 percent to 23 percent less). This most likely reflects that the matching process is more likely to be successful for workers in larger firms and that the earnings increment associated with larger firm size is larger for men than for women (Hollister 2004).
There are other limitations to these data. First, the employer information is at the firm level, rather than the workplace (i.e., establishment) level. We lacked access to geographic information from form W-2 and so were unable to further stratify firms by region or state. Thus, workplace variation in the gender earnings gap is not observed separately for multiple work sites in multiestablishment firms. We know from prior research that this is likely to be a small source of error (Tomaskovic-Devey et al. 2020). Second, for the ACS variables we have the normal measurement error associated with self-reported occupation and hours worked but also additional ambiguity in computing hourly earnings for multiple job holders (Kim and Tamborini 2014; Perales 2014; Speer 2016).
Measures
Yearly Earnings
Our earnings concept is all federal taxable earnings in a calendar year. We take box 1 from form W-2, which reports total annual taxable Social Security earnings for each individual at a particular EIN, including salary, wages, and bonuses, but excluding deferred compensation. We adjust to real earnings in 2015 prices using the Consumer Price Index for All Urban Consumers. Using administrative data on earnings has multiple advantages over conventional survey data. The first, and by far most important, is that we can create employer-employee data. This makes it possible to observe gender pay gaps at the firm and job levels, the level at which hiring and pay decisions occur and equal opportunity legal rights are defined. Second, the earnings data are of very high quality and do not suffer from the large levels of misreporting and missing data in self-reported earnings (Kim and Tamborini 2012). There is very little measurement error in the earnings measure.
Hourly Earnings
We calculate hourly earnings as yearly earnings divided by ACS-reported weeks and hours worked. In the ACS normal hours worked and weeks worked pertain to the previous 12 months. We multiply hours worked by weeks worked (using interval midpoints for weeks worked) to obtain an estimate of the total annual number of hours worked. We divide total W-2 earnings by annual hours worked to arrive at our estimate of hourly earnings in a typical week, excluding a small fraction of individuals with hourly earnings less than $1 or more than $100. This measure is error prone to the extent that the individual worked multiple jobs in the past year. Workers who hold multiple jobs average 12 total hours a week more than single job holders, who average 39.7 hours (Hirsch, Husain, and Winters 2016). Thus, average hourly earnings will be depressed by about 30 percent (12 divided by 39.7) using the ACS hours worked measure for multiple job holders. As only 5 percent of workers hold multiple jobs and they are 2 percent more likely to be women than men, this source of bias will reduce the gender earnings gap on average by a trivial .003 percent. In addition, we match last year’s hours to this year’s earnings, which will introduce error at the individual level but is unlikely to bias aggregate estimates one way or the other.
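The hourly earnings construction can be illustrated with a short sketch. The interval labels and midpoints below are assumptions made for the example (they are not taken from the article), and the function name is hypothetical; only the division of W-2 earnings by hours times weeks and the $1 to $100 exclusion rule come from the text.

```python
# Assumed weeks-worked intervals and their midpoints (illustrative only).
WEEKS_MIDPOINT = {
    "50-52": 51.0,
    "48-49": 48.5,
    "40-47": 43.5,
    "27-39": 33.0,
    "14-26": 20.0,
    "1-13": 7.0,
}


def hourly_earnings(w2_earnings, usual_hours_per_week, weeks_interval):
    """Annual W-2 earnings divided by estimated annual hours.

    Returns None when hours are not positive or when the result falls
    outside the $1 to $100 range that the text excludes.
    """
    annual_hours = usual_hours_per_week * WEEKS_MIDPOINT[weeks_interval]
    if annual_hours <= 0:
        return None
    rate = w2_earnings / annual_hours
    if rate < 1.0 or rate > 100.0:
        return None
    return rate


typical = hourly_earnings(51000.0, 40, "50-52")      # 51,000 / (40 * 51)
too_low = hourly_earnings(500.0, 40, "50-52")        # below $1, excluded
too_high = hourly_earnings(400000.0, 40, "50-52")    # above $100, excluded
```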
Marginal Jobs
Some observed job spells have very low earnings and are most likely to be held by young workers (Spletzer and Handwerker 2014). The W-2 reports are annual summaries, but include jobs of very short duration. Such jobs represent up to 30 percent of hires in any quarter, but there are no average gender differences in marginal job employment (Hyatt and Spletzer 2017). In addition, it seems likely that some of these marginal jobs are associated with the fraudulent use of SSNs by employers or workers (Abowd, McKinney, and Zhao 2018). In every model we include a control for marginal jobs, defined as those earning less than the equivalent of the federal minimum wage × 10 hours × 52 weeks. In the U.S. W-2 population, 14 percent of jobs are marginal by this definition.
Full-Time
We define individuals as working full-time if their total nominal W-2 earnings surpassed the equivalent of working at the federal minimum wage in that year × 40 hours × 50 weeks (see Song et al. 2019 for a similar strategy using similar data). Analyses using ACS self-reported hours worked yield gender earnings gaps that are comparable with our estimates on the basis of this W-2 full-time earnings threshold. Individuals whose W-2 earnings did not reach the full-time earnings threshold worked on average 35 weeks over the previous 12 months and on average 985 total hours during this period. This contrasts with the average 49 weeks and 1,985 total hours of individuals whose W-2 earnings exceeded this threshold. We foreground our full-time imputation because it can be applied to both the full W-2 and ACS matched samples and does not suffer from the sampling issues associated with the ACS matched data.
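The two earnings thresholds (marginal jobs and full-time work) reduce to simple arithmetic on the federal minimum wage. The sketch below is illustrative: the function names are hypothetical, and the $7.25 figure is the federal minimum wage in effect in 2015, supplied here as an outside assumption rather than a number from the article.

```python
def marginal_threshold(min_wage):
    """Earnings below minimum wage x 10 hours x 52 weeks mark a marginal job."""
    return min_wage * 10 * 52


def full_time_threshold(min_wage):
    """Earnings above minimum wage x 40 hours x 50 weeks mark full-time work."""
    return min_wage * 40 * 50


def classify(earnings, min_wage):
    """Label one W-2 by the two thresholds ('part-time' is our own label)."""
    if earnings < marginal_threshold(min_wage):
        return "marginal"
    if earnings > full_time_threshold(min_wage):
        return "full-time"
    return "part-time"


MIN_WAGE_2015 = 7.25  # assumption: federal minimum wage in 2015

marginal_cutoff = marginal_threshold(MIN_WAGE_2015)
full_time_cutoff = full_time_threshold(MIN_WAGE_2015)
labels = [classify(e, MIN_WAGE_2015) for e in (2500.0, 9000.0, 52000.0)]
```

Under this assumption the 2015 cutoffs are $3,770 for marginal jobs and $14,500 for full-time earnings.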
Occupation
Self-reported occupations from the ACS are coded by trained Census Bureau coders into one of 520 three-digit categories from the Standard Occupational Classification system. Because occupation is reported by employees and then coded by Census Bureau workers, there is some measurement error relative to employer job titles as well as potential slippage when the most recent job is not the highest paid job in the past year.
Job
Our job concept is the intersection of detailed occupation and firm. For some firms, particularly larger ones, because of the potential for even finer detail in job distinctions at the workplace level, this measure may not match the actual job concept used by employers. For this reason, our estimates of within-job gender pay gaps may be higher than those based on job titles, at least for larger multisite firms. On the other hand, our measurement of jobs on the basis of detailed occupation is likely to be quite close to the concept of performing the same or very similar work. More concerning is that we observe job pay gaps only when a man and a woman work in the same detailed occupation in the firm. Thus, our observations at the job level are biased toward larger firms.
Age and Age Squared
We use Social Security Administration measures of age and its square to adjust for career stage in both the W-2-only and ACS samples.
Education
For the matched ACS sample, we have self-reports of employees’ education levels, which we measure as a series of indicator variables for five levels of education: less than high school, high school graduate or equivalent, some college or associate’s degree, bachelor’s degree, and graduate or professional degree.
Unobserved Covariates
We lack direct estimates of employee total labor force experience, firm tenure, and performance. The latter two are unlikely to bias national estimates. A meta-analysis of all published work shows no mean gender differences in performance evaluations (Joshi, Son, and Roh 2015). Similarly, during our observation period there are no mean gender differences in employee tenure with their current employer (Bureau of Labor Statistics 2020). There are, however, average gender differences in total labor force experience and these may be consequential for estimates (Foster et al. 2020).
Statistical Estimation
The unit of analysis is an employee-employer match, sometimes called a job spell, which we refer to as an individual or worker. We focus on two logged measures as dependent variables: yearly earnings and hourly earnings. As is conventional, regression coefficients on the gender indicator are interpreted as the proportional relative difference between average male and female earnings. More formally these estimates refer to the difference in relative geometric means for unlogged earnings (see discussion in Petersen 2017).
We focus on the relative impact of firm, occupation, and job segregation on these gender gap estimates, replicating our analysis for all job spells, for the highest paid job for workers with multiple jobs, and for full-time workers only. We further distinguish between all workers and those in the prime earning years, which we define as ages 30 to 55. For the W-2 population analyses we focus on the impact of firm segregation. For the ACS matched samples, we additionally explore occupation and job segregation.
Our core analyses focus on four sets of ordinary least squares regression models. The first model adjusts only for individual-level covariates and provides our baseline estimate of the overall gender pay gap. In subsequent models we compare only women and men who work in the same firm (model 2), only women and men who work in the same occupation (model 3), and only women and men who work in the same job (i.e., occupation-firm unit; model 4). We estimate these models separately by year, allowing us to examine trends in pay gap components. Comparing the results of these four models enables us to see the degree to which gender differences in pay in any given year are accounted for by sorting across occupations, firms, and occupation-firm units.
The equations estimated for these four models follow the same general form: logged earnings are regressed on the female indicator and the individual-level covariates, and the four specifications differ only in the fixed effects they include (baseline, firm, occupation, and firm by occupation).
To address concerns regarding the comparability of full- versus part-time workers, we consider full- versus part-time status as a defining characteristic of a job and include this axis in constructing fixed effects for all models. Thus, model 1 includes a fixed-effect term η for full- versus part-time status alone, and the fixed effects in models 2 through 4 are each crossed with this status.
Importantly, the analytic sample for each fixed-effects model is restricted to gender-integrated units. This is a necessary aspect of the estimation strategy, as we compare only men and women at risk for having different within-unit earnings. The coefficient on the female indicator, θ, is a different parameter in each model, pertaining to a different level: baseline, firm, occupation, and job.
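One way to write out the four specifications, consistent with the verbal description, is sketched here. The notation is an assumption made for illustration: only the symbols θ and η appear in the text, while $y_i$ (earnings), $F_i$ (female indicator), $X_i$ (age, age squared, education, and marginal job indicators), and the indices $p$ (full- versus part-time status), $f$ (firm), and $o$ (occupation) are introduced here.

```latex
\begin{align}
\ln y_{i} &= \theta_{1} F_{i} + X_{i}\beta_{1} + \eta_{p(i)} + \varepsilon_{i}
  && \text{(model 1: baseline)} \\
\ln y_{i} &= \theta_{2} F_{i} + X_{i}\beta_{2} + \eta_{f(i)\,p(i)} + \varepsilon_{i}
  && \text{(model 2: firm)} \\
\ln y_{i} &= \theta_{3} F_{i} + X_{i}\beta_{3} + \eta_{o(i)\,p(i)} + \varepsilon_{i}
  && \text{(model 3: occupation)} \\
\ln y_{i} &= \theta_{4} F_{i} + X_{i}\beta_{4} + \eta_{f(i)\,o(i)\,p(i)} + \varepsilon_{i}
  && \text{(model 4: job)}
\end{align}
```

In this reading, each model is fit separately by year, and the difference between $\theta_{1}$ and $\theta_{4}$ is the part of the baseline gap associated with sorting across jobs.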
To compare the pay of women and men in the same occupation and firm, it is important to have good coverage of employees within firms. The W-2 sample, which includes nearly all individuals in the workforce, provides such coverage. The matched ACS to W-2 sample observes only 10 workers in the median firm. Thus, we must be concerned about sparseness created by ACS sampling. For example, for W-2 population estimates, restricting the sample to gender-integrated firms reduces the sample by only 4 percent, but in the ACS matched sample, sample size is reduced between 30 percent and 40 percent. At the job level the sample is further reduced by some 35 percentage points. Some of this represents actual segregation between firms and jobs, but most reflects sampling constraints introduced by the ACS match.
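The restriction to gender-integrated units and the resulting within-unit comparison can be sketched with a small within-group (demeaning) estimator. This is a simplified illustration under stated assumptions: it omits the covariates and ACS sample weights used in the article, the field names are hypothetical, and the six toy workers are invented.

```python
from collections import defaultdict


def within_unit_gap(rows, unit_key):
    """Female coefficient from log pay on a female indicator with unit
    fixed effects, estimated by within-unit demeaning.

    Units that are all male or all female are dropped first. Returns the
    coefficient and the share of rows retained after that restriction.
    """
    groups = defaultdict(list)
    for r in rows:
        groups[unit_key(r)].append(r)

    # Keep only gender-integrated units (both values of `female` present).
    integrated = [g for g in groups.values() if len({r["female"] for r in g}) == 2]
    if not integrated:
        raise ValueError("no gender-integrated units in the sample")

    num = den = 0.0
    kept = 0
    for g in integrated:
        mean_f = sum(r["female"] for r in g) / len(g)
        mean_y = sum(r["log_pay"] for r in g) / len(g)
        for r in g:
            num += (r["female"] - mean_f) * (r["log_pay"] - mean_y)
            den += (r["female"] - mean_f) ** 2
        kept += len(g)
    return num / den, kept / len(rows)


# Invented toy data: two firms, two occupations.
workers = [
    {"firm": "A", "occ": "eng", "female": 0, "log_pay": 4.0},
    {"firm": "A", "occ": "eng", "female": 1, "log_pay": 3.9},
    {"firm": "A", "occ": "clerk", "female": 1, "log_pay": 3.0},
    {"firm": "B", "occ": "eng", "female": 0, "log_pay": 4.2},
    {"firm": "B", "occ": "clerk", "female": 0, "log_pay": 3.3},
    {"firm": "B", "occ": "clerk", "female": 1, "log_pay": 3.2},
]

firm_gap, firm_kept = within_unit_gap(workers, lambda r: r["firm"])
job_gap, job_kept = within_unit_gap(workers, lambda r: (r["firm"], r["occ"]))
```

In the toy data the within-job gap is much smaller than the within-firm gap, and the job-level restriction discards a third of the rows, which mirrors the sparseness concern raised above.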
Given the central limit theorem, sampling errors should be randomly distributed, and so we proceed with these sparse estimations of national gender gaps and component decompositions. Still, we run some risk of overestimating firm segregation components of the gender gap in the ACS hourly earnings analyses. The firm component of the gender gap averages 5 percent larger in the ACS than the W-2 estimations. The within-firm decomposition of job versus firm components will tend to be dominated by larger firms with multiple person observations at the job level. We return to issues of sampling coverage in the discussion.
Results
We begin in Table 1 by documenting the gross gender pay gap for logged yearly and hourly earnings for various population definitions in 2015, our most recent estimate year. The first column shows estimates for the total population of employment relationships in the United States reported to the IRS. All jobs and all workers are our most inclusive population, numbering 235.3 million, while restricting to the highest paid “main” job reduces the sample to 160.2 million. The bottom two rows restrict the samples to prime-age workers.
Table 1. Coefficients on Female Indicator Variable in Earnings Models Net of Controls for Age, Age Squared, and Marginal Earnings Indicator in 2015: Varying Samples and Earnings Measures.
Percentage gaps are derived from these coefficients using the conventional approximation 100 × (exp(b) − 1), where b is the coefficient on the female indicator.
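The conversion from a log-earnings coefficient to a percentage gap can be checked with a few lines of code. The function name is hypothetical; the two coefficients are the 2005 within-occupation and within-firm values reported in the discussion of Table 2, which the text gives as 20.9 percent and 22.5 percent.

```python
import math


def pct_gap(coef):
    """Convert a coefficient on the female indicator in a log-earnings
    model into a percentage difference: 100 * (exp(coef) - 1)."""
    return 100.0 * (math.exp(coef) - 1.0)


within_occupation_2005 = pct_gap(-0.234)
within_firm_2005 = pct_gap(-0.255)
```

Both values are negative, meaning that women earn less than comparable men; their magnitudes round to the percentages quoted in the text.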
Figure 1 reports trends in earnings differentials over time for the W-2-only sample for each of these populations and adds estimates of the gender pay gap net of a firm fixed effect. The general pattern is that the gender pay gap for all populations was declining until 2009, most dramatically for samples restricted to the primary job. After 2009 the gender gap rises for most populations, but is relatively flat among prime-age workers when second jobs are included, suggesting that women added more second jobs than men during and after the Great Recession. For each population, introducing firm fixed effects reduces the gender earnings gap, indicating that between-firm segregation is an important source of gender earnings inequalities. This firm segregation effect is most dramatic for the main job. Furthermore, although overall gender gaps tend to worsen from 2009 to 2015, the within-firm gender gaps all decrease, confirming the increasing importance of between-firm segregation to the overall gap over time.

Figure 1. Estimated gender log yearly earnings gaps, varying W-2-only samples.
Figure 1 shows trends corresponding to Table 1’s first column. Figure A1 shows time series corresponding to Table 1’s second column, including a control for part-time versus full-time earnings. The results also show a worsening overall gap with a decreasing within-firm gap between 2009 and 2015, indicating the increasing importance of between-firm segregation even after accounting for full-time earnings level. All regression results for the W-2-only samples are reported in Table A1.
We now move to our more fully specified segregation analyses, focused on four sets of ordinary least squares regression models fit to the W-2-ACS matched sample. The first model provides our baseline estimate of the overall gender pay gap net of age, age squared, education, full- versus part-time job, and marginal job status. Thus, our segregation analysis begins after adjustment for individual characteristics and two proxies for hours worked. In subsequent models, we compare only women and men who work in the same firm (model 2), only women and men who work in the same occupation (model 3), and only women and men who work in the same job (i.e., occupation-firm unit; model 4). Comparing the results of these four models enables us to see the degree to which average gender differences in pay in any given year are explained by sorting across firms, occupations, and occupation-firm units. We estimate these models separately by year.
Table 2 presents the basic results of the fixed-effects analyses limited to the main job of workers 30 to 55 years old. Both firm and occupational segregation are important sources of the gender pay gap. In 2005 the residual pay gap is marginally smaller for occupation (−0.234 or 20.9 percent) than for firm (−0.255 or 22.5 percent), suggesting that occupational segregation is a slightly more important source of gender pay disparities than firm segregation. This pattern is the same across all years, although the residual gender pay gap within firms drops further and continuously across the time period. Within-occupation gender pay gaps decline from 2005 to about 2013, with a slight rise thereafter. Within-job (firm-by-occupation) pay gaps are considerably smaller and tend to be about half the magnitude of the baseline pay gap.
Table 2. Trends in U.S. Gender Yearly Earnings Gaps for Workers Aged 30 to 55 Years, Primary Job, Controlling for Age, Age Squared, Education, Full-Time Earnings Threshold, and Marginal Job Indicators: Without (Baseline) and with Fixed Effects for Firm, Occupation, and Firm by Occupation.
Figure 2 displays the same information as Table 2 but instead highlights the reduction in the baseline pay gap associated with including the successive fixed effects as a percentage of the baseline gender pay gap. This serves as our indicator of the extent to which between-context (job, firm, occupation) segregation contributes to the baseline gap. In Figure 2 we see that within-job pay differences and between job segregation are roughly equivalent in importance through 2010, but thereafter job segregation becomes a marginally larger component of the total gender earnings gap. Occupational segregation alone explains about 30 percent of the gender earnings gap, rising slowly over time. Firm segregation explains less, dropping to a low of 20 percent during the Great Recession, but rises dramatically thereafter, with firm contributions to pay gaps almost converging with occupation by 2014.
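The share of the baseline gap associated with a given form of segregation can be sketched as the proportional reduction in the female coefficient once the corresponding fixed effects are added. The coefficients below are purely hypothetical round numbers, not values from Table 2, and computing the share on the log-coefficient scale (rather than on converted percentages) is an assumption of this sketch.

```python
def share_explained(baseline_coef, within_coef):
    """Percentage of the baseline gap removed by adding fixed effects:
    100 * (1 - within / baseline)."""
    return 100.0 * (1.0 - within_coef / baseline_coef)


# Hypothetical coefficients on the female indicator (log points).
baseline = -0.30      # model 1: individual covariates only
within_job = -0.15    # model 4: firm-by-occupation fixed effects

between_job_share = share_explained(baseline, within_job)
within_job_share = 100.0 - between_job_share
```

With these illustrative inputs, half of the baseline gap is associated with between-job segregation and half remains within jobs.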

Figure 2. Percentage of the baseline gender yearly earnings gap associated with between-firm, between-occupation, and between-job (firm-by-occupation) pay differences, with remainder within jobs for prime-age workers, 2005 to 2015.
Table 3 shows the hourly earnings results for workers 30 to 55 years of age in their main jobs. Again, we find that gender segregation between high- and low-earnings firms and occupations produces substantial portions of the gender pay gap. Both the firm and occupation segregation effects grow over time, although the firm component grows at a faster pace. Within-job (occupations within firms) earnings gaps range between 7.3 percent and 8.5 percent, showing no dramatic changes over time. Figure 3 reports the reduction in the baseline gap following inclusion of job, firm, and occupation fixed effects as a percentage of the baseline gender pay gap. About half of the gender earnings gap is produced by the intersection of firm and occupation (i.e., job segregation). The remaining half is between men and women in the same firm working in the same detailed occupational titles. Job segregation is an increasing source of the gender earnings gap during the Great Recession. As with annual earnings, the between-firm component of the overall gap rises significantly between 2009 and 2015 for hourly earnings.
Table 3. Trends in U.S. Gender Hourly Earnings Gaps for Workers Aged 30 to 55 Years, Primary Job, Controlling for Age, Age Squared, Education, Full-Time Earnings Threshold, and Marginal Job Indicators: Without (Baseline) and with Fixed Effects for Firm, Occupation, and Firm by Occupation.

Figure 3. Percentage of the baseline gender hourly earnings gap associated with between-firm, between-occupation, and between-job (firm-by-occupation) pay differences, with remainder within jobs for prime-age workers, 2005 to 2015.
The Appendix includes analyses fit to the wider sample of workers aged 16 and older. Tables A2 and A3 can be compared with Tables 2 and 3, respectively, and show that the estimated gaps are smaller for the larger population. Figures A2 and A3, corresponding to Figures 2 and 3, display the same general trends in the proportion of the gap explained by firm, occupational, and job segregation over time.
Discussion and Conclusion
Examining linked employer-employee data that locate workers within firms as well as within occupations suggests that firm and occupational segregation are both important sources of U.S. average gender pay gaps and that firm segregation has become increasingly important over time. The significance of between-firm segregation is consistent with research showing that earnings inequality growth in the United States more generally is primarily a between-firm phenomenon (Song et al. 2019). The within-job gender pay gap is about half of the baseline gender pay gap for both yearly and hourly earnings, with the other half associated with job-level segregation.
These estimates are broadly consistent with recent high-quality research on the gender pay gap (e.g., Barth et al. 2021; Blau and Kahn 2017; Foster et al. 2020). One area of concern relative to that literature is that we do not have individual-level measures of cumulative labor force experience in our models. In the United States, we know that the rise of very long work hours has increased the gender pay gap (Cha and Weeden 2014). The difference between the size of the within-job earnings gap for yearly and hourly earnings confirms that gender differences in hours worked are a major driver of U.S. gender pay disparities even within the same job.
There is considerable evidence in these analyses that within-job gender pay gaps are quite high in the contemporary United States. The hourly earnings estimate of within-job pay gaps hovers around 8 percent, while the yearly earnings gap averages around 14 percent. Given the sampling limitations associated with the ACS match, this conclusion holds most clearly for larger firms, at which we are more likely to observe gender-integrated jobs. Because we know that larger firms tend to pay higher earnings but that women receive less of a pay premium in larger firms (Hollister 2004), it seems likely that our matched W-2-to-ACS sample produces larger average within-firm pay gaps than in the population of all jobs. Larger firms are also more likely to have gender-integrated jobs (Tomaskovic-Devey, Kalleberg, and Marsden 1996); thus, our matched sample is also likely to overestimate within-job pay gap components relative to within-firm occupational segregation. The sizes of these overestimations are a matter of speculation, but we suspect that they are not so large as to challenge our comparisons with earlier estimates. Relative to actual employer job titles, these are almost certainly overestimates of within-job pay gaps, but again we do not know whether the magnitude is small or large.
Compared with past U.S. within-job gender gap estimates, our estimate is considerably smaller than the 16.2 percent 1990 lower-bound estimate (Bayard et al. 2003) and considerably larger than the 1980-era estimates of 0 percent to 3 percent (Groshen 1991; Petersen and Morgan 1995). All three studies use very large sample linked employer-employee data but diverge on other dimensions. Like the 1990 estimate, our measure of earnings is relatively inclusive: overtime, shift differentials, and bonuses are all included. The earlier papers used measures of contractual hourly or weekly pay and lack these various forms of supplemental wages. Thus, our higher within-job estimate may reflect that men have more access to within-job wage supplements of various types. We also know that high-performance work, merit pay, and bonus pay practices have all become more prevalent since 1980 (Cappelli 1999) and that these practices have been associated with rising gender earnings inequalities (Castilla 2012; Davies, McNabb, and Whitfield 2015; Drolet 2002; Elvira and Graham 2002). We suspect that the larger within-job gender pay gap in our estimates relative to 1980 estimates may reflect our potentially less precise job measure, the more inclusive earnings measure, and the rising use of various forms of supplemental pay. It is also plausible that, as the human capital differences between men and women shrank and occupational segregation declined, within-job gender distinctions became more salient and within-job bias processes grew. Reskin (1988) and Tilly (1998) both made the prediction that when inequality-installing mechanisms decline in legitimacy or effectiveness, others may emerge to reinstall inequalities.
Clearly, with better data we could improve estimates of the relative size of segregation and within-job components. Administrative data hold the most promise in this regard. Estimates would be enhanced considerably if the occupation self-reports in IRS annual tax filings were merged with W-2 records so that we need not rely on relatively sparse within-workplace sample data for job analyses. Even better would be if W-2 filings by employers included occupation codes, as is common in many other countries (Penner et al. 2023). Most ambitiously, we concur with a recent conclusion from the National Academies of Sciences, Engineering, and Medicine (2022) recommending that future pay data collection by the Equal Employment Opportunity Commission collect individual-level pay data; sex, race, and ethnicity information; job titles; and precise measures of hours and weeks worked as well as firm-specific tenure. Such data would go a long way toward improving both scientific estimations and regulatory analyses of firm-level processes.
Establishing an average gender (or between race or ethnic group) national pay gap is only a beginning. Occupations vary a great deal in the sizes of their gender pay gaps, with some displaying very large gaps and others near gender equality (Foster et al. 2020). We suspect that this is true at the firm and job levels as well. Studies using administrative data from Portugal, Germany, France, and Norway show large gender pay gap variation among workplaces (Abowd, Kramarz, and Roux 2006; Barth and Dale-Olsen 2009; Card et al. 2016; Tomaskovic-Devey and Avent-Holt 2019). Linked employer-employee data for seven U.S. federal government science agencies show agency variation in the levels of gender pay disparity, the mechanisms that produce them, and trends over time (Smith-Doerr et al. 2019). Examining workplace heterogeneity in the extent of pay disparities and the mechanisms producing them is an obvious next step if we are to better understand the generation of gendered (or race/ethnic) earnings inequalities.
Linked employer-employee data have many applications beyond studying gender inequalities. For example, they can be used to decompose earnings inequalities into individual and workplace components (Song et al. 2019), explore workplace variation in immigrant-native inequalities (Tomaskovic-Devey, Hällsten, and Avent-Holt 2015), demonstrate the role of networks in career mobility (Collet and Hedström 2013), and show firm variation in scheduling practices (Storer, Schneider, and Harknett 2020), to name just a few applications. Linked employer-employee data may provide the means to finally fulfill Baron and Bielby’s (1980) call to “bring the firm back in.”
From the point of view of regulatory targeting of firms for segregation or within-job disparities, it is this firm-level variation, rather than the national mean gender pay gap, that is of primary concern. It seems quite likely given our estimates that in most firms, job segregation is the more important source of gender pay gaps but also that both job segregation and within-job pay gaps will vary greatly from firm to firm. Since the Equal Pay Act and the Civil Rights Act prohibit both employment segregation and job-level pay discrimination, it is long past time for both social scientists and equal opportunity regulators to develop and take advantage of workplace-level data.
England et al. (2020) documented the stalled progress toward gender equality in the United States and identified three necessary policy interventions to move toward a more gender-equal society: increased access to publicly supported child care and men’s participation in household work on the supply side and reduced discrimination by employers on the demand side. We show for the first time that within-job gender earnings gaps are stably high. It is also well documented in prior research that both occupational and within-firm gender segregation have been stably high in recent decades (Stainback and Tomaskovic-Devey 2012; Zhu and Grusky 2022). Although our results are silent on the supply side, they confirm the need for changes in employer hiring and pay practices if the United States is to move toward a more gender-egalitarian society.
Supplemental Material
Supplemental material, sj-docx-1-srd-10.1177_23780231231157678 for Estimating Firm-, Occupation-, and Job-Level Gender Pay Gaps with U.S. Linked Employer-Employee Population Data, 2005 to 2015 by Joseph King, Matthew Mendoza, Andrew Penner, Anthony Rainey and Donald Tomaskovic-Devey in Socius
Footnotes
Appendix
Table A3. Trends in U.S. Gender Hourly Earnings Gaps for Workers Aged 16 and Older, Primary Job, Controlling for Age, Age Squared, Education, Full-Time Earnings Threshold, and Marginal Job Indicators: Without (Baseline) and with Fixed Effects for Firm, Occupation, and Firm by Occupation.
| | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Baseline model | | | | | | | | | | | |
| Coefficient | −.136 | −.138 | −.134 | −.129 | −.118 | −.138 | −.131 | −.130 | −.127 | −.129 | −.132 |
| Standard error | (.110) | (.108) | (.111) | (.119) | (.120) | (.110) | (.112) | (.112) | (.111) | (.111) | (.111) |
| N | 1,612,000 | 1,696,000 | 1,655,000 | 1,682,000 | 1,597,000 | 1,584,000 | 1,721,000 | 1,965,000 | 1,871,000 | 1,971,000 | 1,975,000 |
| R² | .338 | .342 | .356 | .388 | .400 | .379 | .368 | .363 | .361 | .362 | .363 |
| Firm fixed effects | | | | | | | | | | | |
| Coefficient | −.125 | −.124 | −.122 | −.121 | −.114 | −.120 | −.105 | −.105 | −.102 | −.103 | −.104 |
| Standard error | (.003) | (.003) | (.003) | (.003) | (.003) | (.003) | (.003) | (.003) | (.003) | (.003) | (.003) |
| N | 988,100 | 1,051,000 | 1,023,000 | 1,044,000 | 991,200 | 985,400 | 1,084,000 | 1,262,000 | 1,199,000 | 1,271,000 | 1,275,000 |
| R² | .514 | .517 | .526 | .557 | .565 | .547 | .549 | .537 | .535 | .536 | .538 |
| Occupation fixed effects | | | | | | | | | | | |
| Coefficient | −.108 | −.110 | −.107 | −.104 | −.095 | −.107 | −.094 | −.094 | −.094 | −.094 | −.099 |
| Standard error | (.009) | (.009) | (.009) | (.009) | (.009) | (.009) | (.009) | (.009) | (.008) | (.009) | (.008) |
| N | 1,612,000 | 1,696,000 | 1,655,000 | 1,682,000 | 1,597,000 | 1,584,000 | 1,720,000 | 1,964,000 | 1,871,000 | 1,970,000 | 1,974,000 |
| R² | .415 | .421 | .432 | .462 | .473 | .449 | .440 | .434 | .428 | .429 | .432 |
| Firm-by-occupation fixed effects | | | | | | | | | | | |
| Coefficient | −.081 | −.081 | −.080 | −.084 | −.075 | −.082 | −.076 | −.075 | −.076 | −.073 | −.079 |
| Standard error | (.003) | (.003) | (.003) | (.003) | (.003) | (.003) | (.003) | (.003) | (.003) | (.002) | (.003) |
| N | 423,300 | 453,800 | 443,600 | 449,900 | 435,500 | 434,200 | 464,600 | 551,100 | 510,500 | 538,200 | 544,800 |
| R² | .608 | .614 | .621 | .653 | .664 | .645 | .652 | .643 | .641 | .641 | .644 |
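As a worked illustration of how the coefficients in this appendix table translate into the "percentage of the baseline gap" measure plotted in the figures, the following sketch computes the reduction in the baseline coefficient after each set of fixed effects is added. The coefficients are copied from the table above (workers aged 16 and older); the function and variable names are our own. Because the fixed-effects models are fit to somewhat different samples than the baseline, these shares should be read as approximate.

```python
# Gender coefficients copied from the appendix table (ages 16 and older).
years = list(range(2005, 2016))
baseline = [-.136, -.138, -.134, -.129, -.118, -.138, -.131, -.130, -.127, -.129, -.132]
firm_fe = [-.125, -.124, -.122, -.121, -.114, -.120, -.105, -.105, -.102, -.103, -.104]
occ_fe = [-.108, -.110, -.107, -.104, -.095, -.107, -.094, -.094, -.094, -.094, -.099]
job_fe = [-.081, -.081, -.080, -.084, -.075, -.082, -.076, -.075, -.076, -.073, -.079]


def pct_reduction(base, adjusted):
    """Percentage of the baseline gap removed once fixed effects are added."""
    return [100.0 * (b - a) / b for b, a in zip(base, adjusted)]


firm_share = pct_reduction(baseline, firm_fe)   # between-firm component
occ_share = pct_reduction(baseline, occ_fe)     # between-occupation component
job_share = pct_reduction(baseline, job_fe)     # between-job component
within_job_share = [100.0 - s for s in job_share]

for y, f, o, j, w in zip(years, firm_share, occ_share, job_share, within_job_share):
    print(f"{y}: firm {f:5.1f}%  occupation {o:5.1f}%  job {j:5.1f}%  within-job {w:5.1f}%")
```

The firm, occupation, and job shares are not additive, since the job fixed effects absorb both firm and occupation differences.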
Acknowledgements
This article is released to inform interested parties of research and to encourage discussion. The views expressed are those of the authors and not necessarily those of the Census Bureau. Tabular materials presented in this article were approved for release by the Census Bureau’s Disclosure Review Board (CBDRB-FY18-258). This article has benefited from the comments of Karen Brummond, Aleksandra Kanjuo-Mrčela, Alena Křížková, and anonymous reviewers from this and other journals.
Data Availability
All data available to evaluate this article are included in the article and its online supplement. The source data used to create the estimates presented in this article are highly confidential and can be accessed only by Census Bureau employees with permission from the IRS. Generic replication code is included in the online supplement.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the National Science Foundation (awards 0525831 and 1852756) and the Alexander von Humboldt Foundation (AR8227).
1
Specifically, Table 1 is entirely novel; Figure 1 contains results for four population definitions, only one of which appears in an online supplement in prior work (Penner et al. 2023, Table S21); and the hourly earnings trends in Table 3 and Figure 3 are novel (prior work reported only the results for a single year). Table 2 presents results previously reported in an online supplement (Penner et al. 2023, Table S20 and Figure S18). Figures A1 to A3 and Tables A1 to A3 are all new.
2
We calculate this 17.4 percent as Blau and Kahn’s (2017:798–99) reported 2010 unadjusted Panel Study of Income Dynamics gender hourly wage gap of 20.7 percent reduced by the 15.9 percent attributed to gender differences in experience in their decomposition (20.7 − [0.159 × 20.7] = 17.4), experience being the closest adjustment to our age adjustment.
3
These series suggest less of a role for between-firm segregation in explaining the earnings gap once full-time earnings are accounted for. This would be expected if firm segregation proceeds on the basis of earnings levels or hours worked: high-paid women work with high-paid women, while low-paid women work with low-paid women.
