Abstract
Debates on how sex, gender, and sexual identity relate to intimate partner violence (IPV) are longstanding. Yet the role that measurement plays in how we understand the distribution of IPV has been understudied. We investigated whether people respond differently to IPV items by sex and sexual identity and the implications this has for understanding differences in IPV burdens. Our sample was 2,412 randomly selected residents of Toronto, Canada, from the Neighborhood Effects on Health and Well-being (NEHW) study. IPV was measured using short forms of the Physical and Nonphysical Partner Abuse Scales (20 items). We evaluated the psychometric properties of this measure by sex and sexual identity. We examined whether experiences of IPV differed by sex and sexual identity (accounting for age and neighborhood clustering) and the impacts of accounting for latent structure and measurement variance. We identified differential item functioning by sex for six items, mostly related to nonphysical IPV (e.g., partner jealousy). Males had higher probabilities of reporting five of the six items compared to females with the same latent IPV scores. Being female and identifying as lesbian, gay, or bisexual were positively associated with experiencing IPV. However, the association between female sex and IPV was underestimated when response bias was not accounted for and outcomes were dichotomized as “any IPV.” Common practices of assuming measurement invariance and dichotomizing IPV can underestimate the association between sex or gender and IPV. Researchers should continue to attend to gender-based and intersectional differences in IPV but test for measurement invariance prior to comparing groups and analyze scale (as opposed to binary) measures to account for chronicity or intensity.
Introduction
Effective prevention of intimate partner violence (IPV) requires reliable measurement of this violence (Craig et al., 2008). Yet, how to measure and interpret the burden of IPV has been the focus of decades-long debates. Measurement controversies include: the validity of different data sources (e.g., crime data vs. general surveys); measuring discrete incidents as opposed to patterns of violence; focusing on physical or sexual violence to the exclusion of psychological violence; analyzing types of IPV separately; and whether and how intent, consequence, or severity of violence should be incorporated into measurement (Heise et al., 2019; Holtzworth-Munroe & Meehan, 2004; Johnson, 1995; Walby et al., 2017; Yakubovich et al., 2019). The reporting of IPV may also be biased, so that regardless of actual experiences of IPV, internalized factors (e.g., normative expectations, socialized roles) influence people’s interpretation of their experiences, their willingness to report, and ultimately how they respond to different measures (Jewkes et al., 2015). In turn, theoretical debates persist around whether and how IPV is gendered, including the extent of, and differentiation in, burdens of IPV by sexual or gender identity (Johnson, 2011; Kimmel, 2002; Peitzmeier et al., 2020; Straus, 2010). These measurement and theoretical issues are inextricably linked because different measurements and operationalizations of IPV produce different understandings of the gendered (or not) nature of this violence. This study seeks to demonstrate this linkage and advance a way forward in these debates using a novel application of psychometric methods to a unique Canadian dataset on IPV.
Canada offers a useful case study for these debates, showing how different ways of operationalizing IPV can meaningfully impact results across the field (Dragiewicz & Dekeseredy, 2012). National statistics have historically relied on dichotomized (yes/no) items of physical and sexual IPV from the Conflict Tactics Scale (Straus et al., 1996) and dichotomized items of financial and psychological IPV (unvalidated scale) (Burczycka & Ibrahim, 2016). Analyses of these data have tended to dichotomously operationalize IPV (any IPV), finding symmetric prevalence among women and men (Lysova et al., 2019) and a higher prevalence of IPV among lesbian, gay, or bisexual-identified people compared to heterosexual-identified people (Burczycka & Ibrahim, 2016). In contrast, analyses of these same data that accounted for the severity or patterns of violence, either by using observed data on counts or impacts (Lysova et al., 2019; Romans et al., 2007) or estimating underlying or latent patterns of IPV (Ansara & Hindin, 2010), have found that women experienced more intense and chronic patterns of IPV compared to men. These analyses, however, have not considered sexual identity nor whether the groups being compared have systematically different responses to IPV measures. The Canadian example thus raises critical questions around measurement and analysis methods that are commonly practiced in IPV research and beyond (Potter et al., 2020; Yakubovich et al., 2019). This includes how failing to attend to gender-based and intersectional differences in the chronicity or intensity of violence, as well as willingness to report experiencing different types of violence, may distort our understanding of IPV burdens.
Psychometric analyses can inform best practices in IPV measurement and operationalization but have been underused in IPV research (Ansara & Hindin, 2010; Martin-Fernandez et al., 2019; Wareham et al., 2021; Yount et al., 2014a, 2014b). Valid between-group comparisons (e.g., by sex) in IPV burdens require that there are no systematic differences in the ways that people are responding to the measure; the same construct has to be measured in the same way among groups before we can validly compare their scores on that construct (Vandenberg & Lance, 2000). When such measurement invariance does not hold, it means that observed differences may be due to differences in the ways that people have interpreted or responded to the items (e.g., due to normative expectations or socialized roles) as opposed to real differences in experiences of violence between groups. In other words, in such cases, measurement bias confounds true group differences in latent scores.
Using psychometric analyses, we can consider overall measurement invariance based on whether items are measuring distinct dimensions of IPV and contributing to the latent construct(s) of IPV to a similar degree across groups (Vandenberg & Lance, 2000). We can also examine whether people from different groups with the same latent IPV scores respond in different ways to the items, including on an item-by-item basis—this is known as differential item functioning (DIF) and is often not considered in scale validation (Martinkova et al., 2017; Shealy & Stout, 1993). Evidence of DIF demonstrates that people are responding differently to an indicator of IPV due to some group characteristic(s) beyond real differences in the underlying construct of IPV (i.e., conditional on their latent scores). For instance, a classic example of DIF is that women are more likely to endorse having “crying spells” compared to men, independent of their underlying depression scores (e.g., due to cultural norms around the acceptability of crying among men) (Teresi et al., 2008). Not accounting for this DIF (i.e., that crying is easier for women to endorse than men with the same depression scores) in analysis will falsely underestimate men’s true levels of depression compared to women.
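To make the idea of uniform DIF concrete, the following minimal sketch computes endorsement probabilities for one item under a two-parameter logistic model for two groups that share a discrimination but differ in difficulty. The function names and all parameter values are hypothetical and chosen only for illustration; they are not estimates from any study.

```python
import math


def endorsement_probability(theta: float, discrimination: float, difficulty: float) -> float:
    """Two-parameter logistic probability of endorsing an item at latent score theta."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# Hypothetical parameters for one item (illustrative values only).
DISCRIMINATION = 1.5      # shared by both groups, so any DIF is uniform
DIFFICULTY_GROUP_A = 0.5  # group A endorses the item at lower latent scores
DIFFICULTY_GROUP_B = 1.0  # group B needs a higher latent score to endorse it


def dif_gap(theta: float) -> float:
    """Difference in endorsement probability between groups at the same latent score."""
    p_a = endorsement_probability(theta, DISCRIMINATION, DIFFICULTY_GROUP_A)
    p_b = endorsement_probability(theta, DISCRIMINATION, DIFFICULTY_GROUP_B)
    return p_a - p_b


if __name__ == "__main__":
    for theta in (-1.0, 0.0, 1.0, 2.0):
        print(f"theta={theta:+.1f}  gap in P(endorse) = {dif_gap(theta):.3f}")
```

Because the gap is positive at every latent score, a comparison of raw endorsement rates would attribute to group A a higher level of the construct than it actually has, which is the bias that DIF testing is meant to detect.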
Examining the measurement of IPV experiences by both gender and sexual identity is an important line of inquiry for future epidemiologic research on IPV given the evidence for the role of sex and gender in the severity of violence, as well as in the interpretation and manifestation of relational and personal experiences (e.g., via gender norms) (Dragiewicz & Dekeseredy, 2012; Kimmel, 2002; Romans et al., 2007; Whitehead et al., 2020; Yakubovich et al., 2019). Yet, to our knowledge, there has been no such investigation in the field using robust psychometric methods. We therefore aimed to investigate the latent structure of IPV and measurement invariance by gender (using sex as a proxy) and sexual identity, using a random sample from a diverse Canadian urban center, and draw implications for how we measure and analyze social inequities in this violence.
Methods
We used data from the Neighborhood Effects on Health and Well-being (NEHW) study in Toronto (Canada’s largest city) (O’Campo et al., 2015). The study systematically randomly sampled 50 of Toronto’s 140 neighborhood planning areas and then randomly selected two census tracts from each of the 50 areas. Finally, households (based on residential address) were randomly selected and screened within each census tract (mean [M] = 27 residents per census tract). One resident per household was selected and screened based on the following inclusion criteria: aged 25-64 years, able to communicate in English, and lived in the census tract for at least six months. The final sample included 2,412 residents (response rate = 77%). Face-to-face interviews were conducted between March 2009 and June 2011. Participants provided written informed consent. The St Michael’s Hospital Research Ethics Board provided ethical approval for this study.
IPV
Participants completed the Hurt/Insult/Threaten/Scream (HITS) screening tool for IPV within the last 10 years. Participants indicated whether they had a partner in the previous 10 years who had physically hurt, insulted or talked down to, threatened to harm, or screamed or cursed at them plus an added fifth item of whether a partner had restricted their actions. Those who responded affirmatively to any of the five items (37%) completed abbreviated versions of the Partner Abuse Scales (Attala et al., 1994) covering physical (e.g., my partner pushes and shoves me around violently), psychological (e.g., my partner belittles me), and sexual violence (e.g., my partner physically forces me to have sex) from a current or former partner (all 20 items summarized in the results section). Participants indicated the frequency of experiencing each item on a 6-point Likert scale (0 = never to 5 = all the time).
Covariates
Participants indicated their sex assigned at birth (0 = male or 1 = female), which we conceptualized as a proxy for gender. Risk of misclassification is low given that Toronto population estimates suggest 0.5% prevalence of transgender identity (Fleiszer et al., 2019). Participants indicated their sexual identity as 0 = heterosexual or straight, 1 = gay or lesbian, 2 = bisexual, or 3 = some other way. We operationalized sexual identity as 0 = heterosexual and 1 = lesbian, gay, bisexual, or queer (LGBQ) due to low but proportionate numbers of participants in each of the latter subgroups and common theoretical drivers of IPV across LGBQ populations (Fleiszer et al., 2019; Rolle et al., 2018). We determined age based on date of birth.
Analytic Strategy
We conducted a three-stage analytic strategy in Mplus 8.4 (Muthén & Muthén, 2017). First, to compare our data with the original scale’s validation, we ran confirmatory factor analysis to determine the latent structure of the IPV measure. We compared a one-factor solution (for parsimony) to the original two-factor solution. Although some items in the Physical Partner Abuse Scale could be conceptualized as nonphysical violence (e.g., makes me afraid for my life) and some items in the Nonphysical Scale could be conceptualized as physical/sexual violence (e.g., demands I perform sex acts I do not like), we maintained consistency with the original scales in our two-factor solution for comparison’s sake. We estimated logistic models with full information maximum likelihood estimation with robust standard errors (Muthén & Muthén, 2017). We considered relative differences in model fit and the direction, magnitude, and standard error of item loadings and, for the two-factor solution, the correlation between the factors (r > .80 suggesting redundant factors) (Brown, 2015). Nonzero scores (i.e., “rarely” to “all the time” responses) were sparsely distributed (<5 participants per response for several items); therefore, for all latent factor analyses we used dichotomized items to prevent unreliable estimation.
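The sparsity check and dichotomization described above can be sketched as follows. This is an illustrative sketch only: the helper names, the cutoff argument, and the example responses are assumptions, and the actual analyses were run in Mplus rather than Python.

```python
from collections import Counter
from typing import List, Sequence


def sparse_categories(item_responses: Sequence[int], min_count: int = 5) -> List[int]:
    """Return nonzero response categories (1-5) observed fewer than min_count times."""
    counts = Counter(item_responses)
    return sorted(c for c in range(1, 6) if counts.get(c, 0) < min_count)


def dichotomize(item_responses: Sequence[int]) -> List[int]:
    """Collapse a 0-5 frequency item to 0 = never, 1 = any frequency above never."""
    return [1 if score > 0 else 0 for score in item_responses]


# Hypothetical responses to one item: most participants answer "never".
item = [0] * 40 + [1] * 6 + [2] * 3 + [3] * 1

print(sparse_categories(item))  # categories too sparse to estimate reliably
print(sum(dichotomize(item)))   # number of participants reporting the item at all
```

Collapsing the categories trades information about frequency for stable estimation, which is why the dichotomized items are used only for the latent factor analyses.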
Second, we evaluated measurement invariance by sex and sexual identity to establish the validity of between-group comparisons. After appraising global measurement invariance (methods and results in appendix), we considered partial measurement invariance. We fit two-parameter logistic models with robust maximum likelihood estimation. We evaluated uniform and nonuniform DIF by sex and sexual identity using a multiple indicator multiple cause model based on the Crane, van Belle, and Larson (CvBL) approach (full details in appendix) (Crane et al., 2007; Heron et al., 2012). We checked for uniform DIF (freeing item thresholds) followed by nonuniform DIF (freeing item thresholds and loadings). We then considered DIF by both sex and sexual identity by rerunning our models for sex, adding a direct effect of sexual identity on the latent IPV factor and an indirect effect of sexual identity on IPV via each item in a stepwise manner.
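As a rough orientation (the exact specification is given in the appendix), a multiple indicator multiple cause model for DIF on a binary item $j$ with a group indicator $g_i$ (e.g., sex) is often written as below. The symbols $\lambda_j$, $\tau_j$, $\beta_j$, $\delta_j$, and $\gamma$ are notation assumed here for illustration and may not match the appendix.

```latex
\begin{align}
\operatorname{logit} P(y_{ij} = 1 \mid \theta_i, g_i)
  &= \lambda_j \theta_i - \tau_j + \beta_j g_i + \delta_j \theta_i g_i, \\
\theta_i &= \gamma g_i + \varepsilon_i .
\end{align}
```

In this sketch, $\gamma$ captures the group difference in latent IPV, $\beta_j \neq 0$ with $\delta_j = 0$ corresponds to uniform DIF (a group-specific shift in the item threshold), and $\delta_j \neq 0$ corresponds to nonuniform DIF (a group-specific loading).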
Finally, we analyzed the distribution of IPV by sex, sexual identity, and their interaction, adjusting for age. To determine the impact of accounting for latent structure and measurement variance, we ran three models using outcome operationalizations based on the observed data ([a] any IPV, [b] sum score of all IPV items, [c] sum score of all IPV items that did not exhibit DIF) and three models based on latent factor scores ([a] latent score using all IPV items, assuming measurement invariance, [b] latent score using all items, accounting for partial measurement invariance, [c] latent score using only non-DIF items). Models were multilevel generalized linear models accounting for clustering by census tract—distributions depended on whether the outcome was any IPV (logistic), sum scores (negative binomial), or latent scores (normal).
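The three observed-data operationalizations can be illustrated with a short sketch. The function name, the example responses, and the choice of which item is flagged for DIF are all hypothetical; the sketch only shows how the three outcomes are derived from the same item responses.

```python
from typing import Dict, Sequence, Set


def observed_outcomes(responses: Sequence[int], dif_items: Set[int]) -> Dict[str, int]:
    """Build three observed-score outcomes from one participant's item responses.

    responses: item scores (0 = never; higher = more frequent).
    dif_items: zero-based positions of items flagged for DIF (hypothetical here).
    """
    any_ipv = int(any(score > 0 for score in responses))
    sum_all = sum(responses)
    sum_non_dif = sum(score for idx, score in enumerate(responses) if idx not in dif_items)
    return {"any_ipv": any_ipv, "sum_all": sum_all, "sum_non_dif": sum_non_dif}


# Two hypothetical participants with the same "any IPV" value but different burdens.
participant_1 = [0, 1, 0, 0, 0]
participant_2 = [3, 4, 2, 5, 1]
flagged = {1}  # assume the second item showed DIF

print(observed_outcomes(participant_1, flagged))
print(observed_outcomes(participant_2, flagged))
```

Both participants are coded identically on "any IPV", whereas the sum scores separate them; this is the information that a dichotomized outcome discards.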
Results
Table 1. Sample Characteristics.
Note. ᵃFree text specified as bicurious (n = 1), heteroflexible (n = 1), open (n = 1), queer (n = 3), or did not want to specify (n = 1).
Table 2. Prevalence of Each IPV Item by Sex and Sexual Identity.
Note. ᵃGay or lesbian, bisexual, or some other way (free text specified as bicurious [1], heteroflexible [1], open [1], queer [3], or did not want to specify [1]).
Evaluating Latent Factor Structure and Measurement Invariance
A single factor solution (assuming that all items load onto a single latent construct) fit the data very well, with items loading highly onto the latent factor (standardized loadings: 0.89-0.96) with small standard errors (0.01-0.03). The two-factor solution (assuming separate latent constructs for physical and nonphysical IPV) only marginally improved model fit relative to the one-factor model (e.g., the Bayesian Information Criterion, BIC, decreased) and the correlation between the two factors was very high (r = .95). We therefore adopted the one-factor solution in all further analyses for the sake of parsimony. One item had zero variation for males (no males reported that their partner had tried to suffocate them) and was therefore dropped from further analyses. The appendix provides additional analyses of the latent structure and global measurement invariance between groups.
After establishing the unidimensionality of the scale, we explored allowing partial measurement invariance through an analysis of DIF by sex, sexual identity, and their interaction. Table 3 shows the item discriminations (loadings) and difficulties (related to thresholds) by sex in the final two-parameter logistic model. Across males and females, all items were highly discriminating and most items were relatively difficult (i.e., participants required a higher latent IPV score before reporting that they occurred); this indicates that the scale is most accurate in its measurement of more severely unhealthy relationships. Six items showed uniform DIF by sex (differences in item difficulties; Figure 1). Males, compared to females with the same values on the underlying IPV construct, had a higher probability of reporting five items: partner does not want me to have friends, does not want me to socialize with my family, screams or yells at me, is often jealous, and throws dangerous objects at me (Panels A-E, Figure 1). Females had a higher probability than males of reporting having had a partner who made them afraid for their life (Panel F). No items showed nonuniform DIF (differences in discriminations and difficulties) by sex. We did not find evidence for uniform or nonuniform DIF by sexual identity, nor for differences in the sex-DIF results between heterosexual and LGBQ-identified participants.

Figure 1. Item characteristic curves for the six items that showed differential item functioning (DIF) by sex, estimated from the final two-parameter logistic model.
Note. Panels A-E indicate uniform DIF favoring males (dotted line): males found these items easier to endorse than females across all levels of the latent IPV construct (males’ item characteristic curves shifted to the left of females’). Panel F indicates uniform DIF favoring females (solid line): females found this item easier to endorse than males across all levels of the latent IPV construct (females’ item characteristic curve shifted to the left of males’).
Table 3. Item Discriminations and Difficulties From the Two-Parameter Logistic IRT Model by Sex.
Note. Item discriminations correspond to item loadings in traditional factor analysis and indicate how strongly the item is correlated with the latent construct of intimate partner violence (IPV). Higher item discriminations indicate that the item provides more information about participants’ total IPV scores (but only across the range of IPV scores for which the item is most informative). Item difficulties correspond to item thresholds and indicate the latent IPV score at which participants become more likely than not to endorse the item (i.e., a 50% probability of endorsement). Higher item difficulties indicate that participants need to have a higher latent IPV score before they will endorse the item.
ᵃItem showed differential item functioning (DIF) by sex.
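For orientation, the two-parameter logistic model behind the discriminations and difficulties is commonly written as below, with $a_j$ the discrimination and $b_j$ the difficulty of item $j$. The conversion from a factor-analytic loading $\lambda_j$ and threshold $\tau_j$ assumes a logit link and a latent factor standardized to mean 0 and variance 1; the notation is introduced here for illustration only.

```latex
\begin{align}
P(y_{ij} = 1 \mid \theta_i) &= \frac{1}{1 + \exp\{-a_j(\theta_i - b_j)\}}, \\
a_j &= \lambda_j, \qquad b_j = \frac{\tau_j}{\lambda_j}.
\end{align}
```

Setting $\theta_i = b_j$ gives a probability of 0.5, so the difficulty is the latent IPV score at which endorsing the item becomes more likely than not, and a larger $a_j$ makes the curve steeper around that point.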
IPV on Sex and Sexual Identity: The Impacts of Accounting for Latent Structure and DIF
Table 4. The Impact of Accounting for Latent Structure and DIF on the Estimated Associations Between IPV and Each of Sex, Sexual Identity, and Their Interaction.
Note. IPV is intimate partner violence. DIF is differential item functioning. LGBQ is lesbian, gay, bisexual, or queer. All models were multilevel (accounting for clustering by census tract) and included age as a covariate (which was consistently negatively associated with IPV). Main effect models refer to models that only include the main effects of sex and sexual identity. Interaction effect models refer to models that include sex, sexual identity, and their interaction. Results shown are the nonexponentiated point estimate and the 95% confidence interval. All models assume measurement invariance unless otherwise noted. As shown in Table 3 and Figure 1, six items showed evidence of DIF; therefore, DIF adjustment refers to analyses that accounted for DIF in these six items, whereas non-DIF items refer to analyses that used an outcome variable created only from the 13 items without evidence of DIF.
ᵃMultilevel logistic regression.
ᵇMultilevel negative binomial regression. Negative binomial regression was used as opposed to Poisson regression as the overdispersion factor was statistically significant.
ᶜMultilevel linear regression using maximum likelihood estimation with robust standard errors.
Discussion
This study found that being female and identifying as LGBQ were each positively associated with experiencing IPV in a random sample in Canada’s largest city. However, the size and precision of these associations depended on whether measurement variance was accounted for. To our knowledge, only one previous study has examined measurement invariance of IPV by sex/gender in Canada, which identified response bias but did not detail the findings (Ansara & Hindin, 2010). Internationally, studies have demonstrated measurement variance by sex/gender in attitudes toward IPV (Yount et al., 2014a, 2014b) and, more recently, IPV perpetration (Wareham et al., 2021). Our findings extend this evidence base to experiences of IPV.
Six of 20 IPV items showed response bias; holding their IPV scores constant, participants’ responses to these items systematically differed as a function of their sex. Independent of their latent IPV scores, males had a higher probability of reporting four items related to nonphysical IPV (e.g., partner jealousy) as well as partners throwing dangerous objects. These results map onto prior qualitative studies showing that women more often than men view nonphysical acts of IPV as controlling behaviors central to dynamics of violence (O’Campo et al., 2016). To the extent that male participants interpreted these items as less severe than females did, this may explain their greater likelihood of reporting these items. An additional study that compared DIF in women’s responses regarding their own use of violence and that of their male partners found that women were more likely to report that they threw objects than that they experienced this (Reichenheim et al., 2007). These reporting differences may reflect gendered notions around the acceptability or severity of different forms of violence. In contrast, conditioning on latent IPV, female participants in our study were more likely than males to report that their partners made them fear for their lives. Previous studies have shown fear of partner to be one of the strongest gender differences in IPV self-reports (Yakubovich et al., 2019). Our results indicate that this goes beyond gender differences in underlying IPV levels alone; rather, there are reporting differences on this item that may be exacerbated by, for instance, gender-based impacts of IPV (women are more likely to experience negative consequences of IPV) or constructs of masculinity (e.g., as strong and aggressive) versus femininity (e.g., as weak and subservient) (Yakubovich et al., 2019).
Our study showed that ignoring item reporting biases by sex underestimated females’ higher IPV scores compared to males, demonstrating the importance of testing and accounting for measurement variance prior to drawing group-based comparisons in IPV and other epidemiologic research.
We did not identify DIF by sexual identity; however, there was only a small number of LGBQ-identified participants (5%). Although this is a higher proportion than Canadian population estimates (Burczycka & Ibrahim, 2016), it likely limited our statistical power to identify item response bias. Low power may have also limited our ability to identify DIF in rarer IPV items (e.g., suffocation). Replication should be attempted in larger samples as well as in other contexts and with other IPV measures. Although alternative IPV measures have been explored with sexual and gender minority populations (Stephenson & Finneran, 2013), analyses of social inequities in IPV burdens benefit from IPV measures that exhibit measurement invariance across groups (to compare “like with like”) (Martin-Fernandez et al., 2019; Shealy & Stout, 1993; Yount et al., 2014a). It would be valuable for future qualitative research to explicitly explore reasons for DIF by gender/sex and other social identities (Martinkova et al., 2017).
Despite low statistical power, our study extends previous work demonstrating the importance of sexual identity to understanding distributions of IPV (Badenes-Ribera et al., 2016; Finneran & Stephenson, 2013; Kimmes et al., 2017; Rolle et al., 2018). Prior research has predominantly been based in the United States, with a reliance on convenience or purposive samples. Only a small number of quantitative studies have demonstrated a similar or higher prevalence of IPV among same-sex couples or LGBQ-identified people compared to heterosexual populations in Canada (Burczycka & Ibrahim, 2016; Whitehead et al., 2020). Using a more robust measurement and analytic approach, our study adds to these to highlight the importance of nuancing conceptions of IPV to consider intersections between gender, sexual, and other social identities, including experiences of misogyny, colonialism, heterosexism, poverty, racism, transphobia, and other forms of structural violence (Peitzmeier et al., 2020; Ristock et al., 2017; Stark & Hester, 2019). This should be coupled with the continued broadening of services (as needed) to the unique needs of people experiencing IPV across all social locations and intersections, while recognizing that experiences and consequences of IPV are not equally distributed (Furman et al., 2017; Gingras, 2018).
Our results further demonstrate that dichotomizing IPV scales masks valuable information regarding the severity of this violence, even when these data are skewed toward zero (which they usually are) (Martin-Fernandez et al., 2019; Yakubovich et al., 2019). There is a theoretically meaningful difference between experiencing “any IPV” (which sexual identity appears more strongly predictive of) and “overlapping or more frequent acts of IPV” (which sex/gender appears more strongly predictive of): this matters for how we understand clinical burdens and design intervention strategies (Ford-Gilboe et al., 2016; Heise et al., 2019). Analysis of the measure’s latent structure indicated a single IPV construct that varies in chronicity and intensity, in which severe physical (including sexual) violence congregates at the upper end of the distribution and certain forms of nonphysical violence congregate more toward the lower end. This further explains our analyses of dichotomized “any IPV”: sexual identity was more predictive than sex because LGBQ-identified participants (especially females) tended to experience more nonphysical violence, whereas sex was more predictive of count/continuous measures because females tended to experience more physical violence than males. These results also demonstrate the information loss that can result from analyzing types of IPV separately without considering underlying structural relationships, perhaps especially when working with short-form scales (Ford-Gilboe et al., 2016; Potter et al., 2020; Yakubovich et al., 2019).
In addition to the need to test for generalizability, discussed above, study limitations include only having access to participants’ sex assigned at birth (gender identity was not measured in the NEHW study) (O’Campo et al., 2015). Although sex assigned at birth is a viable proxy for gender given our use of a random community sample and low prevalence estimates of transgender identity in the source population (Fleiszer et al., 2019), future research should examine measurement invariance across the diversity of gender identities, particularly in light of a growing body of research demonstrating the disproportionate burden of IPV among transgender populations (Peitzmeier et al., 2020). We also do not know the sex or gender of participants’ partners or the consequences of this violence, which would be useful to consider in future research.
Summary of Implications
This study offers important context to debates around sex and gender symmetry in IPV. Females and LGBQ-identifying participants experienced more IPV, but the association between sex and IPV was underestimated when measurement variance in this construct was not accounted for. The operationalization of IPV had further consequences for understanding the distribution of IPV, with analyses of summative or latent scores clarifying the relationships between sex, sexual identity, and different severities and types of IPV in contrast to analyses of “any IPV” alone. We therefore recommend that, prior to drawing group-based comparisons in IPV and other complex epidemiologic outcomes, researchers establish at least partial measurement invariance between groups and avoid creating outcome typologies or dichotomies without investigating latent structure. Moreover, the extent to which existing studies on sex- and gender-based differences in IPV have followed these recommended measurement and analytic methods should be interrogated to gauge the potential role of bias in their conclusions on sex or gender symmetry. This is particularly important in contexts, like Canada, where national statistics on IPV have often used gender-neutral frameworks that combine and dichotomize experiences of IPV. Practice and policy should aim to prevent and respond to IPV among all groups of people; however, strategies must remain responsive to the unequal distributions of this violence by sex, gender, sexual identity, and other social factors.
Supplemental Material
Supplemental material for this article is available online.
Supplemental Material for Measuring the Burden of Intimate Partner Violence by Sex and Sexual Identity: Results From a Random Sample in Toronto, Canada by Alexa R. Yakubovich, Jon Heron, Nicholas Metheny, Dionne Gesink and Patricia O’Campo, in Journal of Interpersonal Violence
Footnotes
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Canadian Institutes of Health Research (grant numbers MOP-84439 and, to ARY, HSI-166388) and the Social Sciences and Humanities Research Council (grant number 410-2007-1499).
