
Abstract

The Couples Satisfaction Index (CSI) is one of the most widely used instruments for assessing intimate relationship satisfaction and status. However, its performance in the Chinese population, including potential differential item functioning (DIF), has yet to be validated. To verify the performance of the adapted Chinese CSI, data were collected from 740 participants (235 males and 505 females). Under the item response theory (IRT) framework, the graded response (GR) model was fit. Key assumptions and model fit were checked first, followed by a DIF analysis. Category response curves and information functions were also examined. Results showed that the two IRT assumptions (unidimensionality and local independence) were generally satisfied, and the GR model fit the data well. The CSI items performed very well in assessing and differentiating among participants with differing levels of intimate relationship satisfaction. High test information across a wide range of the latent trait indicated that the CSI was reliable and accurate. Moreover, none of the CSI items showed significant DIF, which supports the scale’s equity and fairness when administered to different gender groups. Overall, the CSI shows good psychometric characteristics, has no systematic DIF between genders, and holds promise to facilitate further research on intimate relationship satisfaction.

Keywords

Couples Satisfaction Index, Chinese, IRT analysis, graded response model, differential item functioning

An intimate relationship is a relationship characterized by strong, sustained, mutual influence across a wide range of interactions, typically including lustful desire and the possibility of sexual involvement (Bradbury & Karney, 2019). Building and maintaining a positive, healthy, and meaningful intimate relationship is one of the key tasks of early adulthood (Berk, 2018; Erikson, 1963; Gerrig, 2013), and it can have a lifelong effect on a person, especially on mental and physical health (Blow et al., 2019; Bradbury & Karney, 2019; Kawachi & Berkman, 2001; Whisman & Baucom, 2012) and on other important aspects of life.

Therefore, it is vital to develop effective and valid psychometric tools to monitor, evaluate, guide, and adjust intimate relationship satisfaction. To date, many scales and tests have been developed and shown to perform well in intimate relationship studies. Among them are the Dyadic Adjustment Scale (Spanier, 1976), Marital Adjustment Test (Locke & Wallace, 1959), Quality of Marriage Index (Norton, 1983), Relationship Assessment Scale (Hendrick, 1988), Semantic Differential Scale (Karney & Bradbury, 1997), Kansas Marital Satisfaction Scale (Nichols et al., 1983), and Couples Satisfaction Index (Funk & Rogge, 2007). Notably, the Couples Satisfaction Index was developed by selecting and incorporating items from the other established scales mentioned above (Funk & Rogge, 2007). This scale has shown excellent psychometric performance and has been applied in a large body of literature on intimate relationships (S. M. Johnson, 2019; Karney & Bradbury, 2020; Papp, 2018; Williamson, 2020). However, to our knowledge, it has yet to be adapted to Chinese and validated, which is one of the main focuses of this study.

Another important purpose of this study is to test for differential item functioning (DIF) between genders. Gender differences exist in many aspects of intimate relationships and hold a great deal of power and influence (Bradbury & Karney, 2019). These gender differences include, for example, ways of thinking and behaving (Fuller & Riggs, 2021; Messinger et al., 2021; Winstok & Smadar-Dror, 2021), emotional expression (Bliton et al., 2016; Umberson et al., 2015), mate selection (Fletcher et al., 2014; Jarrett & Anderson, 2023; Xiao & Qian, 2020), aggressiveness (Harrington et al., 2021; O’Connor et al., 2023), and other variables (Horne & Johnson, 2018; Policastro & Finn, 2021). Meta-analyses have revealed consistent average differences between women and men on a wide range of characteristics (Bradbury & Karney, 2019). Therefore, it is vital to verify that the adapted Chinese CSI functions equitably across genders.

The remainder of this paper is organized as follows. First, we introduce the CSI and provide a review of its research in the Chinese context. Then, we present the purpose, methods, and procedures of this study. Based on this, we provide a detailed description of the results and discussions. Finally, practical recommendations and future directions are discussed based on the results.

The Couples Satisfaction Index

Using item response theory (IRT) techniques, Funk and Rogge (2007) demonstrated that the traditional scales that have been used widely, such as the Marital Adjustment Test (Locke & Wallace, 1959) and the Dyadic Adjustment Scale (Spanier, 1976), provided poor levels of precision in assessing intimate relationship satisfaction, particularly given the length of those scales. Then they constructed an item pool by selecting items from those scales. After that, item- and test-level analyses were carried out to determine which items contribute more when satisfaction was assessed. As a result, the Couples Satisfaction Index (CSI) scale with 32 items was developed. Research findings have been synthesized to show that the CSI scale, despite having fewer items compared to other scales, is capable of providing significantly more information (precision). In addition, from the perspective of classical test theory (CTT), the CSI scale demonstrated excellent internal consistency and convergent validity, as well (Funk & Rogge, 2007).

Due to its excellent and promising psychometric performances (Graham et al., 2011; Mattson et al., 2013; Saavedra et al., 2010; Schlagintweit et al., 2016), the CSI scale has been applied across many domains of intimate relationships. These domains include relationship quality (Bareket et al., 2018; McDaniel et al., 2018; Williamson, 2020), relationship maintenance (Doss et al., 2019; Halford & Bodenmann, 2013; Karney & Bradbury, 2020), couple therapy (S. M. Johnson, 2019; Patterson et al., 2018), intimate conflict and violence (Papp, 2018; Visschers et al., 2017), and so on.

To advance research on intimate relationships, researchers have adapted the CSI scale into several languages worldwide. For instance, to explore the association between intimate relationships and other important factors, several studies have been conducted using translated versions of the CSI scale (Glowacka et al., 2018; Kirchner-Häusler et al., 2022; Pfaff & Schlarb, 2022; Pinto et al., 2019; Qadir et al., 2013; Rauch-Anderegg et al., 2020). However, they reported only a simple reliability index and did not systematically evaluate the psychometric properties of the adapted CSI scale, which limits the generalizability of their findings to other studies. Therefore, systematic validation of the translated CSI scale is an important research endeavor. Some researchers have already done this work in several languages (El Frenn et al., 2022; Lamela et al., 2020; Okhotnikov & Wood, 2020). Nonetheless, to the best of our knowledge, the CSI scale has yet to be adapted to Chinese and validated. Verifying the performance of the CSI in the Chinese population is of great value and significance because Chinese is one of the most widely spoken languages in the world.

As a part of validation, it is common to compare and contrast responses from participants with different backgrounds (e.g., socioeconomic status and gender). If the participants from different demographic groups who possess the same level of latent construct of interest have unequal probabilities of giving a certain response to an item, the equity issue arises. This is called differential item functioning (DIF). The DIF analysis is a vital procedure in the processes of scale development and validation. Unfortunately, DIF has not been investigated for the CSI scale since its original development (Funk & Rogge, 2007), which renders its equity and effectiveness questionable under circumstances where subgroups with different demographic characteristics exist.

The Chinese Context

Intimate relationship studies that have used the CSI scale in the Chinese context mainly cover two domains: intercultural and intracultural comparisons, and the associations between intimate relationships and other factors.

Regarding the intercultural and intracultural comparisons, Kaya et al. (2019) examined Chinese and Western ethnic identification and relationship satisfaction in four cultural combinations of couples: Western male-Western female, Chinese male-Chinese female, Western male-Chinese female, and Chinese male-Western female. They found that greater similarity between partners on ethnic identification with the majority Western culture of Australia predicted greater relationship satisfaction, but there was no association of relationship satisfaction with partner similarity on Chinese ethnic identification. Hiew et al. (2015), Kim et al. (2012), and Parung and Ferreira (2017) hold similar views that intimate relationship patterns vary across countries and cultures. On the other hand, Halford, Leung, et al. (2018) explored the association of relationship standards with relationship satisfaction by comparing intercultural couples with two groups of mono-cultural couples across two countries of residence (China and Australia). They found that endorsement of couple bond standards and partner similarity on family responsibility standards were associated with relationship satisfaction across all three groups and both countries of residence. They concluded that the association of family responsibility standards with satisfaction is remarkably similar across countries of residence and cultural groups. Halford, Lee, Hiew, and van de Vijver (2018) similarly hold that intimate relationship satisfaction shares some common patterns and relations with other aspects across different cultures.

Factors related to intimate relationships have also been studied with the CSI, including stress and depression (Zhang et al., 2022), disease (Yeung et al., 2020), baby nurturing (Xue et al., 2018), family functioning (Anderson et al., 2014; Deitz et al., 2015), and personal character (M. D. Johnson et al., 2015). Nevertheless, there is still a lack of in-depth research in the Chinese cultural context. Thus, validation of the CSI scale in the Chinese population is of great importance.

Present Study

Since its initial development, the CSI scale has been verified and validated worldwide (El Frenn et al., 2022; Lamela et al., 2020) and has been applied in various empirical studies (Kirchner-Häusler et al., 2022; Pfaff & Schlarb, 2022). Nevertheless, systematic evidence of its performance in the Chinese population is still lacking. In addition, DIF needs to be investigated (Funk & Rogge, 2007), as severe DIF problems can jeopardize the validity and equity of the scale. Therefore, this study aims to validate the performance of the Chinese version of the CSI scale and to conduct a DIF analysis between genders under an IRT framework.

Method

Participants and Sample Selection

To validate the Chinese version of the CSI, 762 adults were recruited and asked to respond to the scale online in 2020. Of the original 762 cases, 22 were excluded because they failed to complete the demographic information. After the exclusion, the sample size used for IRT and other analyses was 740 (235 males, 505 females).

Review and approval for the study procedures were obtained from the university institutional review board prior to study onset. Before answering the scale, participants received an online information sheet assuring them that the data obtained would be handled confidentially and anonymously, and they were asked to give electronic informed consent. After completing the scale, they received an electronic red packet with a random amount of money.

Measure

Demographic Characteristics

Participants completed a demographic questionnaire including gender, age, and duration of intimate relationship.

The Couples Satisfaction Index (CSI)

The CSI scale (Funk & Rogge, 2007) has 32 items, of which 31 items are rated on a 6-point Likert-type scale from 0 (low satisfaction) to 5 (high satisfaction), and one item (item 1) is rated on a 7-point Likert-type scale from 0 (low satisfaction) to 6 (high satisfaction). Taking item 1 as an example, the question is “please indicate the degree of happiness, all things considered, of your relationship.” In addition, for items 26 to 32, participants rate a semantic differential scale with bipolar adjectives at either end (e.g., enjoyable vs. miserable for item 32). Six items (items 6, 10, 15, 27, 29, and 31) are reverse scored because, on these items, lower raw scores reflect more positive attitudes toward the relationship. For instance, the question for item 6 is “how often do you wish you hadn’t gotten into this relationship.” A raw score of 0 on this item represents high satisfaction. The CSI scale has been shown to have excellent psychometric properties, such as higher precision of measurement and strong convergent validity (e.g., Graham et al., 2011; Mattson et al., 2013; Saavedra et al., 2010; Schlagintweit et al., 2016). In this study, the CSI scores had an internal consistency estimate (Cronbach’s α) of .97 in the total sample.
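The reverse keying described above can be illustrated with a minimal sketch. It assumes the 0 to 5 response coding stated in this section; the function names and the `lo`/`hi` parameters are hypothetical and are not taken from the original study, whose scoring scripts are not reported.

```python
# Items the text lists as reverse scored (1-based item numbers).
REVERSED_ITEMS = {6, 10, 15, 27, 29, 31}


def reverse_score(raw, lo=0, hi=5):
    """Mirror a response around the midpoint of its scale, so lo <-> hi."""
    if not lo <= raw <= hi:
        raise ValueError(f"response {raw} is outside [{lo}, {hi}]")
    return lo + hi - raw


def score_responses(responses, lo=0, hi=5):
    """Recode the reverse-keyed items and leave the others unchanged.

    `responses` maps item number -> raw response. Only items that share
    the same [lo, hi] response scale should be passed in together.
    """
    return {
        item: reverse_score(raw, lo, hi) if item in REVERSED_ITEMS else raw
        for item, raw in responses.items()
    }


# Hypothetical responses: item 5 is keyed normally, items 6 and 10 are reversed.
example = score_responses({5: 4, 6: 0, 10: 5})
print(example)
```

Because the scale bounds are parameters, the same recoding applies if responses are stored with a different origin (for example 1 to 6).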

In order to validate the performance of the Chinese adapted CSI, the English version was first translated into Chinese following the five-stage procedures proposed by Brislin (1970) and Beaton et al. (2000): initial translation, synthesis of the translations, back translation, expert committee, and test of the prefinal version. The prefinal version was administered to 170 college students and data were analyzed. Results of the analysis showed high discrimination, reliability, and validity, and thus no further revisions were made. Because IRT analyses are the focus of this study, the results for the prefinal version are not provided here, but they are available upon request.

The items of the final translation in Chinese used in this study can be found in the Supplementary Material.

Statistical Analyses

To accomplish the objectives of this study, psychometric techniques under the IRT framework were used. IRT is a system of models that defines the correspondence between latent variables and their manifestations, and it uses latent characterization of individuals and items as predictors of observed responses (De Ayala, 2013). Specifically, two key assumptions in IRT (i.e., unidimensionality and local independence) as well as the goodness of fit of the graded response (GR) model were evaluated first. After that, item parameter estimates and their characteristics were examined. Finally, a DIF analysis was performed using gender to define subgroups. Building on these analyses, the category response curves and information functions were inspected as measures of the overall performance of the CSI scale and its items. All analyses were carried out using several packages in R (R Core Team, 2019).

Unidimensionality

Item response models in which a single dominant ability is presumed sufficient to explain or account for examinee performance are referred to as unidimensional models (Embretson & Reise, 2013; Hambleton et al., 1991). To assess it, exploratory factor analysis (EFA) was executed with the fa function in the psych package (Revelle, 2016) prior to carrying out IRT analyses (Acevedo-Mesa et al., 2021; Eichenbaum et al., 2019). If the first factor accounts for more than 20% of the variance, the scale can be said to be unidimensional (Hattie, 1985; Reckase, 1979).
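The 20% rule of thumb can be illustrated with a rough sketch. It is not the `fa` routine from the psych package: as a simplifying assumption, it uses the share of total variance carried by the largest eigenvalue of a correlation matrix (a principal-component proxy for the first factor), computed by power iteration. The function name and the toy correlation matrix are hypothetical.

```python
import math


def first_component_share(corr, iters=200):
    """Share of total variance carried by the largest eigenvalue of `corr`.

    Uses plain power iteration, which is adequate for the positively
    correlated item sets typical of a single-construct scale.
    """
    n = len(corr)
    v = [1.0 / math.sqrt(n)] * n  # start from a unit vector with equal weights
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # The Rayleigh quotient gives the eigenvalue for the converged unit vector.
    eigenvalue = sum(
        v[i] * sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)
    )
    # For a correlation matrix, the total variance equals the number of items.
    return eigenvalue / n


# Hypothetical 4-item correlation matrix with every inter-item r = .50.
toy_corr = [[1.0 if i == j else 0.5 for j in range(4)] for i in range(4)]
share = first_component_share(toy_corr)
print(round(share, 3), share > 0.20)
```

For this toy matrix the largest eigenvalue is 1 + 3(0.5) = 2.5, so the share is 2.5 / 4 = .625, which clears the 20% threshold.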

Local independence

According to Embretson and Reise (2013), local independence means that responses to items are independent conditional on the level of the latent construct (i.e., intimate relationship satisfaction in this study). To check this assumption, Yen’s $Q_{3}$ (1984, 1993) was calculated using the residuals function in the mirt package (Chalmers, 2012). $Q_{3}$ is the correlation between the residuals for a pair of items. The residual for an item is the difference between an individual’s observed response and his or her expected response on the item (De Ayala, 2013). For Yen’s $Q_{3}$ , common practice is to use a constant cut-point of 0.2 to determine the presence of local dependence (LD) (Chen & Thissen, 1997); however, this value tends to result in low power and is likely to underestimate the level of LD between items (Houts & Edwards, 2013). In line with the recommendations of Christensen et al. (2017), any residual correlation $> 0.2$ above the average correlation is considered to indicate LD, and it is unlikely to observe a residual correlation $> 0.3$ above the average.
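The $Q_{3}$ check can be sketched as follows. This is a minimal illustration, not the mirt implementation: it assumes the item residuals (observed minus model-expected responses) are already available, and the residual values, function names, and `margin` parameter are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def q3_matrix(residuals):
    """Q3 for every item pair; `residuals` maps item -> per-person residuals."""
    return {
        (i, j): pearson(residuals[i], residuals[j])
        for i, j in combinations(sorted(residuals), 2)
    }


def flag_local_dependence(q3, margin=0.2):
    """Flag pairs whose Q3 exceeds the average Q3 by more than `margin`."""
    average = sum(q3.values()) / len(q3)
    return sorted(pair for pair, r in q3.items() if r > average + margin)


# Hypothetical residuals for three items and six respondents.
toy_residuals = {
    1: [0.5, -0.5, 0.3, -0.3, 0.2, -0.2],
    2: [0.4, -0.6, 0.2, -0.2, 0.3, -0.1],
    3: [-0.1, -0.2, 0.3, 0.1, -0.3, 0.2],
}
q3 = q3_matrix(toy_residuals)
flagged = flag_local_dependence(q3)
print(flagged)
```

With 32 items there are 32 × 31 / 2 = 496 item pairs, so the same loop would produce 496 $Q_{3}$ values for the CSI.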

Graded Response Model and Model Fit

Items that assess participants’ attitudes with more than two ordered response categories can be fitted with the GR model (Samejima, 1968). The probability of a response in category $k$ or higher, given a participant’s underlying trait, $θ_{i}$ , can be written as follows:

$$P(θ_{i}; a_{j}, b_{jk}) = \frac{1}{1 + e^{-a_{j}(θ_{i} - b_{jk})}}, \tag{1}$$

where $a_{j}$ is the discrimination parameter and $b_{j k}$ is the category boundary location for the category $k$ of item $j$ . The discrimination parameter indicates how well an item distinguishes accurately between participants with different levels of relationship satisfaction. The category boundary location represents the underlying trait where participants have a 50% probability of responding in or above a particular category of an item.
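Equation 1 and the category probabilities derived from it can be sketched in a few lines. This is an illustrative sketch rather than the mirt implementation; the function names are hypothetical, and the parameter values are the item 1 estimates reported in Table 3. The probability of a single category is taken as the difference between adjacent cumulative probabilities, which is the standard GR model construction.

```python
import math


def boundary_prob(theta, a, b):
    """Equation 1: probability of responding in category k or higher."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def category_probs(theta, a, bs):
    """Probability of each single category, from adjacent boundary differences."""
    # Pad with 1 (the lowest category or higher is certain) and 0 (nothing
    # lies above the top category).
    cum = [1.0] + [boundary_prob(theta, a, b) for b in bs] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


# Item 1 estimates as reported in Table 3 (a, then b1 to b6).
a_item1 = 2.85
b_item1 = [-2.51, -1.99, -1.53, -0.86, -0.07, 0.76]

# At theta equal to a boundary location, the cumulative probability is 50%.
print(boundary_prob(-0.07, a_item1, b_item1[4]))

probs = category_probs(0.0, a_item1, b_item1)
print([round(p, 3) for p in probs])
```

Six boundary locations yield seven category probabilities, which sum to 1 at any value of $θ_{i}$.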

Based on the guidelines from Baker and Kim (2017), the $a$ parameters of less than 0.64, between 0.65 and 1.34, between 1.35 and 1.69, and larger than 1.70 indicate low, moderate, high, and very high discrimination, respectively.
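These cut-offs can be applied mechanically, as in the sketch below. The function name is hypothetical, and treating the band edges as 0.65, 1.35, and 1.70 is an assumption that closes the small gaps in the quoted ranges for estimates rounded to two decimals. The input values are the $a$ estimates reported in Table 3.

```python
from collections import Counter


def classify_discrimination(a):
    """Label a discrimination estimate using the cut-offs quoted in the text."""
    if a < 0.65:
        return "low"
    if a < 1.35:
        return "moderate"
    if a < 1.70:
        return "high"
    return "very high"


# Discrimination estimates for items 1 to 32 as reported in Table 3.
a_estimates = [
    2.85, 1.40, 1.61, 1.71, 3.13, 1.21, 2.11, 2.15,
    2.64, 0.69, 3.56, 3.40, 1.60, 1.48, 0.86, 1.83,
    2.89, 2.33, 1.73, 1.57, 2.22, 2.66, 2.62, 4.01,
    3.22, 2.47, 1.87, 2.51, 1.89, 3.04, 1.97, 2.68,
]

counts = Counter(classify_discrimination(a) for a in a_estimates)
print(counts["low"], counts["moderate"], counts["high"], counts["very high"])
```

The tally is 0, 3, 5, and 24 items in the low, moderate, high, and very high bands, which matches the counts reported in the Results.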

To verify that the GR model fits the data, $S - χ^{2}$ statistics were computed. The $S - χ^{2}$ test is an item-specific test that evaluates the correspondence between the observed and expected frequencies (Kang & Chen, 2008; Ranger & Brauer, 2022). A non-significant $S - χ^{2}$ ( $p > 0.01$ ) indicates a good fit (Orlando & Thissen, 2000; Stone & Zhang, 2003).

These analyses were performed by the mirt and itemfit functions in the mirt package (Chalmers, 2012).

DIF Analysis

DIF analyses were conducted to identify discrepancies in responses between participants of different genders. Three nested models were estimated for each item: the first model included only the trait score for the CSI, the second model included the trait score and gender, and the third model included the trait score, gender, and their interaction. A statistically significant difference between the first and third models indicates that a DIF effect is present. A statistically significant difference between the first and second models indicates uniform DIF, and a statistically significant difference between the second and third models indicates nonuniform DIF (Choi et al., 2011). Because the chi-square difference test is oversensitive to sample size and may detect negligible effects of no practical significance, a change in McFadden’s pseudo $R^{2}$ larger than .035 was used as the indicator of significant DIF in this study (Choi et al., 2011; Jodoin & Gierl, 2001; Meade et al., 2008).

These analyses were executed using the lordif function in the lordif package (Choi et al., 2011).
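The decision rule based on the change in McFadden’s pseudo $R^{2}$ can be sketched as below. This is not the lordif implementation: it assumes the log-likelihoods of the null (intercept-only) model and the three nested models are already available, uses the standard definition of McFadden’s pseudo $R^{2}$ (one minus the ratio of model to null log-likelihood), and all function names and log-likelihood values are hypothetical.

```python
def mcfadden_r2(loglik_model, loglik_null):
    """McFadden's pseudo R-squared: 1 - lnL(model) / lnL(null)."""
    return 1.0 - loglik_model / loglik_null


def dif_flags(ll_null, ll_m1, ll_m2, ll_m3, cutoff=0.035):
    """Compare the three nested models and flag pseudo R-squared changes.

    Model 1: trait only; model 2: trait + gender;
    model 3: trait + gender + trait-by-gender interaction.
    """
    r1 = mcfadden_r2(ll_m1, ll_null)
    r2 = mcfadden_r2(ll_m2, ll_null)
    r3 = mcfadden_r2(ll_m3, ll_null)
    return {
        "total": (r3 - r1) > cutoff,       # model 1 vs. model 3
        "uniform": (r2 - r1) > cutoff,     # model 1 vs. model 2
        "nonuniform": (r3 - r2) > cutoff,  # model 2 vs. model 3
    }


# Hypothetical item where gender adds almost nothing beyond the trait.
no_dif = dif_flags(ll_null=-1000.0, ll_m1=-700.0, ll_m2=-698.0, ll_m3=-697.0)

# Hypothetical item where adding gender improves the fit substantially.
uniform_dif = dif_flags(ll_null=-1000.0, ll_m1=-700.0, ll_m2=-650.0, ll_m3=-649.0)

print(no_dif)
print(uniform_dif)
```

In the second hypothetical item, the pseudo $R^{2}$ rises from .30 to .35 when gender is added, so the uniform and total comparisons are flagged while the nonuniform comparison is not.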

Category Response Curves and Information Functions

The category response curves from the GR model represent the probability of responding in each category $k$ , given a participant’s underlying trait; each curve is the difference between the cumulative probabilities (Equation 1) for adjacent category boundaries. Another important concept used in this study is the information function. Specifically, the higher the test information, the more precise the estimated person parameters. The test information is simply the sum of the item information functions at $θ_{i}$ , and it indicates which levels of $θ_{i}$ are most accurately measured (Hambleton et al., 1991). Test information of 5 and 10 from IRT is approximately equivalent to reliability of 0.8 and 0.9 from CTT (Wainer et al., 2000).
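The link between information and reliability can be made concrete with a short sketch. It assumes the common conversion reliability = 1 − 1/information (which presumes a latent trait with unit variance) and the standard GR model item information, the sum over categories of the squared slope of each category curve divided by the category probability. The function names are hypothetical; the parameters are the Table 3 estimates for items 1 and 10.

```python
import math


def boundary_prob(theta, a, b):
    """Cumulative probability of responding in category k or higher."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, bs):
    """GR model item information at theta."""
    cum = [1.0] + [boundary_prob(theta, a, b) for b in bs] + [0.0]
    info = 0.0
    for k in range(len(cum) - 1):
        p_k = cum[k] - cum[k + 1]
        # Each boundary curve has slope a * P * (1 - P); it is 0 at the padded ends.
        slope = a * (cum[k] * (1 - cum[k]) - cum[k + 1] * (1 - cum[k + 1]))
        if p_k > 0:
            info += slope ** 2 / p_k
    return info


def scale_information(theta, items):
    """Test information: the sum of the item information functions."""
    return sum(item_information(theta, a, bs) for a, bs in items)


def reliability_from_information(info):
    """Approximate CTT reliability, assuming unit latent variance."""
    return 1.0 - 1.0 / info


# Items 1 and 10 as reported in Table 3.
item1 = (2.85, [-2.51, -1.99, -1.53, -0.86, -0.07, 0.76])
item10 = (0.69, [-2.77, -1.56, -0.64, 0.27, 1.36])

print(round(item_information(-0.5, *item1), 2))
print(round(item_information(-0.5, *item10), 2))
print(reliability_from_information(5), reliability_from_information(10))
```

The conversion gives 1 − 1/5 = 0.8 and 1 − 1/10 = 0.9, the two benchmarks cited above, and the highly discriminating item 1 contributes far more information than item 10 at the same trait level.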

Results

Descriptive Statistics and Preliminary Analyses

Table 1 shows descriptive statistics for age, duration of intimate relationship, and the CSI scores, grouped by gender. Of the 740 participants, 62.70% were younger than 25 years old, 35.81% were 26 to 40 years old, and 1.49% were older than 41 years old. Regarding the duration of intimate relationship, 18.11% were shorter than 3 months, 13.11% were 4 to 6 months, 15.54% were 7 months to 1 year, 27.70% were 1 year to 3 years, and 25.54% were longer than 3 years. All Cronbach’s α values from CTT for the three samples (total, male, and female) were larger than .95, indicating good reliability for the Chinese version of the CSI.

Table 1.

Descriptive Statistics for the Sample.

	Total	Male	Female
N (%)	740 (100.00%)	235 (31.76%)	505 (68.24%)
Age
Range	15–49	15–49	18–49
M (SD)	24.70 (4.83)	24.73 (4.95)	24.70 (4.77)
Duration of relationship (month)
Range	0–360	0–360	0.5–265
M (SD)	30.37 (38.59)	28.31 (42.08)	31.32 (36.88)
The CSI scale
Range	37–193	37–193	49–193
M (SD)	137.29 (33.40)	138.40 (31.70)	136.78 (34.18)
Skew	−0.32	−0.35	−0.30
Kurtosis	−0.67	−0.33	−0.82
Cronbach’s α	.97	.96	.97

Note. CSI = Couples Satisfaction Index.

Assessing Model Assumptions and Fit

EFA with a one-factor solution explained 49.5% of the total variance, exceeding the 20% minimum, which supports the unidimensionality assumption. Regarding the local independence assumption, of the 496 Yen’s $Q_{3}$ residual correlations (one for each pair of the 32 items), the majority did not exceed the cut-point of 0.2; only 20 values were larger than 0.2, and all of these were smaller than 0.3. These results can be found in Supplemental Table S1.

Table 2 shows results from the goodness-of-fit analysis. It can be seen that all items showed excellent goodness of fit with respect to $S - χ^{2}$ , except that item 19 had a significant $S - χ^{2}$ with the p value less than .01.

Table 2.

Fit Statistics of the CSI Items.

Item	$S - χ^{2}$	p	Item	$S - χ^{2}$	p
1	161.35	.15	17	190.78	.03
2	208.33	.48	18	211.59	.57
3	182.19	.52	19	253.19	.00
4	189.07	.42	20	255.53	.03
5	162.58	.13	21	169.41	.33
6	180.02	.72	22	196.11	.02
7	212.99	.22	23	190.34	.54
8	226.86	.49	24	122.69	.54
9	197.91	.45	25	119.32	.50
10	269.56	.07	26	141.56	.57
11	133.13	.21	27	236.20	.04
12	150.04	.06	28	146.62	.65
13	238.61	.88	29	233.63	.07
14	269.00	.03	30	124.23	.92
15	257.79	.06	31	199.01	.86
16	205.48	.67	32	138.93	.49

Note. The $S - χ^{2}$ value for item 19 is significant ( $p < .01$ ).

Graded Response Model Parameters

Table 3 shows the estimated item parameters and their standard errors (SE) for the GR model. The discrimination parameter estimates ranged from 0.69 to 4.01. The numbers of items with low, moderate, high, and very high discrimination were 0, 3 (items 6, 10, and 15), 5 (items 2, 3, 13, 14, and 20), and 24 (all other items), respectively. The category boundary locations spread across the ability scale, ranging from −2.83 to −1.13, −1.99 to −0.67, −1.53 to −0.14, −0.86 to 0.64, and −0.07 to 1.67 for $b_{1}$ to $b_{5}$ , respectively. They were shifted slightly toward the low end of intimate relationship satisfaction, indicating that participants readily reported being satisfied with their relationships on the CSI scale. As the GR model presumes, the category boundary locations were in ascending order within each item. This means that participants with more satisfying intimate relationships were likely to choose higher scores on the CSI scale. In addition, the SEs of the item parameter estimates were around 0.10.

Table 3.

GR Model Parameters for Each Item.

Item	$a$ (SE)	$b_{1}$ (SE)	$b_{2}$ (SE)	$b_{3}$ (SE)	$b_{4}$ (SE)	$b_{5}$ (SE)	$b_{6}$ (SE)
1	2.85 (0.16)	−2.51 (0.16)	−1.99 (0.11)	−1.53 (0.08)	−0.86 (0.06)	−0.07 (0.05)	0.76 (0.06)
2	1.40 (0.10)	−2.82 (0.20)	−1.96 (0.14)	−1.01 (0.09)	0.07 (0.07)	1.37 (0.11)
3	1.61 (0.11)	−2.73 (0.19)	−1.99 (0.13)	−1.20 (0.09)	−0.27 (0.07)	0.96 (0.09)
4	1.71 (0.11)	−2.83 (0.19)	−1.79 (0.12)	−1.00 (0.08)	−0.04 (0.06)	1.14 (0.09)
5	3.13 (0.17)	−1.97 (0.10)	−1.34 (0.07)	−0.81 (0.06)	−0.10 (0.05)	0.70 (0.06)
6	1.21 (0.10)	−2.32 (0.19)	−1.77 (0.15)	−1.34 (0.12)	−0.77 (0.09)	0.09 (0.08)
7	2.11 (0.13)	−1.92 (0.11)	−1.24 (0.08)	−0.64 (0.06)	−0.07 (0.06)	0.72 (0.07)
8	2.15 (0.13)	−1.41 (0.09)	−0.89 (0.07)	−0.45 (0.06)	0.00 (0.06)	0.63 (0.07)
9	2.64 (0.15)	−1.13 (0.07)	−0.74 (0.06)	−0.46 (0.05)	−0.09 (0.05)	0.52 (0.06)
10	0.69 (0.08)	−2.77 (0.32)	−1.56 (0.20)	−0.64 (0.13)	0.27 (0.12)	1.36 (0.19)
11	3.56 (0.20)	−2.12 (0.11)	−1.49 (0.08)	−0.93 (0.06)	−0.31 (0.05)	0.58 (0.05)
12	3.40 (0.19)	−2.09 (0.11)	−1.52 (0.08)	−1.00 (0.06)	−0.38 (0.05)	0.47 (0.05)
13	1.60 (0.10)	−1.51 (0.11)	−0.67 (0.07)	−0.14 (0.06)	0.44 (0.07)	1.03 (0.09)
14	1.48 (0.10)	−2.49 (0.18)	−1.49 (0.11)	−0.75 (0.08)	−0.05 (0.07)	0.94 (0.09)
15	0.86 (0.09)	−2.82 (0.29)	−1.95 (0.21)	−1.15 (0.14)	−0.49 (0.11)	0.52 (0.11)
16	1.83 (0.11)	−1.58 (0.10)	−0.78 (0.07)	−0.14 (0.06)	0.64 (0.07)	1.67 (0.11)
17	2.89 (0.16)	−1.57 (0.08)	−1.03 (0.06)	−0.46 (0.05)	0.15 (0.05)	0.89 (0.06)
18	2.33 (0.13)	−1.44 (0.09)	−0.82 (0.06)	−0.25 (0.06)	0.29 (0.06)	0.91 (0.07)
19	1.73 (0.12)	−2.69 (0.19)	−1.99 (0.14)	−1.26 (0.09)	−0.59 (0.07)	0.29 (0.07)
20	1.57 (0.10)	−2.40 (0.16)	−1.55 (0.11)	−0.68 (0.07)	0.21 (0.07)	1.18 (0.10)
21	2.22 (0.14)	−2.26 (0.14)	−1.71 (0.10)	−1.04 (0.07)	−0.43 (0.06)	0.49 (0.06)
22	2.66 (0.15)	−2.04 (0.11)	−1.37 (0.08)	−0.68 (0.06)	0.00 (0.05)	0.87 (0.07)
23	2.62 (0.14)	−1.32 (0.08)	−0.78 (0.06)	−0.23 (0.05)	0.34 (0.06)	1.07 (0.07)
24	4.01 (0.22)	−1.73 (0.08)	−1.14 (0.06)	−0.71 (0.05)	−0.12 (0.05)	0.57 (0.05)
25	3.22 (0.18)	−2.10 (0.11)	−1.59 (0.08)	−0.83 (0.06)	0.05 (0.05)	1.04 (0.07)
26	2.47 (0.14)	−2.21 (0.13)	−1.66 (0.09)	−0.94 (0.07)	−0.11 (0.05)	0.78 (0.07)
27	1.87 (0.12)	−2.34 (0.16)	−1.65 (0.11)	−0.97 (0.07)	−0.35 (0.06)	0.54 (0.07)
28	2.51 (0.14)	−2.18 (0.12)	−1.54 (0.09)	−0.81 (0.06)	0.00 (0.05)	0.87 (0.07)
29	1.89 (0.12)	−2.28 (0.15)	−1.41 (0.09)	−0.77 (0.07)	−0.18 (0.06)	0.75 (0.07)
30	3.04 (0.17)	−1.86 (0.10)	−1.24 (0.07)	−0.71 (0.06)	−0.06 (0.05)	0.72 (0.06)
31	1.97 (0.12)	−1.76 (0.11)	−0.97 (0.07)	−0.43 (0.06)	0.15 (0.06)	0.95 (0.08)
32	2.68 (0.15)	−1.90 (0.10)	−1.61 (0.09)	−0.99 (0.06)	−0.20 (0.05)	0.62 (0.06)
Min	0.69	−2.83	−1.99	−1.53	−0.86	−0.07	—
Max	4.01	−1.13	−0.67	−0.14	0.64	1.67	—
M	2.25	−2.10	−1.41	−0.78	−0.09	0.79	—
SD	0.79	0.48	0.41	0.34	0.33	0.36	—

Note. Of the 32 items from the CSI scale, only item 1 was rated on a 7-point Likert scale with six category boundary locations, and the others were rated on a 6-point Likert scale with five category boundary locations.

DIF Analysis

Results for the DIF analysis across the two gender samples are shown in Table 4. The second and third columns in the table are the mean and standard deviation of the raw score of each item. According to the change in McFadden’s pseudo $R^{2}$ , no items were flagged as having significant DIF (all changes in pseudo $R^{2} < 0.035$ ). In other words, no items from the CSI scale functioned differentially between male and female participants. The item parameter estimates obtained separately from the two gender samples were also examined. The estimated item parameters from the two samples and their differences are displayed in Supplemental Tables S2 and S3, respectively. The estimated item parameters were very similar between the two gender samples. Correlations of item parameter estimates between the two subsamples were .89, .66, .71, .85, .82, and .68 for $a$ , $b_{1}$ , $b_{2}$ , $b_{3}$ , $b_{4}$ , and $b_{5}$ , respectively.

Table 4.

Differential Item Functioning (DIF) Results for Each Item.

Item	Male sample	Female sample	DIF	Uniform DIF	Nonuniform DIF
Item	M (SD)	M (SD)	ΔR²: Step 1 vs. Step 3	ΔR²: Step 1 vs. Step 2	ΔR²: Step 2 vs. Step 3
1	5.46 (1.44)	5.38 (1.37)	.00	.00	.00
2	4.24 (1.34)	4.27 (1.33)	.00	.00	.00
3	4.51 (1.28)	4.50 (1.32)	.00	.00	.00
4	4.37 (1.26)	4.34 (1.32)	.00	.00	.00
5	4.34 (1.42)	4.40 (1.39)	.00	.00	.00
6	4.48 (1.79)	4.78 (1.59)	.00	.00	.00
7	4.31 (1.52)	4.20 (1.61)	.00	.00	.00
8	4.25 (1.74)	3.98 (1.79)	.00	.00	.00
9	3.83 (1.94)	4.08 (1.88)	.01	.01	.00
10	3.80 (1.83)	3.95 (1.81)	.00	.00	.00
11	4.56 (1.35)	4.57 (1.30)	.00	.00	.00
12	4.69 (1.29)	4.66 (1.33)	.00	.00	.00
13	3.81 (1.77)	3.56 (1.79)	.00	.00	.00
14	4.28 (1.49)	4.26 (1.53)	.00	.00	.00
15	4.17 (1.78)	4.50 (1.70)	.01	.00	.01
16	3.94 (1.48)	3.37 (1.56)	.01	.01	.00
17	4.17 (1.49)	3.95 (1.60)	.00	.00	.00
18	4.03 (1.59)	3.69 (1.73)	.00	.00	.00
19	4.73 (1.33)	4.89 (1.32)	.00	.00	.00
20	4.11 (1.46)	4.11 (1.43)	.00	.00	.00
21	4.91 (1.26)	4.58 (1.38)	.01	.01	.00
22	4.37 (1.42)	4.23 (1.39)	.00	.00	.00
23	3.84 (1.65)	3.63 (1.68)	.00	.00	.00
24	4.43 (1.48)	4.32 (1.51)	.00	.00	.00
25	4.34 (1.34)	4.29 (1.20)	.00	.00	.00
26	4.57 (1.34)	4.38 (1.31)	.00	.00	.00
27	4.40 (1.54)	4.54 (1.40)	.00	.00	.00
28	4.43 (1.38)	4.27 (1.34)	.00	.00	.00
29	4.32 (1.50)	4.32 (1.49)	.00	.00	.00
30	4.32 (1.54)	4.30 (1.42)	.00	.00	.00
31	3.98 (1.67)	3.92 (1.61)	.00	.00	.00
32	4.44 (1.49)	4.54 (1.33)	.00	.00	.00

Category Response Curves

Figure 1 shows the category response curves for each item. Most items had satisfactory category response curves, with each category displaying discrimination ability to some extent. However, for items 6 (row 1, column 6), 10 (row 2, column 3), and 15 (row 3, column 1), the response curves for categories 2, 3, 4, and 5 nearly coincided and were intertwined with each other. This indicates that the three items failed to utilize all six response options (categories); that is, the four middle response categories (all except categories 1 and 6) provided little information for differentiating participants with various levels of intimate relationship satisfaction. These imperfect performances are in line with the moderate discrimination values of these items in Table 3. Nevertheless, the first and sixth category response curves in these three items (items 6, 10, and 15) differentiated well among individuals located at different points and covered the whole ability scale.

Figure 1.

Category response curves.

Item and Test Information Functions

Figure 2 shows the item information curves for the CSI scale. Similar to category response curves, most items seem to provide enough information across the whole latent trait of intimate relationship satisfaction, except for items 10 (row 2, column 3) and 15 (row 3, column 1) whose information curves were flat and close to 0 across the entire range of the latent scale.

Figure 2.

Item information curves.

Figure 3 shows the test information curve for the assessment of intimate relationship satisfaction. The CSI scale provided good information across the lower end and middle of the latent trait, indicating that the CSI scale measures participants located in that range accurately and reliably. In particular, when the CSI scale was administered to participants whose ability was less than 1.9 or 2.4 (see the two vertical dotted lines at those values in Figure 3), test information exceeded 5 and 10 (equivalent to the traditional reliabilities of 0.80 and 0.90), respectively. The scale provided the largest test information (i.e., highest reliability and accuracy) around $θ = - 1.00$ .

Figure 3.

Test information curve.

Discussion

The current study used IRT techniques to validate the Chinese version of the CSI scale and investigate potential DIF across two gender samples. Generally speaking, the adapted CSI scale showed satisfactory psychometric statistics, in the aspects of assumption checking, GRM fitting, response curves and information, and DIF analysis. Moreover, several findings deserve more attention and discussion.

IRT Assumptions

Two key IRT assumptions, unidimensionality and local independence, were checked in this study. No strong evidence of violation of the two assumptions was observed. Specifically, the EFA results clearly showed that the CSI scale was unidimensional, which is in line with the conclusions from Funk and Rogge (2007) and other studies (Lamela et al., 2020; Okhotnikov & Wood, 2020; Qadir et al., 2013).

For the local independence assumption, Yen’s $Q_{3}$ statistics revealed a slight violation of the assumption for some item pairs. Specifically, there was slight local dependence (LD) between items 11 and 12, items 18 and 21, items 26 and 28, and items 27 and 29. To identify what caused the LD for those item pairs, further analyses were carried out at the levels of item content and response format. The original sentences for items 11 and 12 were “my relationship with my partner makes me happy” and “I have a warm and comfortable relationship with my partner,” respectively. Similarity in their item and response formats, content, and the knowledge and abilities required might have led participants to answer these items in the same manner (Christensen et al., 2017; Jiao et al., 2012; Yen, 1993). The correlation (.81) between items 11 and 12, which can be found in Supplemental Table S4, supported this explanation. The $Q_{3}$ statistic between items 18 and 21 could be explained by the same logic.

The two sets of adjectives for items 26 and 28 in semantic differential format are “boring vs. interesting” and “empty vs. full,” respectively. These two semantic differential items shared the same scoring rubric, which might have caused the LD between them (Baghaei & Aryadoust, 2015; Jiao et al., 2012; Yen, 1993). The LD between items 27 and 29 can be explained by the same logic. Notably, both of these items used the reverse-scored semantic differential scale, which might cause LD and other identification problems (Eichenbaum et al., 2019).

On the other hand, it has been found that a single critical value for the $Q_{3}$ statistic is not appropriate for all situations, as the range of residual correlation values is influenced by a number of factors, such as the number of items and response categories (Christensen et al., 2017). Consequently, a cut-point is not a golden rule for detecting local dependence and deciding the severity of its effect on item functioning. Other statistics and indices, for instance, characteristic curves and information curves, should be considered simultaneously when evaluating the performance of the CSI scale and its items.
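For readers unfamiliar with the statistic, $Q_{3}$ for a pair of items is the correlation between their residuals, that is, the observed item scores minus the scores expected under the fitted model. The sketch below is a minimal illustration with hypothetical data. In practice the expected scores would come from the fitted GR model at each respondent's estimated trait level, whereas here they are fixed at 3.0 purely for demonstration. Flagging pairs relative to the average off-diagonal value, with an illustrative margin of 0.2, is one possible alternative to a single fixed cut-point and is an assumption of this example, not the procedure used in this study.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def q3_matrix(observed, expected):
    """Q3 for every item pair: correlation of residuals (observed - expected).

    observed, expected: lists of rows (respondents) by columns (items).
    """
    residuals = [[o - e for o, e in zip(orow, erow)]
                 for orow, erow in zip(observed, expected)]
    cols = list(zip(*residuals))  # one residual vector per item
    n_items = len(cols)
    return [[pearson(cols[i], cols[j]) for j in range(n_items)]
            for i in range(n_items)]

def flag_pairs(q3, margin=0.2):
    """Flag item pairs whose Q3 exceeds the mean off-diagonal Q3 by `margin`.

    The margin is a hypothetical illustrative value.
    """
    n = len(q3)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    mean_q3 = sum(q3[i][j] for i, j in pairs) / len(pairs)
    return [(i, j) for i, j in pairs if q3[i][j] - mean_q3 > margin]

# Hypothetical data: 5 respondents by 3 items, expected scores fixed at 3.0
observed = [[4, 4, 2], [2, 2, 4], [3, 3, 3], [5, 5, 2], [1, 1, 4]]
expected = [[3.0, 3.0, 3.0] for _ in observed]
q3 = q3_matrix(observed, expected)
flagged = flag_pairs(q3)
```

In this toy data set the first two items have identical residual patterns, so only that pair is flagged.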

Overall, it could be concluded that the two key IRT assumptions held reasonably well. Nevertheless, more evidence on the local independence assumption should be gathered in further studies.

Graded Response Model

After fitting the GR model, all items from the CSI scale showed parameter estimates (i.e., $a_{j}$ s and $b_{j k}$ s) with low standard errors, indicating that the estimation results were acceptable and accurate. However, the category response curves for items 6, 10, and 15 were somewhat unsatisfactory: the intermediate categories were intertwined with each other and played little role in distinguishing participants with various levels of intimate relationship satisfaction. In other words, if these items were dichotomized from six categories into two categories, they might function in much the same way. The item information curves in Figure 2 showed a similar trend, in that these three items provided less information for assessing abilities than the other items. Nonetheless, when viewing all 32 items collectively, the test information curve for the CSI scale captured a meaningful level of information, as shown in Figure 3. This implied that the other items compensated for the lower information provided by these three items, and the CSI scale as a whole still functioned well. Similarly, Funk and Rogge (2007) retained three slightly less informative items from the Dyadic Adjustment Scale (Spanier, 1976) in their original version of the CSI, as they were some of the only items that provided information at the highest levels of relationship satisfaction.

Differential Item Functioning

This study also aimed to verify whether DIF exists in the CSI scale between the two gender groups. As the results indicated, no significant DIF was detected, which supports the equity and validity of the CSI scale. It is important to note that the absence of DIF between gender groups does not indicate that little or no gender difference exists in intimate relationships. On the contrary, men and women are distinct from each other in many aspects of intimate relationships (Caldwell et al., 2012; Hamby, 2014; Mark & Murray, 2012), while also sharing some similarities (Fagan & Wright, 2011; Larsen et al., 2011; Romito et al., 2013). As Bradbury and Karney (2014) stated, “our sex consistently accounts for how we think about intimacy, pursue and maintain intimacy, repair rifts in our intimate relationships, and respond when intimacy is threatened or lost.” As an effective scale without DIF, the CSI scale can be readily used in studies of the similarities and differences in intimate relationships between genders.

Conclusion

The Couples Satisfaction Index (CSI) is widely used in the assessment of intimate relationships. However, there has been no research on adapting it to the Chinese context. This study employed IRT techniques to validate the revised Chinese CSI and examine its fairness across gender groups. The statistical analyses covered unidimensionality, local independence, fit of the GR model, DIF, category response curves, and information functions. Overall, the adapted Chinese version of the CSI scale showed satisfactory psychometric performance and exhibited no systematic DIF between genders. It is expected that the results of this study will facilitate the advancement of intimate relationship studies.

Several limitations should be taken into consideration when interpreting the results of this study. First, although we tried to recruit as many participants as possible, the sample was not large or representative enough to generalize the results to other age groups, ethnic groups, or Chinese people living in other cultures (e.g., Chinese Americans). Second, with respect to the IRT analyses, there were a few cases where the performance of the CSI scale was not completely satisfactory, which calls for further validation evidence from other applications and statistical techniques. Examples include checking the unidimensionality assumption using confirmatory factor analysis (CFA), revising or removing items to alleviate the dependence between several item pairs, and improving the performance of the scale for participants with high or low satisfaction.

Supplemental Material

Supplemental material, sj-docx-1-sgo-10.1177_21582440241271087 for The Chinese Version of the Couples Satisfaction Index: Psychometric Assessment and Differential Item Functioning Analysis with Item Response Theory by Shaojie Wang, Won-Chan Lee and Huixia Ma in SAGE Open

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Ethical Approval

ORCID iD

Shaojie Wang

Data Availability Statement

Data are available upon request.

References

Acevedo-Mesa, Tendeiro J. N., Roest, Rosmalen J. G., Monden (2021). Improving the measurement of functional somatic symptoms with item response theory. Assessment, 28(8), 1960–1970. https://doi.org/10.1177/1073191120947153

Anderson J. R., Johnson M. D., Liu, Zheng, Hardy N. R., Lindstrom R. A. (2014). Young adult romantic relationships in Mainland China: Perceptions of family of origin functioning are directly and indirectly associated with relationship success. Journal of Social and Personal Relationships, 31(7), 871–887. https://doi.org/10.1177/0265407513508727

Baghaei, Aryadoust (2015). Modeling local item dependence due to common test format with a multidimensional Rasch model. International Journal of Testing, 15(1), 71–87. https://doi.org/10.1080/15305058.2014.941108

Baker F. B., Kim S. H. (2017). The basics of item response theory using R. Springer. https://doi.org/10.1007/978-3-319-54205-8

Bareket, Kahalon, Shnabel, Glick (2018). The Madonna-Whore Dichotomy: Men who perceive women’s nurturance and sexuality as mutually exclusive endorse patriarchy and show lower relationship satisfaction. Sex Roles, 79(9), 519–532. https://doi.org/10.1007/s11199-018-0895-7

Beaton D. E., Bombardier, Guillemin, Ferraz M. B. (2000). Guidelines for the process of cross-cultural adaptation of self-report measures. Spine, 25(24), 3186–3191. https://doi.org/10.1097/00007632-200012150-00014

Berk L. E. (2018). Development through the lifespan (7th ed.). Pearson Education.

Bliton C. F., Wolford-Clevenger, Zapor, Elmquist, Brem M. J., Shorey R. C., Stuart G. L. (2016). Emotion dysregulation, gender, and intimate partner violence perpetration: An exploratory study in college students. Journal of Family Violence, 31, 371–377. https://doi.org/10.1007/s10896-015-9772-0

Blow A. J., Farero, Ganoczy, Walters, Valenstein (2019). Intimate relationships buffer suicidality in national guard service members: A longitudinal study. Suicide and Life-Threatening Behavior, 49(6), 1523–1540. https://doi.org/10.1111/sltb.12537

Bradbury T. N., Karney B. R. (2014). Intimate relationships (2nd ed.). W. W. Norton.

Bradbury T. N., Karney B. R. (2019). Intimate relationships (3rd ed.). W. W. Norton.

Brislin R. W. (1970). Back-translation for cross-cultural research. Journal of Cross-Cultural Psychology, 1(3), 185–216. https://doi.org/10.1177/135910457000100301

Caldwell J. E., Swan S. C., Woodbrown V. D. (2012). Gender differences in intimate partner violence outcomes. Psychology of Violence, 2(1), 42. https://doi.org/10.1037/a0026296

Chalmers R. P. (2012). Mirt: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48(6), 1–29. https://doi.org/10.18637/jss.v048.i06

Chen W. H., Thissen (1997). Local dependence indexes for item pairs using item response theory. Journal of Educational and Behavioral Statistics, 22(3), 265–289. https://doi.org/10.3102/10769986022003265

Choi S. W., Gibbons L. E., Crane P. K. (2011). Lordif: An R package for detecting differential item functioning using iterative hybrid ordinal logistic regression/item response theory and Monte Carlo simulations. Journal of Statistical Software, 39(8), 1. https://doi.org/10.18637/jss.v039.i08

Christensen K. B., Makransky, Horton (2017). Critical values for Yen’s Q3: Identification of local dependence in the Rasch model using residual correlations. Applied Psychological Measurement, 41(3), 178–194. https://doi.org/10.1177/0146621616677520

De Ayala R. J. (2013). The theory and practice of item response theory. Guilford Publications.

Deitz S. L., Anderson J. R., Johnson M. D., Hardy N. R., Zheng, Liu (2015). Young romance in China: Effects of family, attachment, relationship confidence, and problem solving. Personal Relationships, 22(2), 243–258. https://doi.org/10.1111/pere.12077

Doss B. D., Roddy M. K., Nowlan K. M., Rothman, Christensen (2019). Maintenance of gains in relationship and individual functioning following the online OurRelationship program. Behavior Therapy, 50(1), 73–86. https://doi.org/10.1016/j.beth.2018.03.011

Eichenbaum A. E., Marcus D. K., French B. F. (2019). Item response theory analysis of the psychopathic personality inventory-revised. Assessment, 26(6), 1046–1058. https://doi.org/10.1177/1073191117715729

El Frenn, Akel, Hallit, Obeid (2022). Couple’s satisfaction among Lebanese adults: Validation of the Toronto Alexithymia scale and Couple Satisfaction Index-4 scales, association with attachment styles and mediating role of alexithymia. BMC Psychology, 10(1), 1–10. https://doi.org/10.1186/s40359-022-00719-6

Embretson S. E., Reise S. P. (2013). Item response theory. Psychology Press. https://doi.org/10.4324/9781410605269

Erikson (1963). Childhood and society. Norton.

Fagan A. A., Wright E. M. (2011). Gender differences in the effects of exposure to intimate partner violence on adolescent violence and drug use. Child Abuse & Neglect, 35(7), 543–550. https://doi.org/10.1016/j.chiabu.2011.05.001

Fletcher G. J., Kerr P. S., N. P., Valentine K. A. (2014). Predicting romantic interest and decisions in the very early stages of mate selection: Standards, accuracy, and sex differences. Personality and Social Psychology Bulletin, 40(4), 540–550. https://doi.org/10.1177/0146167213519481

Fuller K. A., Riggs D. W. (2021). Intimate relationship strengths and challenges amongst a sample of transgender people living in the United States. Sexual and Relationship Therapy, 36(4), 399–412. https://doi.org/10.1080/14681994.2019.1679765

Funk J. L., Rogge R. D. (2007). Testing the ruler with item response theory: Increasing precision of measurement for relationship satisfaction with the Couples Satisfaction Index. Journal of Family Psychology, 21(4), 572–583. https://doi.org/10.1037/0893-3200.21.4.572

Gerrig R. J. (2013). Psychology and life (20th ed.). Pearson.

Glowacka, Bergeron, Dubé, Rosen N. O. (2018). When self-worth is tied to one’s sexual and romantic relationship: Associations with well-being in couples coping with genito-pelvic pain. Archives of Sexual Behavior, 47(6), 1649–1661. https://doi.org/10.1007/s10508-017-1126-y

Graham J. M., Diebels K. J., Barnow Z. B. (2011). The reliability of relationship satisfaction: A reliability generalization meta-analysis. Journal of Family Psychology, 25(1), 39. https://doi.org/10.1037/a0022441

Halford W. K., Bodenmann (2013). Effects of relationship education on maintenance of couple relationship satisfaction. Clinical Psychology Review, 33(4), 512–525. https://doi.org/10.1016/j.cpr.2013.02.001

Halford W. K., Lee, Hiew D. N., van de Vijver F. J. (2018). Indirect couple communication and relationship satisfaction in Chinese, Western, and Chinese-Western intercultural couples. Couple and Family Psychology: Research and Practice, 7(3–4), 183. https://doi.org/10.1037/cfp0000109

Halford W. K., Leung P. W., Hung-Cheung, Chau-Wan, Hiew D. N., van de Vijver F. J. (2018). Relationship standards and relationship satisfaction in Chinese, Western, and intercultural couples living in Australia and Hong Kong, China. Couple and Family Psychology: Research and Practice, 7(3–4), 127. https://doi.org/10.1037/cfp0000104

Hambleton R. K., Swaminathan, Rogers H. J. (1991). Fundamentals of item response theory. Sage.

Hamby (2014). Intimate partner and sexual violence research: Scientific progress, scientific challenges, and gender. Trauma, Violence, & Abuse, 15(3), 149–158. https://doi.org/10.1177/1524838014520723

Harrington A. G., Overall N. C., Cross E. J. (2021). Masculine gender role stress, low relationship power, and aggression toward intimate partners. Psychology of Men & Masculinities, 22(1), 48–62. https://doi.org/10.1037/men0000262

Hattie (1985). Methodology review: Assessing unidimensionality of tests and items. Applied Psychological Measurement, 9(2), 139–164. https://doi.org/10.1177/014662168500900204

Hendrick S. S. (1988). A generic measure of relationship satisfaction. Journal of Marriage and the Family, 50(1), 93–98. https://doi.org/10.2307/352430

Hiew D. N., Halford W. K., Van de Vijver F. J., Liu (2015). Relationship standards and satisfaction in Chinese, Western, and intercultural Chinese-Western couples in Australia. Journal of Cross-Cultural Psychology, 46(5), 684–701. https://doi.org/10.1177/0022022115579936

Horne R. M., Johnson M. D. (2018). Gender role attitudes, relationship efficacy, and self-disclosure in intimate relationships. The Journal of Social Psychology, 158(1), 37–50. https://doi.org/10.1080/00224545.2017.1297288

Houts C. R., Edwards M. C. (2013). The performance of local dependence measures with psychological data. Applied Psychological Measurement, 37(7), 541–562. https://doi.org/10.1177/0146621613491456

Jarrett A. S., Anderson R. C. (2023). Is the grass really greener? The influence of gender identity and sexual orientation on mate copying behaviors. The Journal of Sex Research, 60(3), 418–427. https://doi.org/10.1080/00224499.2022.2078949

Jiao, Kamata, Wang, Jin (2012). A multilevel testlet model for dual local dependence. Journal of Educational Measurement, 49(1), 82–100. https://doi.org/10.1111/j.1745-3984.2011.00161.x

Jodoin M. G., Gierl M. J. (2001). Evaluating type I error and power rates using an effect size measure with the logistic regression procedure for DIF detection. Applied Measurement in Education, 14(4), 329–349. https://doi.org/10.1207/S15324818AME1404_2

Johnson M. D., Nguyen, Anderson J. R., Liu, Vennum (2015). Shame proneness and intimate relations in Mainland China. Personal Relationships, 22(2), 335–347. https://doi.org/10.1111/pere.12083

Johnson S. M. (2019). Attachment theory in practice: Emotionally focused therapy (EFT) with individuals, couples, and families. Guilford Publications.

Kang, Chen T. T. (2008). Performance of the generalized S-χ² item fit index for polytomous IRT models. Journal of Educational Measurement, 45(4), 391–406. https://doi.org/10.1111/j.1745-3984.2008.00071.x

Karney B. R., Bradbury T. N. (1997). Neuroticism, marital interaction, and the trajectory of marital satisfaction. Journal of Personality and Social Psychology, 72(5), 1075. https://doi.org/10.1037/0022-3514.72.5.1075

Karney B. R., Bradbury T. N. (2020). Research on marital satisfaction and stability in the 2010s: Challenging conventional wisdom. Journal of Marriage and Family, 82(1), 100–116. https://doi.org/10.1111/jomf.12635

Kawachi, Berkman L. F. (2001). Social ties and mental health. Journal of Urban Health, 78(3), 458–467. https://doi.org/10.1093/jurban/78.3.458

Kaya, Halford W. K., Hiew D. N., Sheffield, Van De Vijver F. J. (2019). Ethnic identification and relationship satisfaction in Chinese, Western, and intercultural Chinese-Western couples. Couple and Family Psychology: Research and Practice, 8(3), 121. https://doi.org/10.1037/cfp0000120

Kim, Edwards A. B., Sweeney K. A., Wetchler J. L. (2012). The effects of differentiation and attachment on satisfaction and acculturation in Asian-White American international couple relationships: Assessment with Chinese, South Korean, and Japanese partners in relationships with white American partners in the United States. The American Journal of Family Therapy, 40(4), 320–335. https://doi.org/10.1080/01926187.2011.616409

Kirchner-Häusler, Boiger, Uchida, Higuchi, Uchida, Mesquita (2022). Relatively happy: The role of the positive-to-negative affect ratio in Japanese and Belgian couples. Journal of Cross-Cultural Psychology, 53(1), 66–86. https://doi.org/10.1177/00220221211051016

Lamela, Figueiredo, Morais, Matos, Jongenelen (2020). Are measures of marital satisfaction valid for women with depressive symptoms? The examination of factor structure and measurement invariance of the Couple Satisfaction Index-4 across depression levels in Portuguese women. Clinical Psychology & Psychotherapy, 27(2), 214–219. https://doi.org/10.1002/cpp.2420

Larsen C. D., Sandberg J. G., Harper J. M., Bean (2011). The effects of childhood abuse on relationship quality: Gender differences and clinical implications. Family Relations, 60(4), 435–445. https://doi.org/10.1111/j.1741-3729.2011.00661.x

Locke H. J., Wallace K. M. (1959). Short marital-adjustment and prediction tests: Their reliability and validity. Marriage and Family Living, 21(3), 251–255. https://doi.org/10.2307/348022

Mark K. P., Murray S. H. (2012). Gender differences in desire discrepancy as a predictor of sexual and relationship satisfaction in a college sample of heterosexual romantic relationships. Journal of Sex & Marital Therapy, 38(2), 198–215. https://doi.org/10.1080/0092623X.2011.606877

Mattson R. E., Rogge R. D., Johnson M. D., Davidson E. K., Fincham F. D. (2013). The positive and negative semantic dimensions of relationship satisfaction. Personal Relationships, 20(2), 328–355. https://doi.org/10.1111/j.1475-6811.2012.01412.x

McDaniel B. T., Galovan A. M., Cravens J. D., Drouin (2018). “Technoference” and implications for mothers’ and fathers’ couple and coparenting relationship quality. Computers in Human Behavior, 80, 303–313. https://doi.org/10.1016/j.chb.2017.11.019

Meade A. W., Johnson E. C., Braddy P. W. (2008). Power and sensitivity of alternative fit indices in tests of measurement invariance. Journal of Applied Psychology, 93(3), 568. https://doi.org/10.1037/0021-9010.93.3.568

Messinger A. M., Birmingham R. S., DeKeseredy W. S. (2021). Perceptions of same-gender and different-gender intimate partner cyber-monitoring. Journal of Interpersonal Violence, 36(7–8), NP4315–NP4335. https://doi.org/10.1177/0886260518787814

Nichols C. W., Schumm W. R., Schectman K. L., Grigsby C. C. (1983). Characteristics of responses to the Kansas Marital Satisfaction Scale by a sample of 84 married mothers. Psychological Reports, 53(2), 567–572. https://doi.org/10.2466/pr0.1983.53.2.567

Norton (1983). Measuring marital quality: A critical look at the dependent variable. Journal of Marriage and the Family, 45(1), 141–151. https://doi.org/10.2307/351302

O’Connor, Nikolova, Cardenas, Snyder (2023). The mediating effect of traditional gender beliefs on the relationship between gender disparities and intimate partner violence perpetration. Journal of Aggression, Maltreatment & Trauma, 32(1–2), 53–70. https://doi.org/10.1080/10926771.2022.2088322

Okhotnikov I. A., Wood N. D. (2020). Adaptation of the couples satisfaction index into Russian. Contemporary Family Therapy, 42(2), 140–151. https://doi.org/10.1007/s10591-019-09517-6

Orlando, Thissen (2000). Likelihood-based item-fit indices for dichotomous item response theory models. Applied Psychological Measurement, 24(1), 50–64. https://doi.org/10.1177/01466216000241003

Papp L. M. (2018). Topics of marital conflict in the everyday lives of empty nest couples and their implications for conflict resolution. Journal of Couple & Relationship Therapy, 17(1), 7–24. https://doi.org/10.1080/15332691.2017.1302377

Parung G. E., Ferreira (2017). Work-life balance, couple satisfaction, and father involvement: A cross-cultural study. ANIMA Indonesian Psychological Journal, 32(4), 201–216. https://doi.org/10.24123/aipj.v32i4.851

Patterson, Williams, Edwards T. M., Chamow, Grauf-Grounds (2018). Essential skills in family therapy: From the first interview to termination. Guilford Publications.

Pfaff, Schlarb A. A. (2022). Child maltreatment and sleep: Two pathways explaining the link. Journal of Sleep Research, 31(2), e13455. https://doi.org/10.1111/jsr.13455

Pinto, Grover, Dhooria, Rathi, Sharma (2019). Sexual functioning and its correlates in premenopausal married Indian women with systemic lupus erythematosus. International Journal of Rheumatic Diseases, 22(10), 1814–1819. https://doi.org/10.1111/1756-185X.13675

Policastro, Finn M. A. (2021). Coercive control in intimate relationships: Differences across age and sex. Journal of Interpersonal Violence, 36(3–4), 1520–1543. https://doi.org/10.1177/0886260517743548

Qadir, Khalid, Haqqani, Medhin (2013). The association of marital relationship and perceived social support with mental health of women in Pakistan. BMC Public Health, 13(1), 1–13. https://doi.org/10.1186/1471-2458-13-1150

Ranger, Brauer (2022). On the generalized S-χ² test of item fit: Some variants, residuals, and a graphical visualization. Journal of Educational and Behavioral Statistics, 47(2), 202–230. https://doi.org/10.3102/10769986211050304

Rauch-Anderegg, Kuhn, Milek, Halford W. K., Bodenmann (2020). Relationship behaviors across the transition to parenthood. Journal of Family Issues, 41(4), 483–506. https://doi.org/10.1177/0192513X19878864

R Core Team. (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing.

Reckase M. D. (1979). Unifactor latent trait models applied to multifactor tests: Results and implications. Journal of Educational Statistics, 4(3), 207–230. https://doi.org/10.3102/10769986004003207

Revelle (2016). Psych: Procedures for personality and psychological research. R package Version 1.6.6. http://cran.r-project.org/package=psych.

Romito, Beltramini, Escribà-Agüir (2013). Intimate partner violence and mental health among Italian adolescents: Gender similarities and differences. Violence Against Women, 19(1), 89–106. https://doi.org/10.1177/1077801212475339

Saavedra M. C., Chapman K. E., Rogge R. D. (2010). Clarifying links between attachment and relationship quality: Hostile conflict and mindfulness as moderators. Journal of Family Psychology, 24(4), 380. https://doi.org/10.1037/a0019872

Samejima (1968). Estimation of latent ability using a response pattern of graded scores. ETS Research Bulletin Series, 1968(1), 1–169. https://doi.org/10.1002/j.2333-8504.1968.tb00153.x

Schlagintweit H. E., Bailey, Rosen N. O. (2016). A new baby in the bedroom: Frequency and severity of postpartum sexual concerns and their associations with relationship satisfaction in new parent couples. The Journal of Sexual Medicine, 13(10), 1455–1465. https://doi.org/10.1016/j.jsxm.2016.08.006

Spanier G. B. (1976). Measuring dyadic adjustment: New scales for assessing the quality of marriage and similar dyads. Journal of Marriage and the Family, 38(1), 15–28. https://doi.org/10.2307/350547

Stone C. A., Zhang (2003). Assessing goodness of fit of item response theory models: A comparison of traditional and alternative procedures. Journal of Educational Measurement, 40(4), 331–352. https://doi.org/10.1111/j.1745-3984.2003.tb01150.x

Umberson, Thomeer M. B., Lodge A. C. (2015). Intimacy and emotion work in lesbian, gay, and heterosexual relationships. Journal of Marriage and Family, 77(2), 542–556. https://doi.org/10.1111/jomf.12178

Visschers, Jaspaert, Vervaeke (2017). Social desirability in intimate partner violence and relationship satisfaction reports: An exploratory analysis. Journal of Interpersonal Violence, 32(9), 1401–1420. https://doi.org/10.1177/0886260515588922

Wainer, Dorans N. J., Flaugher, Green B. F., Mislevy R. J. (2000). Computerized adaptive testing: A primer. Routledge. https://doi.org/10.4324/9781410605931

Whisman M. A., Baucom D. H. (2012). Intimate relationships and psychopathology. Clinical Child and Family Psychology Review, 15(1), 4–13. https://doi.org/10.1007/s10567-011-0107-2

Williamson H. C. (2020). Early effects of the COVID-19 pandemic on relationship satisfaction and attributions. Psychological Science, 31(12), 1479–1487. https://doi.org/10.1177/0956797620972688

Winstok, Smadar-Dror (2021). Gender, escalatory tendencies, and verbal aggression in intimate relationships. Journal of Interpersonal Violence, 36(11–12), 5383–5400. https://doi.org/10.1177/0886260518805764

Xiao, Qian (2020). Mate selection among online daters in Shanghai: Why does education matter? Chinese Journal of Sociology, 6(4), 521–546. https://doi.org/10.1177/2057150X20957422

Xue W. L., H. G., Chua Y. J., Wang, Shorey (2018). Factors influencing first-time fathers’ involvement in their wives’ pregnancy and childbirth: A correlational study. Midwifery, 62, 20–28. https://doi.org/10.1016/j.midw.2018.03.002

Yen W. M. (1984). Effects of local item dependence on the fit and equating performance of the three-parameter logistic model. Applied Psychological Measurement, 8(2), 125–145. https://doi.org/10.1177/014662168400800201

Yen W. M. (1993). Scaling performance assessments: Strategies for managing local item dependence. Journal of Educational Measurement, 30(3), 187–213. https://doi.org/10.1111/j.1745-3984.1993.tb00423.x

Yeung N. C., Zhang (2020). Finding the silver linings: Psychosocial correlates of posttraumatic growth among husbands of Chinese breast cancer survivors. Psycho-Oncology, 29(10), 1646–1654. https://doi.org/10.1002/pon.5484

Zhang, Wang, Creedy D. K. (2022). Prevalence of stress and depression and associated factors among women seeking a first-trimester induced abortion in China: A cross-sectional study. Reproductive Health, 19(1), 1–11. https://doi.org/10.1186/s12978-022-01366-1
