
Abstract

Many areas of psychological science rely heavily on theoretical constructs, such as personality traits, attitudes, and emotions, and many of these measured constructs are defined by a continuum that represents the different degrees of the attribute. However, these continua are not usually considered by psychologists during the process of scale development and validation. Unfortunately, this can lead to numerous scientific problems, such as incomplete measurement of the construct, difficulties in distinguishing between constructs, and compromised evidence for validity. The purpose of the current article is to propose an approach for carefully considering these issues in psychological measurement. This approach, which we term continuum specification, is a two-stage process in which the researcher defines and then properly operationalizes the target continuum. Defining the continuum involves specifying its polarity (i.e., the meaning of its poles, or ends) and the nature of its gradations (i.e., the quality that separates high from low scores). Operationalizing the continuum means using this definition to develop a measure that (a) sufficiently captures the entire continuum, (b) has appropriate response options, (c) uses correct procedures for assessing dimensionality, and (d) accounts for the underlying response process. These issues have significant implications for psychological measurement.

Keywords

construct validity, factor analysis, item response theory (IRT), emotions, happiness, personality, attitudes, ideal-point models, measurement, operationalization, response format, response process

In psychological science, constructs are the core building blocks for empirical and theoretical work. Although there are differing philosophical interpretations of constructs (Bollen, 2002; Borsboom, Mellenbergh, & van Heerden, 2003), a dominant view is that they are abstractions of covarying patterns within the observable world (Ghiselli, 1964; Nunnally, 1978). In view of this, evidence for construct validity (Binning & Barrett, 1989; Messick, 1995) has been heavily emphasized in psychology and related fields (Bagozzi, Yi, & Phillips, 1991; Binning & Barrett, 1989; Clark & Watson, 1995; Cronbach & Meehl, 1955; Messick, 1995).

Although this is frequently overlooked, many constructs themselves are defined by a continuum that represents the varying magnitudes of the attribute of interest (e.g., low, moderate, and high). This unidimensional continuum is the basis of psychological scaling, as described by Thurstone (1928): “We may frankly represent this linearity in the form of a unidimensional scale” (p. 538). The continuum is also the basis of the variability observed across individuals, and accurate measurement entails assigning individuals scores that match their location on this theoretical continuum. Despite the relevance of these continua to psychological measurement, explicit thinking about them has not yet become an established part of developing and validating construct measures. Instead, standard practice is to define and operationalize psychological constructs without articulating their underlying continua. This is problematic because it leads to untested assumptions that exert influence on virtually all phases of construct validation, from developing operational measures to accumulating evidence of validity. We give two illustrative examples of the possible ill effects of the current standard practice:

Example 1: Bipolarity assessments. Often, psychologists want to understand whether two seemingly opposed constructs are truly bipolar opposites. For instance, there has been a long-standing debate as to whether happiness and sadness are truly bipolar (i.e., a single construct with opposing qualities) or simply bivariate (i.e., two distinct unipolar constructs; see Greenwald, 2012). The standard approach for assessing bipolarity uses the correlation between measures of the two constructs; a strong negative association is taken to indicate bipolarity, and anything else is taken to indicate nonbipolarity (Russell & Carroll, 1999). However, bipolarity assessments are complicated by whether there is overlap in the continua of the two constructs. When these continua are left undefined and not carefully operationalized, clear tests of bipolarity are not possible.

Example 2: Tests of curvilinearity. In another common scenario, psychologists want to examine the scientific hypothesis that the association between a trait and an outcome varies across levels of that trait (i.e., there is a curvilinear relationship). A sufficient test of this hypothesis entails that the full continuum has been adequately measured. That is, the measurements used in the analysis must have adequately captured the range where curvilinear effects are expected to occur (e.g., Grant, 2013). Careful attention needs to be placed on the construct continuum and its measurement in order to address this issue.

These two examples serve to illustrate several key points regarding construct continua. First, they show that each construct is characterized by an underlying continuum whose gradations, or degrees, represent variability in psychological attributes. Second, how these continua are operationalized significantly influences the quality of many scientific inferences that might be drawn. Third, when these continua are left undefined and indeterminate, it is difficult to have clarity on the nomological network. Finally, it is necessary to consider, specify, and operationalize these continua in the broader process of psychological measurement and construct validation.

Because the continuum is crucial to scientific inferences, we argue that construct validation within psychology should adopt the practice of continuum specification. Continuum specification is the process of theoretically defining a continuum and then properly operationalizing it within a measure. This process creates clarity for researchers when they are developing measures and trying to understand interrelations among constructs. In the remainder of this article, we (a) describe why the construct continuum has been ignored in psychology, (b) review problems that occur when the continuum is not considered, and (c) detail the process of continuum specification.

History of the Construct Continuum

Historically, the construct continuum was a fundamental concept in psychological measurement. Psychological scaling was closely tied to psychophysical scaling that used just-noticeable differences between stimuli along a continuum to identify units of measurement (Coombs, 1951; Thurstone, 1927b). Using this model, Thurstone (1927b) characterized psychological measurement as the enumeration of variability along a continuum. Therefore, quantitative methods used for scoring individuals required creating items along the continuum, and individuals could be located on the continuum by the distribution of the items they endorsed (Thurstone, 1928).

Despite the importance of the continuum, techniques for creating measures and scoring individuals were complex and cumbersome (e.g., pairwise comparisons; Thurstone, 1927a). Thurstone (1928) described the scale-development process as requiring more than 100 items from various regions of the continuum and then several hundred readers to sort the items into categories ranging from the most negative to the most positive (e.g., his example contained five categories for negative statements, one category for neutral statements, and five categories for positive statements). After eliminating ambiguous and problematic items, one would obtain a final set of 20 items evenly spread along the continuum. To simplify the process of psychological measurement, Likert (1932) proposed an alternative technique that was significantly less complicated but still yielded reliable assessments. In his approach, items were conceived as parallel tests, so that “each statement [was] a scale in itself” (Likert, 1932, p. 24). Instead of the items, it was the response options (e.g., from “strongly disagree” to “strongly agree”) that divided the continuum. The role of the response options was to partition the continuum into discrete sections; each response option corresponded to a unique and distinct region (i.e., k continuum regions for k response options). Simple summing of responses across the items produced variability on the construct of interest, and the construct continuum, which featured prominently in every phase of Thurstone scaling, became relegated to the background. The pragmatic advantages of Likert-type scaling resulted in a large increase in its popularity, and this method became the mainstay of the self-report technique (Schwarz, 1999).
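
The partitioning role of the response options can be sketched in a few lines of Python. This is a minimal illustration and not Likert's (1932) own procedure: the cut points in THRESHOLDS, the function names, and the idea of handing each item a numeric position on the continuum are all assumptions introduced here.

```python
from bisect import bisect_right

# Hypothetical cut points: k - 1 = 4 thresholds split the continuum into
# k = 5 regions, one per response option (1 = "strongly disagree",
# 5 = "strongly agree").
THRESHOLDS = [-1.5, -0.5, 0.5, 1.5]


def likert_response(position, thresholds=THRESHOLDS):
    """Return the response option whose region contains the given position."""
    return bisect_right(thresholds, position) + 1


def summed_score(positions, thresholds=THRESHOLDS):
    """Score a respondent by simple summing of responses across items."""
    return sum(likert_response(p, thresholds) for p in positions)


# Hypothetical positions expressed by one respondent on three items.
print(likert_response(-2.0))           # lowest region -> option 1
print(likert_response(0.0))            # middle region -> option 3
print(summed_score([-2.0, 0.0, 2.0]))  # 1 + 3 + 5 = 9
```

In this sketch the continuum enters only through the thresholds; the item content plays no role in locating the respondent, which mirrors how the continuum receded into the background under summated scoring.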

Consequences of Continuum Neglect

Likert scaling was a substantial innovation in psychological measurement. It allowed scale development to be much less time-consuming and burdensome than it was as originally conceived by Thurstone (1928). Unfortunately, whether it was intended or not, these pragmatic advantages have led to omission of the continuum from psychological measurement. Currently, psychologists are not required or encouraged to think about the continuum of interest when generating items, determining response options, summing scores, and conducting construct validation. In the introduction, we supplied two brief examples of how this omission could lead to problems in assessment of bipolarity and testing for curvilinear relations. In this section, we describe further problems that can be caused by continuum neglect (see Table 1 for a summary of potential measurement problems caused by ignoring the continuum).

Table 1.

Problems Caused by Ignoring the Construct Continuum and Their Solutions

Measurement or validation issue	Problem	Solution
Validity of bipolarity tests	Two constructs are bipolar if they have a strong negative correlation. However, the expected size of the correlation (e.g., –1.00, –.50) depends on how their continua are defined.	Specifying the poles and gradation of the continua allows for clear tests of bipolarity. It sheds light on what correlation one may expect if bipolarity truly exists.
Validity of tests of curvilinearity and moderation	A sufficient test of curvilinear or moderated relations requires that the whole continuum has been properly captured, especially at the region where the curvilinear or moderated effect is expected to occur. Without evidence, one cannot assume that the whole continuum has been properly captured.	Operationalizing the continuum entails checking scale items to see if they can generate enough variability along the entire theoretical continuum to sufficiently capture it (i.e., if they discriminate well across the continuum).
Polarity ambiguity	If the poles remain undefined, then the meaning of very low (or high) scores cannot be determined. Furthermore, if a construct is bipolar, then the lower pole, if not defined outright, could be any number of different opposing qualities (e.g., a low score for perfectionism could indicate either disorganization or carelessness).	Defining the polarity in the beginning of scale development avoids confusion about the substantive meaning of the poles.
Trouble with discriminant validity	Constructs are often assumed to be distinguished by their degree of orthogonality. However, two constructs may be part of a common continuum but inhabit different levels on it.	Defining the theoretical meaning of the continuum poles helps identify constructs that may be part of a single continuum and enables more precise expectations for nomological relationships.
Contamination by negatively worded items	If a continuum is unipolar, reverse-worded items used to prevent response biases may contaminate measurement by measuring a separate, opposite construct. This is the case, for example, in measures of happiness that include items referring to sadness.	Defining the polarity of the continuum prior to scale creation makes it clear when the inclusion of reverse-worded items is appropriate.
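
The first row of Table 1 can be illustrated with a small simulation. The sketch below assumes, purely for illustration, a single bipolar dimension that is standard normal in the population; the function names, sample size, and seed are hypothetical. When happiness and sadness are both scored over the full bipolar range, their correlation is −1.00. When each is scored as unipolar, so that happiness registers only the upper half of the dimension and sadness only the lower half, the same underlying bipolarity implies a correlation of −1/(π − 1), or roughly −.47.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_bipolar(n_people=20000, seed=7):
    """Correlate happiness and sadness under two ways of defining their continua."""
    rng = random.Random(seed)
    # Assumption: one bipolar dimension (sad ... happy), standard normal.
    latent = [rng.gauss(0.0, 1.0) for _ in range(n_people)]

    # Both continua span the whole bipolar range.
    happy_full = latent
    sad_full = [-x for x in latent]

    # Each continuum is unipolar: it runs from absence (0) upward.
    happy_unipolar = [max(x, 0.0) for x in latent]
    sad_unipolar = [max(-x, 0.0) for x in latent]

    return (pearson(happy_full, sad_full),
            pearson(happy_unipolar, sad_unipolar))


r_full, r_unipolar = simulate_bipolar()
analytic_unipolar = -1.0 / (math.pi - 1.0)
print(round(r_full, 2), round(r_unipolar, 2), round(analytic_unipolar, 3))
```

Under these assumptions, a correlation near −.50 is exactly what true bipolarity predicts for unipolar measures, so treating anything short of a strong negative correlation as evidence against bipolarity would be misleading.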

Underdefined lower regions and poles

When the continuum is omitted from measurement, the lower regions of the continuum are frequently underdefined. Traditionally, what describes the psychological content of the construct being measured is the construct definition, whose purpose is to identify the target attribute and how it is manifest in behavior (Podsakoff, MacKenzie, & Podsakoff, 2016). For instance, perfectionism is defined as the “setting of excessively high standards for performance accompanied by overly critical self-evaluations” (Frost, Heimberg, Holt, Mattia, & Neubauer, 1993, p. 119). What tends to go unrecognized is that these concepts are often defined positively, in terms of the attribute’s presence; that is, the construct definitions are sometimes not descriptions of the whole continuum but rather describe only the content of its positive range. This can create an asymmetry in how precisely the construct is defined along its continuum, leaving the meaning of the lower pole underdefined.

Underspecification of the lower pole or region of a construct leads to ambiguity in the scientific communication of ideas about the construct and in its operationalization. One ambiguity that arises is polarity ambiguity. For example, some researchers may implicitly define perfectionism as a unipolar construct (i.e., the lower pole is the absence of perfectionism), whereas others may subjectively define perfectionism as a bipolar construct (i.e., the lower pole reflects an opposing quality, such as carelessness). Furthermore, if a construct is implicitly defined to be bipolar, the opposing quality may differ across researchers. Some researchers may define the lower pole of perfectionism as carelessness, whereas others may define it as disorganization. Still others may define it as indifference. As a result of polarity ambiguity, researchers who adopt the same definition of a construct can have fundamentally different understandings of it and use different types of scales and scale content to measure it.

Reverse-worded items

Another problem with ignoring the continuum is that it becomes unclear whether reverse-worded items should be included in scales measuring the attribute. It is not uncommon, for instance, to use reverse-worded statements as a way to detect or prevent acquiescence and other response biases (Cronbach, 1950; van Sonderen, Sanderman, & Coyne, 2013). However, the arguably more important question of whether these items actually capture the target construct or instead measure a separate, opposing construct is rarely considered. For instance, a measure of positive feelings might include a reverse-worded item about negative feelings, but this would contaminate measurement if positive and negative feelings are actually located on two different continua (Bentler, 1969). Similarly, it is possible that reverse-worded items for altruism capture antisocial tendencies and reverse-worded items for engagement capture boredom. Virtue conceptualized on a unipolar continuum provides a hypothetical example. In this case, positively worded items that describe low degrees of virtue are acceptable because they capture low levels of that single continuum (e.g., “I rarely commit acts of heroism”). By contrast, reverse-worded items that bring in the separate opposing construct of vice (e.g., “I enjoy harming others”) are inappropriate. The inclusion of reverse-worded items should be considered appropriate only if they map onto the underlying construct continuum.

Therefore, when the construct of interest has a bipolar continuum (i.e., it is composed of one quality on the upper pole and an opposite quality on the other), reverse-worded statements are valid items for measurement. However, if the construct is unipolar (i.e., it is composed of one quality on the upper pole and its absence on the other), then reverse-worded items may cause undue measurement contamination by including variance from a separate construct. One prominent example of reverse-worded items affecting the factor structure is provided by the Rosenberg Global Self-Esteem scale (Rosenberg, 1965), which consists of both positively worded and reverse-worded statements. These two types of items form two latent factors, which has raised questions about whether the second factor is substantively real or just an artifact of item wording (Marsh, 1996). Another example concerns the Need for Cognition scale (Cacioppo & Petty, 1982). To directly ascertain the impact of reverse wording, researchers manipulated the wordings on this scale. The one-factor model fit best when all items were positively worded or all items were reverse worded, whereas two factors were required to account for two other versions that had a mixture of positively and reverse-worded items (Zhang, Noor, & Savalei, 2016).

Difficulty in interpreting distinctions and continuity in constructs

Another problem with continuum neglect is that it can create practical difficulties for researchers in interpreting whether constructs are orthogonal to each other or part of a larger single continuum. To briefly illustrate this point, we simulated uniformly discriminating scale items along a construct continuum, as shown in Figure 1.¹ Under the assumption that scale items differed only by location along a unidimensional construct continuum, the correlation between items on the low end of the continuum and the high end of the continuum (r = .28) was low enough to argue that they measure distinct constructs that can be interpreted as orthogonal. Further, the correlation between moderate items and items on the high end of the continuum (r = .65) could arguably be used to show that they measure overlapping (or related) but distinct constructs. Thus, correlations alone are not enough to determine discriminant validity of constructs; theory about and specification of the construct continuum are also needed. A concrete example illustrating this point is that high levels of happiness may not be strongly related to moderate levels of happiness even when they are part of a continuum (Tay & Drasgow, 2012).

Fig. 1.

The latent trait density used for the simulation discussed in the text. Scale items were assumed to differ only by location along a single continuum.

The illustration shows that when researchers do not carefully consider the construct continuum, they may characterize a scale as measuring theoretically distinct constructs even though the scale items actually form a single continuum. In order to map a nomological network for a construct, it is important to consider the construct continuum; the traditional paradigm involving convergent and divergent validity may not be as informative because it implicitly assumes that less-than-perfect correlations necessarily indicate orthogonality. As psychological scientists pursue both novelty and unity in their science, it will be vital to understand the extent to which new concepts overlap with old ones and how they are interconnected and can potentially be integrated.
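
A simulation in the spirit of the one described above can be sketched as follows. The specifics are assumptions and are not taken from the original simulation: a standard-normal latent trait, a two-parameter logistic response function with a common discrimination of 2.0, three binary items per region, and the particular item locations. The sketch is therefore not expected to reproduce the reported values of .28 and .65, only the qualitative pattern that item sets from distant regions of one continuum correlate less strongly than item sets from adjacent regions.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def endorse(theta, location, discrimination, rng):
    """Return 1 if the simulated respondent endorses the item, else 0."""
    p = 1.0 / (1.0 + math.exp(-discrimination * (theta - location)))
    return 1 if rng.random() < p else 0


def simulate_regions(n_people=5000, discrimination=2.0, seed=1):
    """Sum scores for item sets located in three regions of ONE continuum."""
    rng = random.Random(seed)
    # Hypothetical item locations; items differ only in location.
    locations = {
        "low": [-2.0, -1.5, -1.0],
        "moderate": [-0.5, 0.0, 0.5],
        "high": [1.0, 1.5, 2.0],
    }
    scores = {name: [] for name in locations}
    for _person in range(n_people):
        theta = rng.gauss(0.0, 1.0)  # a single latent trait per person
        for name, locs in locations.items():
            total = sum(endorse(theta, b, discrimination, rng) for b in locs)
            scores[name].append(total)
    return scores


scores = simulate_regions()
r_low_high = pearson(scores["low"], scores["high"])
r_moderate_high = pearson(scores["moderate"], scores["high"])
print("low vs. high:     ", round(r_low_high, 2))
print("moderate vs. high:", round(r_moderate_high, 2))
```

Every simulated response here is driven by the same latent trait, so any gap between the two correlations reflects item location alone and not the presence of distinct constructs.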

Continuum Specification: Description and Procedures

Continuum specification, our approach for reintegrating the concept of the continuum into measurement, avoids the problems we have just discussed and consists of two steps: continuum definition and continuum operationalization. Table 2 provides an overall summary of this process.

Table 2.

Conceptual Overview of the Process of Continuum Specification

Stage and step	Guiding questions
Continuum definition
1. Define the polarity of the continuum	Is the continuum unipolar, bipolar, or combinatorial? Is the lower pole a simple absence of the target quality or the presence of an opposite quality?
2. Specify the nature of the gradation along the continuum	What is the substantive quality that the different degrees of the continuum represent? Is it experienced intensity, behavioral extremity, belief strength, frequency or timing, or some other type of gradation?
Continuum operationalization
1. Generate items while considering the continuum	What regions of the continuum does each item map on to? Are reverse-worded items appropriate?
2. Select response options that generate enough variability along the continuum and match its polarity	Do the items divide the continuum in many different ways so that individuals across the continuum can be discriminated? Does the polarity of the response format match the polarity of the construct continuum?
3. Assess dimensionality using a method that aligns with the type of continuum	Should one use factor analysis (unipolar or bipolar constructs) or multidimensional scaling (combinatorial constructs)?
4. Identify the actual response process used by subjects and ensure that the dimensionality and validation procedures assume the correct process	Do respondents engage in a dominance or ideal-point response process? Should dimensionality assessment and scoring procedures that account for ideal-point responding be used?

Continuum definition

Construct polarity

A fundamental aspect of constructs is the theoretical meaning of their two continuum poles—that is, the substantive meaning of high and low scores. As we discussed earlier, some construct definitions may describe only the meaning of the upper pole, leaving the lower pole underdefined. This can lead to problems, such as inability to evaluate consequential validity, ambiguity in bipolarity assessments, and possible contamination from reverse-worded items. These issues can be avoided by considering the polarity of the construct continuum prior to scale development.

The polarity of a construct determines the meaning of its two poles. There are three types of polarity: A construct may be unipolar, bipolar, or combinatorial. When a construct is unipolar, the presence of the construct is on the upper end of the continuum, and the absence of the construct is on the lower end. For instance, according to one perspective, the domain of virtue includes only the presence and absence of virtuous behaviors and excludes malevolent behaviors altogether. Thus, the lower end of the continuum reflects merely an absence of virtue but not the presence of vice.

By contrast, when a construct is bipolar, the presence of its content is on the upper end of the continuum, and the presence of the opposing content is on the lower end. Bipolar continua, therefore, place two opposing concepts within a single overarching construct (e.g., hot vs. cold, militarism vs. pacifism, satisfied vs. dissatisfied). For example, a researcher may be interested in the measurement of character and conceptualize it as bipolar, ranging from bad to good and entailing behaviors pertaining to virtue and vice.

Finally, combinatorial polarity can be considered a union of unipolarity and bipolarity. In a combinatorial continuum, two distinct constructs with unipolar continua are merged to yield a single continuum. For example, personality researchers have defined the construct relative motive strength in terms of the relative influence that the motive to achieve (approach) has compared with the motive to avoid failure (avoidance; Atkinson, 1957; James, 1998; James & LeBreton, 2012). Similarly, a combinatorial construct continuum has also been used to measure vocational interest, by positioning realistic interests (working with things) in opposition to social interests (working with people; Prediger, 1982), although these two types of interests have positive zero-order correlations (Tay, Su, & Rounds, 2011). Empirically, combinatorial continua can often be found in multidimensional-scaling (MDS) dimensions. This is because MDS maps items on the basis of relative distances rather than absolute relations (e.g., r = −1.00), and items that are positively related can even be mapped as opposing ends of an MDS dimension (see Tay et al., 2011). Generally, if the construct itself expresses a relation between two distinct types of constructs, then the continuum is combinatorial.

Combinatorial continua are generally easier to identify than unipolar and bipolar continua, which may not always be easy to distinguish. Drawing on theoretical reasoning, a heuristic for resolving this dilemma is to answer the following question: Does the absence of the construct content logically entail an opposite quality? If it is possible to have none of its content without having the opposing content, then the continuum is unipolar rather than bipolar. In figuring out a construct’s polarity, one should identify several possible opposing concepts and consider whether an absence of the construct entails them. Consider again the example of perfectionism, defined as “setting of excessively high standards for performance accompanied by overly critical self-evaluations” (Frost et al., 1993, p. 119). Potential opposing qualities might be carelessness, indifference, and disorganization. However, it is possible to have low perfectionism without having any of these traits. Given that there are nonperfectionists who are also not careless, indifferent, or disorganized, perfectionism is not necessarily the inverse of these other concepts, and this is an indicator of unipolarity.

Another approach to distinguishing unipolarity and bipolarity is to consider how the construct content falls within the population distribution. For example, the content of perfectionism falls in the range from average levels of standards (at the low end of the continuum) to excessively high levels (at the high end of the continuum). The fact that the low end of the content continuum is anchored in the middle of the general population distribution suggests that perfectionism is a unipolar construct.

Our goal here is not to dictate the approach a researcher should take but to provide possible directions for considering continuum polarity. We hope that researchers can recognize that specifying the polarity of a continuum requires careful consideration. They need to use the definition of a construct to identify its polarity or else make a claim about its polarity and justify it on the basis of theory, population distributions, or some other consideration.

Specifying the nature of the gradation

The second step in defining a continuum involves stating the nature of the gradation along the continuum. At its core, every continuum represents different degrees of a target attribute (e.g., high, moderately high, or moderate levels of neuroticism). However, what actually makes these degrees different? What is the psychological content that varies in magnitude along the continuum? Answering these questions can begin with the construction of a construct map (Wilson, 2005). A construct map is a simple figure that has a vertical line representing the continuum. On one side of the line, item content, item responses (e.g., from “strongly disagree” to “strongly agree”), or both are listed at their approximate locations on the continuum. On the other side, the corresponding types of individuals are listed (e.g., “individuals with high levels of perfectionism,” “individuals with moderately low levels of perfectionism”). Thus, both individuals and items are mapped in order to provide measurement clarity:

The central idea in using the construct mapping concept at the initial stage of instrument development is for the measurer to focus on the essential feature of what is to be measured—in what way does an individual show more of it and less of it. . . . Before this can be done, however, the measurer often has to engage in a process of ‘variable clarification,’ where the construct to be measured is distinguished from other, closely related, constructs. Reasonably often the measurer finds that there were several constructs lurking under the original idea. (Wilson, 2005, p. 38)

Mapping a variety of constructs reveals different types of construct gradations. For constructs such as momentary affect, satisfaction, and fatigue, the gradation is defined by something one can call experienced intensity. Simply put, the intensity of a psychological experience can vary, and the continuum indexes this variability. For example, the high region of the positive-affect continuum represents happiness that is experienced more intensely than the happiness represented by the middle region of the continuum.

For personality-type constructs, the nature of the gradation is often defined instead by behavioral extremity (we are using behavior broadly to mean thoughts, feelings, and behaviors, given existing disagreements about how traits should even be defined). For example, individuals with very high levels of neuroticism display behaviors such as high self-criticism, a high focus on themselves, and panic attacks, whereas individuals who are moderate in neuroticism may experience only modest levels of self-focus and anxiety over test results. In addition, there are constructs related to perceptions, or beliefs, about features of the world. These constructs include, for example, perceived support (belief in how much one is supported by other people), locus of control (belief in how much events are determined by the external world vs. oneself), and self-efficacy (belief about one’s competence in a given domain). In these cases, the gradation is belief strength. All beliefs are “about” something (Searle, 1983). That is, they concern some aspect of the world. For instance, individuals high in self-efficacy believe that, in reality, they are highly competent. Similarly, individuals high in perceived support believe that they have a great deal of social support. Finally, there are construct continua that vary according to the frequency or timing of behaviors. These include continua that are based on regularity of occurrence. For example, the Positive and Negative Affect Schedule (Watson, Clark, & Tellegen, 1988) explicitly assesses the frequency of positive and negative feelings over a given interval of time.

We recognize that the examples of gradations given here refer to realizations, or effects, of constructs (e.g., frequency of behaviors) rather than purely theoretical content. However, we believe that this reflects the pragmatics of the construct-definition process, which requires psychological constructs to be grasped in concrete terms for clear scientific and lay communication. Ultimately, there are many types of constructs in psychology, and some of them may have gradations that do not fit neatly into the types we have described. In these cases, these types can be adapted, or the researcher can identify and label a new type of gradation. Although we have given guidance here, researchers must think carefully about their own context and adapt the process of continuum specification accordingly. However, it is always important to consider what the gradation of a continuum means substantively, in order to shed light on the theory of the construct.

In current scale-creation practices, construct definitions describe the psychological attributes being measured, but often do not explicitly state the nature of the gradation. This theoretical ambiguity can lead to empirical confusion. Within psychology, one long-standing debate that has partly arisen from this ambiguity concerns whether happiness and sadness are bipolar or unipolar (see Greenwald, 2012). More recent clarifications on how their continua are graded have helped resolve this debate. When the gradation of happiness and sadness represents experienced intensity, investigations yield evidence for bipolarity (Russell & Carroll, 1999) because happiness and sadness appear to be mutually exclusive in momentary episodes (e.g., it is difficult in a single moment to be both extremely sad and happy). However, when the gradations of happiness and sadness represent frequency or timing over the previous day or week (i.e., the magnitude of happiness is represented by the number of happy moments, and the magnitude of sadness is represented by the number of sad moments), happiness and sadness are only weakly or moderately correlated negatively (Diener, 1999; Russell & Carroll, 1999). This is because happiness and sadness can co-occur over longer periods of time (i.e., individuals may have multiple nonoverlapping episodes of happiness and sadness). Unless the gradation of the continuum is defined, seemingly conflicting conclusions about happiness may be drawn. However, they are not really in conflict, as they simply involve constructs with different continua (e.g., momentary positive affect vs. long-term positive affect) being perceived as one on account of a common label (“positive affect”). Apart from reducing ambiguity, detailing the nature of a continuum’s gradation makes it clearer how items can be generated across its range, an aspect of continuum operationalization that we describe later in this article.
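
The contrast between the two gradations can be made concrete with a toy simulation. Everything in it is an assumption made for illustration, not a model taken from the cited studies: each moment is happy, sad, or neutral (so momentary happiness and sadness are mutually exclusive), momentary intensity is reduced to a 0/1 indicator, and people differ in a hypothetical "emotionality" (how often a moment is emotional at all) and "balance" (how often an emotional moment is happy). The same simulated experiences are then correlated at the level of single moments and at the level of per-person frequencies.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_affect(n_people=1000, n_moments=50, seed=11):
    """Return (moment-level r, frequency-level r) for happiness and sadness."""
    rng = random.Random(seed)
    moment_happy, moment_sad = [], []  # one entry per person-moment
    freq_happy, freq_sad = [], []      # one entry per person
    for _person in range(n_people):
        # Hypothetical person parameters (assumptions of this sketch).
        emotionality = rng.uniform(0.2, 1.0)  # P(moment is emotional)
        balance = rng.uniform(0.2, 0.8)       # P(happy | emotional)
        n_happy = n_sad = 0
        for _moment in range(n_moments):
            happy = sad = 0
            if rng.random() < emotionality:
                if rng.random() < balance:
                    happy = 1
                else:
                    sad = 1
            moment_happy.append(happy)
            moment_sad.append(sad)
            n_happy += happy
            n_sad += sad
        freq_happy.append(n_happy)
        freq_sad.append(n_sad)
    return (pearson(moment_happy, moment_sad),
            pearson(freq_happy, freq_sad))


r_moment, r_frequency = simulate_affect()
print("momentary gradation:", round(r_moment, 2))
print("frequency gradation:", round(r_frequency, 2))
```

Under these settings, the moment-level correlation is clearly negative, whereas the frequency-level correlation is much weaker (near zero), even though both are computed from the same simulated experiences. The size of the gap depends on the assumed spread of emotionality and balance, so the sketch illustrates the direction of the effect rather than its magnitude.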

Summary

In general, a construct continuum is defined in two steps. First, one defines the polarity of the theoretical construct. This amounts to (a) generating or accessing a construct definition and then (b) determining if the lower pole represents the absence of the target attribute (a unipolar continuum), an opposite quality altogether (a bipolar continuum), or a mixture (a combinatorial continuum). Second, one defines what the degrees of the continuum represent. A continuum represents variability in some psychological phenomenon, and what this is must be made explicit. In short, researchers conceiving new constructs, evaluating old ones, or carrying out construct validation should consider addressing the following practical questions during the first stage of continuum specification:

What do the end poles of the postulated continuum mean? Is the construct being measured bipolar, unipolar, or combinatorial?

How are the different degrees of the continuum differentiated? What is the actual difference among levels in the low, medium, and high regions of the continuum? Do these levels differ in experienced intensity, behavioral extremity, belief strength, frequency or timing, or some other quality?

Continuum operationalization

Defining the continuum provides conceptual grounding for the target attribute. However, the continuum should also be fully considered in the process of designing items, designing response options, and assessing factorial validity (factor structure). In this section, we describe how continua are operationalized with regard to four aspects of scale development: creating items, choosing response formats, assessing dimensionality, and identifying underlying response processes.

Generating items

Psychological scales are composed of items, and to operationalize a continuum, one must create items so that the entire continuum is sufficiently measured. This results in sufficient variability to meaningfully distinguish individuals located in all regions of the continuum. Selecting items that can discriminate the different points on the continuum provides content validity because the full span of the construct is measured. This idea has been conveyed indirectly in the past, in cautionary comments that overly high internal consistencies may denote a lack of heterogeneity in item content (Cortina, 1993; Schmitt, 1996). These calls for heterogeneity in item content are essentially calls for heterogeneity in item location along the continuum. Such heterogeneity is relevant not only to item generation but also to selection of response formats (which we discuss next), because it is not just the item alone but rather the item and its response options together that generate measurable responses.

As we alluded to earlier, one key question in choosing scale items is whether to include reverse-worded statements. Although such statements are commonly used in psychological scales, having a clear continuum definition provides grounding for whether they are appropriate. If the construct is bipolar, there is a conceptual rationale for including reverse-worded items because the opposing concept they capture is part of the continuum. However, if the construct is unipolar, some reverse-worded items may not be appropriate. For example, in a perfectionism scale, some reverse-worded items are valid because they capture a lack of perfectionist tendencies (e.g., “I do not need things to be perfect”). Conversely, other reverse-worded items may be inappropriate because they capture opposing constructs, such as disorganization, that may be on a separate continuum altogether (e.g., “I am disorganized”). Such items contaminate measurement even though they have face validity for measuring perfectionism.

Continuum specification promotes enhanced accuracy in item wordings, which is especially critical for assessing contributions of new constructs and new areas of research. For example, one growing area is positive psychology. Scientific progress in this area will depend on researchers’ ability to assess the incremental validity of positive-psychology constructs. Psychometric validity of positive-psychology constructs that possibly differ from existing constructs by degree (e.g., grit vs. conscientiousness; Duckworth, Peterson, Matthews, & Kelly, 2007) would be demonstrated if item wordings for the positive-psychology constructs place them at more extreme ends of the continuum (e.g., “I never give up” vs. “I make plans and follow through with them”). A careful choice of scale items is needed to reflect a construct’s extreme location on a continuum. We note here that assessing the content of item wordings along a continuum is one aspect of distinguishing constructs, and there are other potential aspects that need to be considered as well (e.g., see Credé, Tynan, & Harms, 2017, on psychometric issues concerning grit vs. conscientiousness).

Choosing response options and formats

Technically speaking, in Likert-type scaling, the items themselves do not vary in location (Torgerson, 1958).² Rather, it is the response options that divide the continuum into discrete segments, and items vary in where these divisions take place (Torgerson, 1958). For instance, two items for measuring extraversion are “am the life of the party” and “feel comfortable around people” (Goldberg et al., 2006). Although these items measure the same continuum, the response options are mapped to different regions. The region corresponding to “agree” is located lower for “feel comfortable around people” than for “am the life of the party.” This is because it takes a lower degree of extraversion to be at ease around other people than it does to be the “life of the party.” Thus, all scale items in Likert scaling do measure the continuum, but they differ in where they segment it.

Properly operationalizing a continuum therefore requires that the response options for different items vary in how they divvy up the continuum. This enables the scale to capture the locations of individuals who exhibit many different degrees of the target attribute. Practically speaking, achieving variability in where the items divide the continuum requires considering how the response options and continuum are related. One can assess approximately how the response options for each item divide the continuum and whether there is sufficient variability across items to capture its entire range. Usually, a modest number of items will produce the desired variability. However, a scale with just a few items may not (e.g., if two items have the same five segmentations, then the scale can distinguish only five degrees of the continuum). Even if a full examination of all items and their response options is not performed (this would take significant time for long scales), whether the items will produce enough variability is something that should at least be considered during scale construction.
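The parenthetical example about two items with identical segmentations can be sketched in a few lines of Python. The cut points below are hypothetical values on an arbitrary latent metric, and responses are treated as error-free, which real responses are not.

```python
from bisect import bisect_right


def response(theta, cut_points):
    """Ordinal response (0..k) for a person located at `theta`, given an
    item's sorted cut points on the continuum."""
    return bisect_right(cut_points, theta)


def distinguishable_regions(items, grid):
    """Number of distinct response patterns the items produce across a
    grid of person locations."""
    patterns = {tuple(response(t, cuts) for cuts in items) for t in grid}
    return len(patterns)


# Hypothetical five-option items: four cut points give five segments.
item_a = [-1.5, -0.5, 0.5, 1.5]
item_b_same = [-1.5, -0.5, 0.5, 1.5]     # divides the continuum identically
item_b_shifted = [-1.0, 0.0, 1.0, 2.0]   # divides it at different places

grid = [x / 100 for x in range(-300, 301)]  # person locations from -3 to 3

same = distinguishable_regions([item_a, item_b_same], grid)
shifted = distinguishable_regions([item_a, item_b_shifted], grid)
print(same, shifted)
```

In this sketch, the pair with identical segmentations separates respondents into five regions, whereas the pair whose cut points are offset separates them into nine.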

Furthermore, operationalizing a continuum requires selecting response options that are appropriate to it. It has been proposed that item response formats themselves can vary in their polarity (Russell & Carroll, 1999). For valid measurement, the polarity of the response options must match the polarity of the construct. That is, scales for unipolar constructs must have unipolar response formats, and scales for bipolar constructs must have bipolar response formats. How does one know the polarity of a response format? It is a function of both the verbal labels and the numbers assigned to these labels. For example, it has been shown that when the number 0 is assigned to the lowest response option, respondents interpret this as an absence of the target attribute, which signals a unipolar construct (Schwarz, Knauper, Hippler, Noelle-Neumann, & Clark, 1991). However, when the same item is used but the number assigned to the lowest option is changed to −5 (0 being set as the midpoint), respondents interpret the low end as representing the opposite of the target attribute, which signals a bipolar construct. More generally, this research has revealed that “a format that ranges from negative to positive numbers conveys . . . a bipolar dimension. . . . In contrast, a format that uses only positive numbers conveys a . . . unipolar dimension” (Schwarz, 1999, p. 96). Thus, in developing a scale, one must choose the appropriate numbers to represent the continuum.

As we have noted, the polarity of the response format is also a function of the verbal labels. For example, in the measurement of happiness and sadness, response options often range from labels such as “not at all” to “extremely.” Verbal labels like these indicate a unipolar continuum because they do not name opposite concepts and imply that only one quality is being measured. By contrast, other response options, such as options ranging from “very sad” to “very happy” (e.g., Watson et al., 1988), clearly imply a bipolar continuum because an opposing concept is explicitly named. Why is the polarity of the response options important? Consider the long-standing debate regarding whether happiness and sadness are on the same continuum. A review of past emotions research revealed that the negative association between happiness and sadness was weaker when response formats were unipolar (i.e., “not at all” to “extremely”) than when they were bipolar (Russell & Carroll, 1999). This is because individuals who are “not at all” happy can experience any degree of sadness, and individuals who are “not at all” sad may be experiencing any degree of happiness. This leads to an L-shaped bivariate distribution rather than the diagonal bivariate distribution expected for a correlation of −1.00. The authors showed mathematically that even a correlation of −.47, rather than −1.00, would be sufficient evidence to conclude that these two constructs are bipolar if unipolar-formatted items are used (Russell & Carroll, 1999).
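One set of assumptions that reproduces a value close to the −.47 cited above can be checked numerically. The sketch below assumes a standard normal bipolar valence score with its neutral point at the mean, error-free continuous ratings, and strictly unipolar formats in which happiness registers only above the neutral point and sadness only below it. These assumptions are made here for illustration; the derivation in Russell and Carroll (1999) may rest on different or additional ones.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)

# Latent standing on a single bipolar continuum (assumed standard normal).
valence = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

# Strictly unipolar ratings: each scale bottoms out at "not at all" (0).
happy = [max(v, 0.0) for v in valence]
sad = [max(-v, 0.0) for v in valence]

r_unipolar = pearson(happy, sad)

# Closed form under the same assumptions:
# cov = -1/(2*pi), var = 1/2 - 1/(2*pi), so r = -1/(pi - 1).
analytic = -1.0 / (math.pi - 1.0)

print(f"simulated r = {r_unipolar:.3f}, analytic r = {analytic:.3f}")
```

Every simulated respondent has a score of zero on at least one of the two scales, which is the L-shaped pattern described above, and the correlation falls well short of −1.00 even though a single bipolar continuum generated the data.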

This case shows that scientific conclusions regarding the nature of a continuum depend on the polarity of the response options and how it aligns with the actual polarity of the continuum. If polarity is ignored, mismatches may occur, and such mismatches can lead to ambiguities or errors that might not be detected using standard validation procedures because the multiple variations of constructs measured (e.g., unipolar and bipolar conceptualizations of happiness) are not considered in the patterns of empirical relations. Thus, in developing a scale, one needs to ensure that the response options match the defined polarity of the construct’s continuum. This is a requirement for valid measurement, and polarity mismatches can be avoided by defining the continuum prior to creating the scale and then selecting an appropriate response format.

Assessing dimensionality

Another critical part of operationalizing a continuum is assessing the construct’s dimensionality and using the correct type of procedure to do this. Dimensionality can be assessed using MDS or factor analysis. However, the appropriate approach depends on the polarity of the continuum. Specifically, MDS dimensions are usually appropriate for combinatorial continua (and sometimes appropriate for bipolar continua), whereas factor analysis dimensions are appropriate for unipolar continua and often appropriate for bipolar continua.

Unlike factor analysis dimensions, uncovered MDS dimensions are dependent on the relative strength of association (e.g., correlation, distance) among all scale items rather than the absolute magnitude of association. In MDS, two different sets of associations can produce the same dimensions because MDS focuses on scaling items relative to one another in a multidimensional space. As shown in Tay et al. (2011), when two scales have a correlation of 0.00, they may be placed as opposites within a circumplex structure. Similarly, a correlation of −1.00 between two scales may also produce an opposition within a circumplex structure. A study in which MDS was applied to vocational-interest data showed that opposing interest poles (e.g., conventional interests vs. artistic interests; Prediger, 1982) have meta-analytic correlations of .08 to .16. These correlations are smaller than those between adjacent poles, which range from .33 to .52 (Tay et al., 2011). This pattern arises because the position of two scales depends on the relative associations taken across all the scales in the analysis rather than the absolute strength of association.
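A minimal sketch of this point follows. It assumes an ordinal (nonmetric) form of MDS that uses only the rank order of the associations. The six equally spaced scales and the cosine-shaped correlation pattern are hypothetical. They were chosen so that the second set of values falls inside the ranges just described, and they are not the data analyzed by Tay et al. (2011).

```python
import math
from itertools import combinations


def circumplex_correlations(n_scales, baseline, amplitude):
    """Correlations for equally spaced scales on a circle:
    r_ij = baseline + amplitude * cos(angle between scales i and j)."""
    angles = [2 * math.pi * k / n_scales for k in range(n_scales)]
    return {
        (i, j): baseline + amplitude * math.cos(angles[i] - angles[j])
        for i, j in combinations(range(n_scales), 2)
    }


def rank_order(corrs):
    """Scale pairs sorted from most to least similar (ties broken by index)."""
    return sorted(corrs, key=lambda pair: (-round(corrs[pair], 9), pair))


# Opposite scales correlate -1.00; adjacent scales correlate .50.
strong = circumplex_correlations(6, baseline=0.0, amplitude=1.0)
# Opposite scales correlate .10; adjacent scales correlate .40.
weak = circumplex_correlations(6, baseline=0.3, amplitude=0.2)

print("opposite pair:", round(strong[(0, 3)], 2), "vs", round(weak[(0, 3)], 2))
print("same rank order:", rank_order(strong) == rank_order(weak))
```

Because the two sets of correlations order the scale pairs identically, a procedure that relies only on that ordering would place the scales in the same relative positions in both cases, with scales 0 and 3 at opposite ends.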

Therefore, opposing poles on a single MDS dimension do not necessarily indicate bipolarity in the sense that one should observe a perfect inverse correlation, or mutual exclusion. Instead, poles reflect the extent to which scale items are more dissimilar with respect to all other scale items (or scales). To a large extent, MDS dimensions reflect combinatorial continua in that opposing poles are not necessarily mutually exclusive and therefore do not match the traditional understanding of bipolarity (i.e., opposing MDS end poles can both be endorsed by the same person). MDS dimensions frequently reflect the comparative locations for the construct space of interest. This is true for many circumplex models in psychology, such as those for family and marital systems (Olson, Sprenkle, & Russell, 1979), interpersonal behaviors (Wiggins, 1982), and cultural values (Schwartz & Wolfgang, 1987). It is important to note, however, that MDS dimensions can still reflect bipolarity if the opposing poles have large absolute negative correlations, as in the case of the emotional valence of happiness and sadness (Green, Goldman, & Salovey, 1993; Russell, 1980; Russell & Bullock, 1985).

In factor analysis, the dimensions recovered reflect absolute magnitudes of associations (i.e., correlations). Ideally, a single bipolar dimension from a factor analysis would have scale items that fall along a single dimension and have loadings that can range from −1 to +1. A unipolar factor analysis dimension would have scale items with positive loadings that can range from 0 to +1 (e.g., Green et al., 1993). Nevertheless, even with factor analysis, how individuals respond to scale items—that is, the item response process—can limit inferences about the polarity of the construct continuum, as we describe next.

Identifying and modeling the response process

Finally, operationalizing a continuum means taking into account the response process, defined as “the psychological processes that respondents . . . use when completing a measure” (Furr & Bacharach, 2013, p. 209). If a measure has construct validity, this means that the response process assumed by the scale matches how subjects actually respond (Drasgow, Chernyshenko, & Stark, 2010; Furr & Bacharach, 2013; Tay, Drasgow, Rounds, & Williams, 2009). Moreover, one cannot understand the response process completely without considering the construct continuum. The usual assumption is that individuals endorse (or answer correctly, in the case of constructs related to ability) only those items that they exceed in location on the continuum. Conversely, it is assumed that individuals do not endorse (or answer correctly) items whose locations they do not exceed. For example, individuals with a given math ability will correctly answer all math items that are lower on the continuum than where their ability lies and will fail to answer math items that are higher on the continuum. This model of the response process is called the dominance model (Coombs, 1964) because individuals “dominate” items that are lower on the continuum than where they stand. It should be noted that this model is most applicable for measuring the maximal performance of individuals, or constructs that are ability based.
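The dominance process is often expressed probabilistically rather than as the all-or-none rule described above. The sketch below uses a two-parameter logistic curve as one common form. The item location and the discrimination value of 1.7 are hypothetical and chosen only for illustration.

```python
import math


def dominance_probability(theta, location, discrimination=1.7):
    """Two-parameter logistic curve: the probability of endorsing (or
    correctly answering) an item rises monotonically with theta."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - location)))


item_location = 0.0                      # hypothetical item location
thetas = [-2.0, -1.0, 0.0, 1.0, 2.0]     # hypothetical person locations
dominance_curve = [dominance_probability(t, item_location) for t in thetas]

for t, p in zip(thetas, dominance_curve):
    print(f"theta = {t:+.1f}   P(endorse) = {p:.3f}")
```

The probability equals .50 when the person and the item share a location and keeps rising as the person moves higher on the continuum. It never turns back down, which is the defining feature of dominance responding.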

Dominance responding is an assumption that underlies classical test theory, factor analysis, logistic item response theory models, and reliability estimates such as Cronbach’s alpha (Tay & Drasgow, 2012). Nevertheless, recent advances in psychological measurement have shown that the dominance model is less appropriate for constructs related to self-reported typical behaviors, such as the trait, value, attitudinal, and affective constructs that are common in many areas of psychology (Chernyshenko, Stark, Chan, Drasgow, & Williams, 2001; LaPalme, Tay, & Wang, 2017; Roberts, Donoghue, & Laughlin, 2000; Tay & Drasgow, 2012; Tay et al., 2009). For these constructs, endorsement does not occur simply because the individual’s level of the latent attribute exceeds the item’s location on the continuum; instead, endorsement depends on how close the individual’s level is to the item’s level. For example, individuals who are extremely happy will not endorse moderate happiness, and individuals who are moderately happy will not endorse low happiness (Tay & Drasgow, 2012). Thus, when responding to these types of items, individuals look for a match with their own standing. The probability of endorsement decreases as the item and individual become further removed, that is, as the individual perceives that the item is less representative of his or her own standing. This response process underlying the measurement of many trait, value, attitudinal, and affective constructs is known as the ideal-point model (Coombs, 1964) because it involves individuals identifying their own location on the continuum and then searching among the response options for an “ideal point” (Drasgow et al., 2010).
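A single-peaked response function captures the idea of endorsing whatever is closest to one's own standing. The sketch below uses a Gaussian-shaped curve as a deliberately simple stand-in. It is not the unfolding model of Roberts et al. (2000), and the `tolerance` parameter, the item location, and the person locations are hypothetical.

```python
import math


def ideal_point_probability(theta, location, tolerance=1.0):
    """Single-peaked curve: endorsement is most likely when theta equals the
    item location and falls off with distance in either direction."""
    return math.exp(-((theta - location) ** 2) / (2.0 * tolerance ** 2))


moderate_item = 1.0  # hypothetical location of a "moderately happy" item
people = {"low": -1.0, "moderately happy": 1.0, "extremely happy": 3.0}

endorsement = {
    label: ideal_point_probability(theta, moderate_item)
    for label, theta in people.items()
}

for label, p in endorsement.items():
    print(f"{label:>17}: P(endorse) = {p:.3f}")
```

In this sketch, the extremely happy respondent is as unlikely to endorse the moderate item as the respondent low on the continuum, because both are equally far from the item's location.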

It has been shown that rote application of the standard dominance model to constructs such as traits and attitudes can result in suboptimal measures because the full range of items cannot be easily modeled (Chernyshenko, 2002; Chernyshenko, Stark, Drasgow, & Roberts, 2007). Gathering evidence for unidimensionality is also problematic with procedures that incorrectly assume a dominance process. Specifically, factor analytic and principal components models can spuriously generate two separate dimensions when only a single bipolar continuum has been measured (Davison, 1977; van Schuur & Kiers, 1994). This occurs because scale items with similar locations tend to be coendorsed by respondents. Because positive and negative items near the middle of the continuum are coendorsed by a majority of respondents, in dimensionality analyses these items are separated into their own latent variable that is distinct from the positivity-to-negativity dimension. This issue has recently been shown to be a critical factor in the controversy concerning whether happiness and sadness form a bipolar construct, because individuals in the middle of a bipolar happy-to-sad continuum endorse low levels of both happiness and sadness (Tay & Kuykendall, 2017). A similar artifactual dimension can result for unipolar constructs. For instance, when happiness is measured on a unipolar continuum, individuals distinguish high levels of happiness from moderate levels of happiness, so these appear to load on separate factors rather than a single common factor (Tay & Drasgow, 2012).
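The coendorsement pattern behind this artifact can be reproduced in a small simulation. The sketch below generates binary responses from a single bipolar continuum. It assumes a hypothetical Gaussian-shaped ideal-point curve, hypothetical item locations, and standard normal respondents. It does not run a factor analysis; it shows only the inter-item correlations that such an analysis would start from.

```python
import math
import random


def ideal_point_probability(theta, location, tolerance=1.0):
    """Single-peaked endorsement curve centered on the item location."""
    return math.exp(-((theta - location) ** 2) / (2.0 * tolerance ** 2))


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(7)

# Hypothetical item locations on one sad-to-happy continuum.
item_locations = {
    "very sad": -2.0,
    "slightly sad": -0.25,
    "slightly happy": 0.25,
    "very happy": 2.0,
}

thetas = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
responses = {
    name: [int(rng.random() < ideal_point_probability(t, loc)) for t in thetas]
    for name, loc in item_locations.items()
}

r_mild = pearson(responses["slightly sad"], responses["slightly happy"])
r_extreme = pearson(responses["very sad"], responses["very happy"])
print(f"slightly sad vs. slightly happy: r = {r_mild:.2f}")
print(f"very sad vs. very happy:         r = {r_extreme:.2f}")
```

Under these assumptions, the two mild items correlate positively even though they sit on opposite sides of the neutral point, and the two extreme items correlate only modestly negatively. A linear one-factor model cannot easily reproduce this pattern, which is consistent with the spurious extra dimension described above.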

Recent research suggests that using the appropriate measurement model may be important not only for assessing dimensionality but also for assessing the presence of curvilinear relations in empirical data (e.g., Carter et al., 2013; cf. Wiese et al., 2018). Simulation results suggest that when individuals use ideal-point responding, applying ideal-point measurement models increases the statistical power to detect curvilinear relations if they are present and results in more accurate point estimates (Cao, Song, & Tay, 2018). In general, properly operationalizing a continuum means considering the relationships among the continuum, the response process assumed by analytic procedures, and the underlying response process. Researchers need to be mindful of the difference between dominance and ideal-point response models and the constructs they may be best suited for.

Because individuals engage in an ideal-point response process when self-reporting typical behaviors such as personality, attitudes, and emotions, dimensionality analyses that account for ideal-point response processes need to be developed (Habing, Finch, & Roberts, 2005). This will enable researchers to accurately determine the dimensionality of a construct continuum when a full range is specified in the measure (e.g., happy to sad). Further, ideal-point responding also requires methodology that enables quantification of item locations and scoring of individuals along the continuum (e.g., Roberts et al., 2000). Moreover, because measurement-reliability statistics, such as Cronbach’s alpha, assume dominance responding, different approaches are required in the case of ideal-point responding. These methodologies need to be developed and made more readily available to psychologists in order to advance the operationalization and measurement of construct continua.

Continuum Specification in the Context of Construct Validation

We note that continuum specification does not exist in isolation but falls squarely within the broader context of construct validation. It does not replace locating target constructs within a broader nomological network (Cronbach & Meehl, 1955) through tests of convergent and discriminant validity (Campbell & Fiske, 1959), tests of consequential validity (Messick, 1995), and other established processes for measurement development and construct validation (Clark & Watson, 1995). Continuum specification, however, emphasizes a neglected aspect of construct validation and augments these standard procedures. We believe it is an especially timely process in light of recent work in positive psychology (Seligman & Csikszentmihalyi, 2000) and interest in both the dark and the bright attributes (Furnham, Trickey, & Hyde, 2012; Judge, Piccolo, & Kosalka, 2009), as researchers in these areas seek to assess the bipolarity of proposed constructs and curvilinear effects. Moreover, we believe that continuum specification can bring greater clarity to researchers’ understanding of the nomological networks of constructs.

A practical question for researchers is, at what stage within construct validation does continuum specification fall? Part of continuum specification is defining the continuum’s polarity and gradation, which occurs during the initial stage of defining the construct. However, continuum specification also has implications for developing items (e.g., selecting a response format), assessing dimensionality, and modeling the response process. Continuum specification is therefore not so much a single stage or procedure as an approach that involves taking the latent continuum into account across the whole process of scale development, validation, and evaluation. Although we are unable to provide a comprehensive account of continuum specification’s implications for every construct-validation context, we highlight several here. First, with regard to nomological networks, continuum specification encourages researchers to consider whether target constructs are distinguished from other constructs by quality (i.e., they are orthogonal) or by degree (i.e., they are on different parts of the same continuum). As demonstrated in Figure 1, constructs that differ by degree may incorrectly be interpreted as orthogonal because of less than perfect correlations. Second, claims for polarity and gradation can potentially provide greater clarity to debates on dimensionality, such as the debates on the bipolarity of happiness and sadness (Tay & Kuykendall, 2017) and on whether psychopathology is part of broader personality dimensions (Samuel & Tay, 2018). Third, the procedure of continuum specification outlined in Table 2 can guide researchers through key questions to gain clarity on how to define and operationalize target continua, and thereby contribute to creating, validating, and evaluating psychological measures. Fourth, in examinations of factorial validity, continuum specification can help researchers determine the appropriateness of factor analysis. Finally, in assessments of predictive validity, continuum specification can help provide insight on whether curvilinear effects are likely present conceptually or empirically, because it encourages inclusion of the whole continuum in the measures of both the predictor and the outcomes.

Conclusion

Psychologists recognize that advancements in empirical methods can potentially resolve scientific debates and advance science (Greenwald, 2012). With this goal in mind, we propose that the concept of the construct continuum has been neglected within the field of psychological measurement. Through the procedures of continuum specification, researchers are able to develop rigorous definitions of the constructs themselves and to provide more accurate operationalizations. Consequently, they are better able to draw scientific inferences from measures of constructs. Thus, the concept of continuum specification can provide psychological science with the awareness, semantics, and impetus to address existing and future questions regarding constructs and the validation of measures that underlie the work in this field.

Footnotes

Acknowledgements

We wish to thank James LeBreton for his invaluable feedback on an earlier draft of the manuscript. We also wish to thank Lauren Kuykendall and Vincent Ng for providing editorial feedback and comments on this topic.

Action Editor

Jennifer L. Tackett served as action editor for this article.

Author Contributions

L. Tay generated the idea for this article and wrote the first draft of the manuscript. A. T. Jebb further developed the ideas and provided critical edits. Both authors approved the final submitted version of the manuscript.

Declaration of Conflicting Interests

The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.

References

Atkinson

J. W.

(1957). Motivational determinants of risk-taking behavior. Psychological Review, 6, 359–372.

Bagozzi

R. P.

Phillips

L. W.

(1991). Assessing construct validity in organizational research. Administrative Science Quarterly, 36, 421–458.

Bentler

P. M.

(1969). Semantic space is (approximately) bipolar. The Journal of Psychology, 71, 33–40.

Binning

J. F.

Barrett

G. V.

(1989). Validity of personnel decisions: A conceptual analysis of the inferential and evidential bases. Journal of Applied Psychology, 74, 478–494.

Bollen

K. A.

(2002). Latent variables in psychology and the social sciences. Annual Review of Psychology, 53, 605–634.

Borsboom

Mellenbergh

G. J.

van Heerden

(2003). The theoretical status of latent variables. Psychological Review, 110, 203–219.

Cacioppo

J. T.

Petty

R. E.

(1982). The need for cognition. Journal of Personality and Social Psychology, 42, 116–131.

Campbell

D. T.

Fiske

D. W.

(1959). Convergent and discriminant validity by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105.

Cao

Song

Q. C.

Tay

(2018). Detecting curvilinear relationships: A comparison of scoring approaches based on different item response models. International Journal of Testing, 18, 178–205. doi:10.1080/15305058.2017.1345913

10.

Carter

N. T.

Dalal

D. K.

Boyce

A. S.

O’Connell

M. S.

Kung

M.-C.

Delgado

K. M.

(2013). Uncovering curvilinear relationships between conscientiousness and job performance: How theoretically appropriate measurement makes an empirical difference. Journal of Applied Psychology, 99, 564–586.

11.

Chernyshenko

O. S.

(2002). Applications of ideal point approaches to scale construction and scoring in personality measurement: The development of a six-faceted measure of conscientiousness (Unpublished doctoral dissertation). University of Illinois at Urbana–Champaign.

12.

Chernyshenko

O. S.

Stark

Chan

K. Y.

Drasgow

Williams

(2001). Fitting item response theory models to two personality inventories: Issues and insights. Multivariate Behavioral Research, 36, 523–562. doi:10.1207/S15327906MBR3604_03

13.

Chernyshenko

O. S.

Stark

Drasgow

Roberts

B. W.

(2007). Constructing personality scales under the assumptions of an ideal point response process: Toward increasing the flexibility of personality measures. Psychological Assessment, 19, 88–106. doi:10.1037/1040-3590.19.1.88

14.

Clark

L. A.

Watson

(1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7, 309–319.

15.

Coombs

C. H.

(1951). Mathematical models in psychological scaling. Journal of the American Statistical Association, 46, 480-489.

16.

Coombs

C. H.

(1964). A theory of data. New York, NY: Wiley.

17.

Cortina

J. M.

(1993). What is coefficient alpha? An examination of theory and applications. Journal of Applied Psychology, 78, 98–104.

18.

Credé

Tynan

M. C.

Harms

P. D.

(2017). Much ado about grit: A meta-analytic synthesis of the grit literature. Journal of Personality and Social Psychology, 113, 492–511.

19.

Cronbach

L. J.

(1950). Further evidence on response sets and test design. Educational and Psychological Measurement, 10, 3–31.

20.

Cronbach

L. J.

Meehl

P. E.

(1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302.

21.

Davison

M. L.

(1977). On a metric, unidimensional unfolding model for attitudinal and developmental data. Psychometrika, 42, 523–548.

22.

Diener

(1999). Introduction to the special section on the structure of emotion. Journal of Personality and Social Psychology, 76, 803–804.

23.

Drasgow

Chernyshenko

O. S.

Stark

(2010). 75 years after Likert: Thurstone was right! Industrial and Organizational Psychology, 3, 465–476.

24.

Duckworth

A. L.

Peterson

Matthews

M. D.

Kelly

D. R.

(2007). Grit: Perseverance and passion for long-term goals. Journal of Personality and Social Psychology, 92, 1087–1101.

25.

Frost

R. O.

Heimberg

R. G.

Holt

C. S.

Mattia

J. I.

Neubauer

A. L.

(1993). A comparison of two measures of perfectionism. Personality and Individual Differences, 14, 119–126.

26.

Furnham

Trickey

Hyde

(2012). Bright aspects to dark side traits: Dark side traits associated with work success. Personality and Individual Differences, 52, 908–913.

27.

Furr

R. M.

Bacharach

V. R.

(2013). Psychometrics: An introduction (2nd ed.). Thousand Oaks, CA: Sage.

28.

Ghiselli

E. E.

(1964). Theory of psychological measurement. New York, NY: McGraw-Hill.

29.

Goldberg

L. R.

Johnson

J. A.

Eber

H. W.

Hogan

Ashton

M. C.

Cloninger

C. R.

Gough

H. C.

(2006). The International Personality Item Pool and the future of public-domain personality measures. Journal of Research in Personality, 40, 84–96.

30.

Grant

A. M.

(2013). Rethinking the extraverted sales ideal: The ambivert advantage. Psychological Science, 24, 1024–1030.

31.

Green

D. P.

Goldman

S. L.

Salovey

(1993). Measurement error masks bipolarity in affect ratings. Journal of Personality and Social Psychology, 64, 1029–1041.

32.

Greenwald

A. G.

(2012). There is nothing so theoretical as a good method. Perspectives on Psychological Science, 7, 99–108.

33.

Habing

Finch

Roberts

(2005). A Q3 statistic for unfolding item response theory models: Assessment of unidimensionality with two factors and simple structure. Applied Psychological Measurement, 29, 457–471.

34.

James

L. R.

(1998). Measurement of personality via conditional reasoning. Organizational Research Methods, 1, 131–163.

35.

James

L. R.

LeBreton

J. M.

(2012). Assessing the implicit personality through conditional reasoning. Washington, DC: American Psychological Association.

36.

Judge

T. A.

Piccolo

R. F.

Kosalka

(2009). The bright and dark sides of leader traits: A review and theoretical extension of the leader trait paradigm. The Leadership Quarterly, 20, 855–875.

37.

LaPalme

Tay

Wang

(2017). A within-person examination of the ideal-point response process. Psychological Assessment, 30, 567–581. doi:10.1037/pas0000499

38.

Likert

(1932). A technique for the measurement of attitudes. Archives of Psychology, 140, 5–53.

39. Marsh H. W. (1996). Positive and negative global self-esteem: A substantively meaningful distinction or artifactors? Journal of Personality and Social Psychology, 70, 810–819.

40. Messick (1995). Validity of psychological assessment: Validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. American Psychologist, 50, 741–749.

41. Nunnally J. C. (1978). Psychometric theory. New York, NY: McGraw-Hill.

42. Olson D. H., Sprenkle D. H., Russell C. S. (1979). Circumplex model of marital and family systems: I. Cohesion and adaptability dimensions, family types, and clinical applications. Family Process, 18, 3–28.

43. Podsakoff, MacKenzie, Podsakoff (2016). Recommendations for creating better concept definitions in the organizational, behavioral, and social sciences. Organizational Research Methods, 19, 159–203.

44. Prediger D. J. (1982). Dimensions underlying Holland’s hexagon: Missing link between interests and occupations. Journal of Vocational Behavior, 21, 259–287. doi:10.1016/0001-8791(82)90036-7

45. Roberts J. S., Donoghue J. R., Laughlin J. E. (2000). A general item response theory model for unfolding unidimensional polytomous responses. Applied Psychological Measurement, 24, 3–32.

46. Rosenberg (1965). Society and the adolescent self-image. Princeton, NJ: Princeton University Press.

47. Russell J. A. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39, 1161–1178.

48. Russell J. A., Bullock (1985). Multidimensional scaling of emotional facial expressions: Similarity from preschoolers to adults. Journal of Personality and Social Psychology, 48, 1290–1298. doi:10.1037/0022-3514.48.5.1290

49. Russell J. A., Carroll J. M. (1999). On the bipolarity of positive and negative affect. Psychological Bulletin, 125, 3–30.

50. Samuel D. B., Tay (2018). Aristotle’s golden mean and the importance of bipolarity for personality models: A commentary on “Personality traits and maladaptivity: Unipolarity versus bipolarity.” Journal of Personality. Advance online publication. doi:10.1111/jopy.12383

51. Schmitt (1996). Uses and abuses of coefficient alpha. Psychological Assessment, 8, 350–353.

52. Schwartz S. H., Bilsky W. (1987). Toward a universal psychological structure of human values. Journal of Personality and Social Psychology, 53, 550–562.

53. Schwarz (1999). Self-reports: How the questions shape the answer. American Psychologist, 54, 93–105.

54. Schwarz, Knauper, Hippler H.-J., Noelle-Neumann, Clark (1991). Rating scales: Numeric values may change the meaning of scale labels. Public Opinion Quarterly, 55, 570–582.

55. Searle J. R. (1983). Intentionality: An essay in the philosophy of mind. Cambridge, England: Cambridge University Press.

56. Seligman M. E. P., Csikszentmihalyi (2000). Positive psychology: An introduction. American Psychologist, 55, 5–14.

57. Tay, Drasgow (2012). Theoretical and statistical issues in the assessment of construct dimensionality: Accounting for the item response process. Organizational Research Methods, 15, 363–384.

58. Tay, Drasgow, Rounds, Williams (2009). Fitting measurement models to vocational interest data: Are dominance models ideal? Journal of Applied Psychology, 94, 1287–1304. doi:10.1037/a0015899

59. Tay, Kuykendall (2017). Why self-reports of happiness and sadness may not necessarily contradict bipolarity: A psychometric review and proposal. Emotion Review, 9, 146–154.

60. Tay, Rounds (2011). People–things and data–ideas: Bipolar dimensions? Journal of Counseling Psychology, 58, 424–440. doi:10.1037/a0023488

61. Thurstone L. L. (1927a). A law of comparative judgment. Psychological Review, 34, 273–286.

62. Thurstone L. L. (1927b). Psychophysical analysis. American Journal of Psychology, 38, 368–389.

63. Thurstone L. L. (1928). Attitudes can be measured. American Journal of Sociology, 33, 529–554.

64. Torgerson W. S. (1958). Theory and methods of scaling. New York, NY: Wiley.

65. van Schuur W. H., Kiers H. A. L. (1994). Why factor analysis often is the incorrect model for analyzing bipolar concepts and what model to use instead. Applied Psychological Measurement, 18, 97–110.

66. van Sonderen, Sanderman, Coyne J. C. (2013). Ineffectiveness of reverse wording of questionnaire items: Let’s learn from cows in the rain. PLOS ONE, 8, Article e68967. doi:10.1371/journal.pone.0068967

67. Watson, Clark L. A., Tellegen (1988). Development and validation of brief measures of positive and negative affect: The PANAS scales. Journal of Personality and Social Psychology, 54, 1063–1070.

68. Wiese C. W., Tay, Duckworth A. L., D’Mello, Kuykendall, Hofmann, . . . Vohs K. D. (2018). Too much of a good thing? Exploring the inverted-U relationship between self-control and happiness. Journal of Personality, 86, 380–396. doi:10.1111/jopy.12322

69. Wiggins J. S. (1982). Circumplex models of interpersonal behavior in clinical psychology. In Kendall P. C., Butcher J. N. (Eds.), Handbook of research methods in clinical psychology (pp. 183–221). New York, NY: Wiley.

70. Wilson M. R. (2005). Constructing measures: An item response modeling approach. Mahwah, NJ: Erlbaum.

71. Zhang, Noor, Savalei (2016). Examining the effect of reverse worded items on the factor structure of the Need for Cognition scale. PLOS ONE, 11(6), Article e0157795. doi:10.1371/journal.pone.0157795

Establishing Construct Continua in Construct Validation: The Process of Continuum Specification
