
Abstract

Background

Scales for evaluating insomnia differ in number of items and response format, and therefore yield different score distributions and score ranges, which may not facilitate meaningful comparisons.

Objectives

Transform ordinal item-scores of three scales of insomnia to continuous, equidistant, monotonic, normally distributed scores, avoiding limitations of summative scoring of Likert scales.

Methods

Equidistant item-scores were obtained as weighted sums, using data-driven weights for different levels of different items based on cell frequencies of the Item-Levels matrix, followed by normalization and conversion to [1, 10]. Equivalent test-scores (as sum of transformed item-scores) for a pair of scales were found using Normal probability curves. An empirical illustration is given.

Results

Transformed test-scores are continuous and monotonic, and followed Normal distribution with no outliers or tied scores. Such test-scores facilitate ranking, better classification, meaningful comparison of scales of different lengths and formats, and finding equivalent score combinations of two scales. For a given value of transformed test-score of a scale, an easy alternate method avoiding integration is proposed to find equivalent scores of other scales. Equivalent scores help to relate various cut-off scores of different scales and bring uniformity in interpretations. Integration of various scales of insomnia is achieved by finding one-to-one correspondence among the equivalent scores of various scales, with correlation over 0.99.

Conclusion

Resultant test-scores facilitated analysis in the parametric set up. Considering the theoretical advantages, including meaningfulness of operations and better comparison, use of such a method of transforming scores of Likert items and tests is recommended. Future studies are suggested.

Keywords

Likert items, weighted sum, Z-scores, normal distribution, equivalent scores

Introduction

A number of patient-reported scales are used for screening, assessing insomnia severity, monitoring progress, evaluating treatment efficiency, etc. to assist investigators and clinicians in evaluating multidimensional insomnia. Commonly used scales in this context include Insomnia Severity Index (ISI),¹ Pittsburgh Sleep Quality Index (PSQI),² Athens Insomnia Scale,³ etc. None of these scales gives a well-defined dichotomous outcome which more or less tallies with clinical interviews for epidemiological studies.⁴ Insomnia Symptom Questionnaire (ISQ)⁵ focuses on symptoms of insomnia which are based on DSM-IV and RDC criteria. High agreement of ISQ with clinical interviews tends to suggest better diagnosis of insomnia.⁶ However, most of the scales do not meet the specific needs of investigators conducting epidemiological or clinical trial studies.⁷ Major particulars of illustrative popular scales for Insomnia are given in Table 1.

Table 1.

Illustrative scales for insomnia.

Scale	Number and type of items	Range of score	Major purposes
ISI	7 – Likert items, each with 5 response categories marked as 0 to 4	0–28	To screen and assess degree of severity of both nighttime and daytime aspects of insomnia.⁸ Persons scoring $<$ 14 are taken as Normal and those scoring more than 14 to be considered as having insomnia.⁹
PSQI	Total 19-items. First four items are open. Each of Item 5–19 is in 4-point scale	0–21	To evaluate sleep quality and disturbances over the past month (a construct related to insomnia but broader than insomnia severity). A score > 5 and a score > 10 imply poor sleep quality respectively in depressed and normal sleepers and among people with insomnia.¹⁰ PSQI is not designed to assess insomnia based on diagnostic criteria.⁷
ISQ	13 items. Items 1–5 are 6-point from 0 to 5 and Items 6–13 are 5-point from 0 to 4	Items 1, 2 and 5: to determine presence, frequency & duration of sleep symptom criteria. Items 6–13: to identify significant daytime consequences of the sleep complaints	Ascertains symptoms of insomnia that are based on DSM-IV and RDC criteria.¹⁰ It does not assess sleep disturbance due to another sleep disorder, mental disorder, or substance or general medical condition (criteria C-E in DSM-IV for primary insomnia). The participants are not asked whether they had adequate opportunity to sleep (RDC criteria).

Each scale considers summative score which assumes (i) equal importance to all the items and (ii) satisfaction of equidistant property. Both the assumptions may not be justified.¹¹ The scales differ in terms of number of items, response format, etc. and result in different scores with different distributions and different score ranges, and may not facilitate meaningful comparisons. Different scale formats distorted results of satisfaction surveys.¹² Test parameters like reliability, validity, and discriminating power indices are lower for 2-point, 3-point, and 4-point scales than for scales with higher number of levels.¹³ Most of the insomnia scales consider zero as an anchor value, which may distort the mean, standard deviation (SD), skewness, and kurtosis of scales.¹⁴,¹⁵ Too many zero responses to an item lower the mean and variance of the item, as well as covariance and correlation with that item. Techniques like Principal component analysis (PCA), Confirmatory Factor analysis (CFA), Structural equation modeling (SEM), etc. are sensitive to the characteristics of the data.¹⁶ Effects of the numerical values attached to various response categories have been investigated.¹⁷,¹⁸

Consider a situation where two scales (say ISI and PSQI) have been administered to the same sample. Different numbers of items (7 for ISI and 15 for PSQI, excluding the four open questions) and different numbers of response categories (5 for ISI and 4 for PSQI) will give different values of mean and SD for ISI and PSQI. In fact, the distributions of ISI scores and PSQI scores will be different and the two sets of scores may not be comparable. Higher number of response categories results in increased values of mean and SD.¹⁹ Thus, the different cut-off scores for ISI and PSQI may not be equivalent. In other words, the set of persons with ISI score $<$ 14 may differ from the set of persons with PSQI score > 5 in depressed and normal sleepers, or > 10 among people with insomnia.

Equivalent score combinations of two scales $(X_{0}, Y_{0})$ can be defined as

\int_{- \infty}^{X_{0}} f (X) d X = \int_{- \infty}^{Y_{0}} g (Y) d Y

(1)

where the score $X_{0}$ in a scale with k-number of response categories with density function $f (X)$ is equivalent to the score $Y_{0}$ in a scale with $(k \pm c)$ number of response categories with density function $g (Y)$, for $c > 0$ being a positive integer. Difficulties of approximating discrete scores by integrable continuous functions $f (X)$ and $g (Y)$, along with the issue of goodness of fit, can be avoided if raw scores are transformed in such a fashion as to ensure that transformed scores of the two scales follow normal distributions with same or different parameters. Condition (1) ensures that area under $f (X)$ up to $X_{0}$ = area under $g (Y)$ up to $Y_{0}$. Normality of transformed scores will make it easy to find equivalent scores of two different scales.
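As a worked step, suppose the transformed scores satisfy $X \sim N (μ_{X}, {σ_{X}}^{2})$ and $Y \sim N (μ_{Y}, {σ_{Y}}^{2})$; the symbols $μ_{X}, σ_{X}, μ_{Y}, σ_{Y}$ and $Φ$ (the N (0, 1) distribution function) are introduced here only for illustration. Under that assumption, condition (1) reduces to matching standardized scores, because $Φ$ is strictly increasing:

```latex
\Phi\!\left(\frac{X_{0} - \mu_{X}}{\sigma_{X}}\right)
  = \Phi\!\left(\frac{Y_{0} - \mu_{Y}}{\sigma_{Y}}\right)
\;\Longrightarrow\;
\frac{X_{0} - \mu_{X}}{\sigma_{X}} = \frac{Y_{0} - \mu_{Y}}{\sigma_{Y}}
\;\Longrightarrow\;
Y_{0} = \mu_{Y} + \sigma_{Y}\,\frac{X_{0} - \mu_{X}}{\sigma_{X}}
```

So, when both distributions are normal, no integration is needed: the equivalent score sits at the same number of SDs from its own mean.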

The above motivates the need to transform the raw scores of self-reported scales of insomnia using Likert items with different numbers of response categories, so that transformed item scores satisfy the equidistant property and other desirable properties such as being continuous, monotonic and normally distributed. Test scores of each scale, as sum of such transformed item scores, may be rescaled further to have a desired score range, say 1 to 10, and facilitate finding equivalent score combinations $(X_{0}, Y_{0})$ satisfying condition (1).

Rest of the paper is organized as follows. After literature survey, methodology of the proposed method and properties are described. The following section deals with empirical verifications to the suggested methods. The paper is rounded up by recalling the salient outcomes and emerging suggestions.

Literature survey

A simple way to equate a 5-point (1 to 5) item with a 7-point (1 to 7) item is to use linear transformation of levels where the extreme numbers coincide and all the intermediate options are given equally distanced numbers in between. In other words, a k-point scale is converted to a (k + c)-point scale by multiplying each k-point score by $\frac{(k + c)}{k}$, i.e. proportional transformation. However, problems exist for such transformations. For example, if a 3-point item is converted to 5-point in this fashion, i.e. equating 3 with 5, the converted item has only three positive values attached to response categories and strictly speaking cannot be taken as a 5-point item. The formula $\frac{(\text{Score} - 1)}{(\text{No. of response categories} - 1)} \times 100$ has been used to rescale to a common score out of 100.¹³ Linear transformation to the percentage of scale maximum has also been used.²⁰,²¹ The converted scores and raw scores will have equal mean and SD, if Z-scores are used. However, Z-scores involving mean and SD for discrete ordinal Likert items may not be meaningful due to non-satisfaction of equidistant property.²²⁻²⁴ Moreover, participants perceive Likert items as non-equidistant.²⁵

A large volume of literature has dealt with conversion of scores of a k-point scale to a $(k \pm c)$-point scale, showing the intensity of associated problems.²⁶,²⁷ Reduction in the value of k by collapsing consecutive levels distorts covariance structures, which in turn affects the factor structure of the scale. It may be noted that equating is not forecasting and equating is symmetric,²⁷ i.e. if $X_{0}$ is equivalent to $Y_{0}$, i.e. $X_{0} ⟺ Y_{0}$, then $Y_{0} ⟺ X_{0}$. Thus, equating and forecasting are different methods. Predicted values of a 5-point scale by linear regression on a 7-point scale and vice versa, i.e. a pair of regression equations of the form $X_{7} = α_{1} + β_{1} X_{5}$ and $X_{5} = α_{2} + β_{2} X_{7}$, cannot be taken as equivalent scores.²⁸ Model-driven IRT does not work well to detect individual changes for tests with fewer than 20 items.²⁹ Studies on equating scores considered several samples for smoothing of raw data to avoid irregularities, and equating designs for data collection. However, different smoothing methods and equating designs exist, each having its advantages and disadvantages. The present context demands a single sample who responded to k-point and also $(k \pm c)$-point Likert items of questionnaires, and need not deal with issues relating to smoothing or equating design.

Instead of mapping the numerical values attached to the end points, attempts can be made to map or equate raw scores of a k-point item to, say, a $(k \pm c)$-point item ensuring similarity in distributions, say normal distribution of transformed scores. This will also facilitate undertaking statistical techniques used in the parametric set up.

Proposed method

Stage 1 – Modification of anchor values

Assign 1 – 5 to the response categories of 5-point Likert items instead of 0 – 4. This avoids the problems arising out of considering zero anchor value in items of various scales for insomnia. Moreover, consideration of zero as an anchor value does not allow taking expected value of an item which is defined as product of anchor values and corresponding probabilities of the response categories.

Stage 2 – Raw scores (X) to equidistant scores (E)

This involves transforming discrete ordinal raw scores of an item (X) (where response categories are marked as 1, 2, 3, …, k for a k-point scale) to continuous scores which are equidistant. A method of such conversion takes a weighted sum where different weights are assigned to different response categories of different items.³⁰ The basic idea is that, if response categories are marked as 1, 2, 3, 4 and 5 (for a 5-point item) and the finally selected weights for the i-th item are $W_{i 1}, W_{i 2}, W_{i 3}, W_{i 4}$ and $W_{i 5}$, then $W_{i j} > 0$, $\sum_{j = 1}^{5} W_{i j} = 1$ and $W_{i 1}, 2 W_{i 2}, 3 W_{i 3}, 4 W_{i 4}, 5 W_{i 5}$ form an Arithmetic Progression (AP). The approach generates equidistant scores (E) which are continuous and monotonic (higher E-score indicates higher insomnia severity) with a fixed zero point. In addition, such equidistant scores avoid tied scores, which are common in usual summative Likert scores, and thus increase discriminating power of the test.
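The two constraints stated above (weights summing to 1 and $j W_{i j}$ forming an AP) already pin down the common difference once the first weight is known. The sketch below illustrates only that arithmetic; it is not the full weighting procedure of reference 30. How the first weight is obtained from the frequencies $f_{i j}$ is not reproduced here, so `first_weight` is simply taken as an input, and the function name is hypothetical. The example value 0.182809 is the first E-score of ISI item 1 in Table 2.

```python
def equidistant_scores(first_weight, k=5):
    """Return (weights, e_scores, common_difference) for one k-point item.

    Assumes j * W_j = first_weight + (j - 1) * d for j = 1..k (an AP)
    and sum(W_j) = 1; together these determine d.
    """
    harmonic = sum(1.0 / j for j in range(1, k + 1))  # H_k = 1 + 1/2 + ... + 1/k
    # sum_j (a + (j - 1) d) / j = a * H_k + d * (k - H_k) = 1, solved for d
    d = (1.0 - first_weight * harmonic) / (k - harmonic)
    e_scores = [first_weight + (j - 1) * d for j in range(1, k + 1)]
    weights = [e / j for j, e in enumerate(e_scores, start=1)]
    if min(weights) <= 0:
        raise ValueError("first_weight leads to a non-positive weight")
    return weights, e_scores, d


weights, e_scores, d = equidistant_scores(0.182809)
print(round(d, 6))
print([round(e, 6) for e in e_scores])
print(round(sum(weights), 6))
```

For this input the common difference and E-scores agree with the item 1 row of Table 2 to about five decimal places, which suggests that the tabulated E-score of category j is $j W_{i j}$.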

Stage 3 – Normalization

Normalize equidistant scores by

Z_{i j} = \frac{E_{i j} - Mean (E_{i})}{S D (E_{i})} \sim N (0, 1)

Here, $- \infty < Z_{i j} < \infty$ and the test score as sum of normalized equidistant item-scores will also follow Normal distribution with zero mean and SD = $\sqrt{\sum_{i} Var (Z_{i}) + 2 \sum_{i < j} Cov (Z_{i}, Z_{j})}$ .

Stage 4 – Z-scores to Y in a desired score range

To avoid negative scores, convert the Z-scores to Y-scores so that Y has a fixed score range, say 1 to 10, by using the following linear transformation:

Y = (10 - 1) \left[\frac{Z_{i j} - Min (Z_{i j})}{Max (Z_{i j}) - Min (Z_{i j})}\right] + 1

(2)
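Stages 3 and 4 can be sketched for a single item as follows. This is a minimal illustration using only the Python standard library; the function name and the sample E-scores are hypothetical, and the population SD is an assumption (the choice of population or sample SD does not affect Y, since the min-max step in (2) cancels any positive scale factor).

```python
from statistics import mean, pstdev


def rescale_item(e_scores, low=1.0, high=10.0):
    """Standardize E-scores of one item, then map the Z-scores to [low, high]."""
    m, s = mean(e_scores), pstdev(e_scores)
    if s == 0:
        raise ValueError("all E-scores are identical; Z-scores are undefined")
    z = [(e - m) / s for e in e_scores]                # Stage 3: Z-scores
    z_min, z_max = min(z), max(z)
    # Stage 4: linear map sending z_min to low and z_max to high, as in (2)
    return [(high - low) * (v - z_min) / (z_max - z_min) + low for v in z]


# Hypothetical respondents, one at each E-score level of ISI item 1 (Table 2).
y = rescale_item([0.182809, 0.397258, 0.611707, 0.826156, 1.040605])
print([round(v, 3) for v in y])
```

Because both steps are linear, equidistant E-scores stay equidistant after rescaling.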

Note that the test score as sum of normally distributed item-wise $Y$ -scores from (2) will also follow normal distribution, where the variance will depend on correlations between Y-scores of pairs of items. Positive valued $Y$ -score $\sim N (μ, σ^{2})$ with sample size n, helps in:

Assigning unique ranks to individuals

Computing sample mean ( $\bar{Y})$ and variance ( $S^{2})$ for a sample, estimating population mean $μ$ and variance $σ^{2}$, and testing $H_{0} : μ_{1} = μ_{2}$ or $H_{0} : {σ^{2}}_{1} = {σ^{2}}_{2}$

Computing 95% confidence limits of $μ$ as $\bar{Y} \pm 1.96 (\frac{σ}{\sqrt{n}})$ for large sample, where $\bar{Y} \sim N (μ, \frac{σ^{2}}{n})$

Fixed range of Y for each scale following Normal distribution helps to make ready comparisons and to find equivalent Y-scores of two scales using condition (1)

Meaningful comparison of scales of different lengths and different formats, and parametric analysis after checking the additional assumptions of the respective parametric techniques, other than normality which is already ensured.
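The large-sample confidence limits in the list above are a one-line computation. The sketch below assumes the sample variance is used as a plug-in for $σ^{2}$; the function name is hypothetical, and the inputs are the summary values reported for the ISI Y-score in Table 4 (n = 101).

```python
from math import sqrt


def ci95_from_summary(sample_mean, sample_variance, n):
    """Large-sample 95% confidence limits for the population mean."""
    half_width = 1.96 * sqrt(sample_variance) / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# ISI Y-score: mean 33.75803, variance 115.3829, n = 101 (Table 4).
lower, upper = ci95_from_summary(33.75803, 115.3829, 101)
print(round(lower, 2), round(upper, 2))
```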

Correlation between X and Y ( $r_{X Y})$ will be significant, since Y is obtained from X by weighted sum followed by linear transformations. Such linearity will ensure similar factor structures of a scale. However, if a test is not uni-dimensional, test reliability by Cronbach $α$ may not be suitable. Similarly, $r_{Y (Scale 1), Y (Scale 2)}$ will be almost the same as $r_{X (Scale 1), X (Scale 2)}$.

Classification

The total ISI score is interpreted as follows: absence of insomnia (0–7); sub-threshold insomnia (8–14); moderate insomnia (15–21); and severe insomnia (22–28).⁷ A cut-off score of 14 was recommended, implying persons with score $<$ 14 are taken as Normal and those with score $>$ 14 are taken as having insomnia.³¹ Such a cut-off score has been used by other researchers.⁹ With the change of markings of the response categories, the cut-off scores of “No Insomnia” of ISI and PSQI may now be taken as 15 and 5 (in depressed and normal sleepers) respectively. Classification with a cut-off score requires, among other things, that persons in a class should be alike, i.e. variance within the class (within group) should be small and between-group variance should be high.

Among various methods of classification aiming at partitioning the sample, Quartile clustering is more appealing for easy interpretation and distinct semantics.³² Advantages of quartile clustering with respect to Y $\sim N (μ, σ^{2})$ are:

Well-defined cut-off scores for the four classes.

Assigns equal probability to each quartile/class

Identifies outliers by those $< Q_{1}$ − 1.5 IQR or $> Q_{3}$ + 1.5 IQR where inter-quartile range IQR = $Q_{3} - Q_{1}$
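Quartile clustering and the 1.5 IQR outlier rule listed above can be sketched with the standard library. The scores are hypothetical, the function name is illustrative, and `statistics.quantiles` is called with its default "exclusive" method, which is an assumption; the paper does not state which quartile convention was used.

```python
from statistics import quantiles


def quartile_summary(scores):
    """Return quartile cut-offs, IQR, outliers and a class (1-4) per score."""
    q1, q2, q3 = quantiles(scores, n=4)       # three cut points for four classes
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [s for s in scores if s < lower or s > upper]

    def quartile_class(s):
        return 1 if s <= q1 else 2 if s <= q2 else 3 if s <= q3 else 4

    return (q1, q2, q3), iqr, outliers, [quartile_class(s) for s in scores]


# Hypothetical test scores of 11 persons; 40 is deliberately extreme.
cuts, iqr, outliers, classes = quartile_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 40])
print(cuts, iqr, outliers, classes)
```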

Equivalent scores

Let the quartile cut-off X-scores of ISI be $X_{Q 1}, X_{Q 2}, X_{Q 3}$ and $X_{Q 4}$, and the corresponding cut-off scores of the transformed score (Y) be $Y_{Q 1 (ISI)}, Y_{Q 2 (ISI)}, Y_{Q 3 (ISI)}$ and $Y_{Q 4 (ISI)}$. Let $Y_{ISI} \sim N (μ_{ISI}, {σ_{ISI}}^{2})$. Using condition (1), the scores $Y_{Q i (PSQI)}$ and $Y_{Q i (ISQ)}$ equivalent to $Y_{Q i (ISI)}$ can be found, where $Y_{PSQI} \sim N (μ_{PSQI}, {σ_{PSQI}}^{2})$ and $Y_{ISQ} \sim N (μ_{ISQ}, {σ_{ISQ}}^{2})$. Alternately, if Y-scores (with zero ties) of each scale are arranged in increasing order, the scores in the i-th position of the three scales, i.e. $Y_{i (ISI)}$, $Y_{i (PSQI)}$ and $Y_{i (ISQ)}$, will be equivalent, since each represents the same number of persons up to the score $Y_{i (j)}$, j = ISI, PSQI, ISQ, i.e. the same area up to $Y_{i (j)}$ under the normal density function of the respective Y, and hence the same relative position in the sample. Correlation between equivalent scores of two scales will be almost perfect. Similar procedures of obtaining equivalent scores of the three scales may be adopted for percentile scores also.
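The alternate approach described above amounts to sorting each scale separately and reading across positions. A minimal sketch, with a hypothetical function name and hypothetical Y-scores for five persons, assuming every scale was scored on the same sample and has no ties:

```python
def rank_equivalents(**scales):
    """Pair the i-th smallest score of every scale into one row."""
    sizes = {len(values) for values in scales.values()}
    if len(sizes) != 1:
        raise ValueError("all scales must be scored on the same sample")
    names = list(scales)
    ordered = [sorted(scales[name]) for name in names]
    return [dict(zip(names, row)) for row in zip(*ordered)]


# Hypothetical Y-scores of five persons on three scales.
table = rank_equivalents(
    ISI=[33.1, 25.6, 41.9, 18.2, 52.4],
    PSQI=[64.5, 48.9, 80.6, 30.3, 99.7],
    ISQ=[35.8, 26.6, 45.4, 20.1, 58.0],
)
for row in table:
    print(row)
```

Each row holds scores with the same rank, and hence the same empirical cumulative proportion, in their respective scales. The high correlation across rows is expected, because each column is sorted.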

$Y_{i (ISI)}$ corresponding to $X_{i}$ implying “No Insomnia” can be used for dichotomous outcomes of ISI and can also be extended to find $Y_{i (PSQI)}$ and $Y_{i (ISQ)}$ which are equivalent to $Y_{i (ISI)}$. Thus, equivalent scores of the scales with different number of items and response categories can be used for screening and also assessing insomnia severity without any assumptions of distribution of raw scores of observed/underlying variables.

Empirical illustration

Data used for empirical illustration of the proposed method and its properties are hypothetical data of 101 persons on 7 ISI items (k = 5), 15 PSQI items (k = 4) and 8 ISQ items (k = 5, excluding the five 6-point items).

Let $f_{i j}$ denote the frequency of the j-th response category of the i-th item. Equidistant scores based on the $f_{i j}$ values assigned different weights to the response categories of different items. The raw scores and resultant equidistant scores are illustrated in Table 2 for the 7 ISI items (5-point):

Table 2.

Raw scores and equidistant scores.

ISI – items	Item no.	Item scores					Common difference
Raw scores (X)	Each item	1	2	3	4	5	Unknown & Different
Corresponding equidistant scores(E)	1	0.182809	0.397258	0.611707	0.826156	1.040605	0.214449
	2	0.060995	0.377827	0.69466	1.011493	1.328326	0.316833
	3	0.080187	0.380889	0.68159	0.982292	1.282994	0.300702
	4	0.131031	0.388999	0.646966	0.904934	1.162902	0.257968
	5	0.165741	0.394535	0.62333	0.852124	1.080919	0.228795
	6	0.102076	0.38438	0.666684	0.948989	1.231293	0.282304
	7	0.150146	0.392048	0.63395	0.875852	1.117753	0.241902

Observations

E-scores are continuous and equidistant. However, common difference is different for different items.

Weights used for E-score are based on data-driven empirical probabilities, obtained from the basic Item-Response category matrix.

$f_{i j} = 0$ for a particular j-th level of an item can be taken as a zero value when scoring Likert items as weighted sum.

E or Y had no tied scores. E and Y of 9 persons, each with X = 14 in ISI-scale, are shown in Table 3.

Table 3.

Zero ties in equidistant score and Normalized score in [1, 10].

	Raw score in ISI scale								Equidistant score (E)	Normalized score in [1, 10] (Y)
Sl. no.	Item 1	Item 2	Item 3	Item 4	Item 5	Item 6	Item 7	Total (X)	Equidistant score (E)	Normalized score in [1, 10] (Y)
11	1	2	2	2	2	3	2	14	2.828848	41.42427
42	3	1	1	2	3	2	2	14	2.698082	41.98674
44	2	1	3	4	2	1	1	14	2.90011	43.07892
51	1	3	2	3	2	2	1	14	2.922993	42.45211
52	2	4	2	2	1	1	2	14	3.153273	44.05782
58	2	1	1	4	3	1	2	14	2.631927	41.89213
71	3	1	2	1	3	2	2	14	3.680913	48.43073
90	1	3	2	2	2	3	1	14	3.015016	42.63044
98	5	2	1	2	1	1	2	14	2.861159	43.15336
Mean								14	2.965813	43.23406
SD								0	0.309829	2.101258

Observations

For two different persons i and j, $E_{i} \neq E_{j}$ and $Y_{i} \neq Y_{j}$ even if $X_{i} = X_{j} = 14$.

Raw score failed to discriminate among the persons with tied score. $\bar{E} \neq \bar{Y} \neq 14$, with SD(E) = 0.31 and SD(Y) = 2.10 for the 9 persons with X = 14.

Descriptive statistics

Mean, variance, range, co-efficient of variation (CV = $\frac{S D}{Mean}$) and outliers ($> Q_{3} + 1.5 IQR$) of X, E and Y for different scales are shown in Table 4.

Table 4.

Descriptive statistics of test scores.

Description	Raw test score (X)	Equidistant scores (E)	Z-scores converted to [1, 10] (Y)
ISI – 7 items marked as 1 to 5
Mean	17.77228	3.75823	33.75803
Variance	26.15762	1.474143	115.3829
Observed range of test score	7–32	1.121599 to 7.294292	10.74031 to 63.97828
CV	0.287777	0.323062	0.318195
No. of outliers	3		Nil
PSQI – 15 items marked as 1 to 4
Mean	30.90099	8.050117	64.78584
Variance	64.0101	4.915427	438.8419
Observed range of test score	16–54	3.960579 to 14.11467	26.03665 to 117.8745
CV	0.258912	0.275409	0.323351
No. of outliers	1		Nil
ISQ – 8 items marked as 1 to 5
Mean	19.9703	4.116968	36.02666
Variance	34.30911	1.458891	152.7868
Observed range of test score	8–33	1.667254 to 6.689946	10.95747 to 62.13217
CV	0.293305	0.293382	0.343099
No. of outliers	Nil		Nil

Observations:

Mean, variance of X and Y differed for scales with different number of items and k.

E-score made the data homogeneous and reduced range of test score for each scale.

Y-score had higher mean and variance than the X-scores for each scale and will increase further if score range is increased to say [1, 100] from [1, 10]. Variance (Y) was highest for PSQI due to higher number of items; despite lesser value of k. High variance (Y) indicates positive correlations between pair of items in each scale.

CV(Y) i.e. SD per mean implying consistency, was marginally higher than CV (X) for each scale. CV(Y) fluctuated in an extremely narrow range, 0.32 (ISI) to 0.34 (ISQ). Almost equal CV indicates that number of response categories may not have much influence on variation about the mean for Y.

Y-score of a scale had no outliers. But X-score of ISI and PSQI had 3 and 1 outliers respectively

Unlike X and E, Y followed Normal distribution with estimated population parameters as follows:

Y_{ISI} \sim N (33.75803, {10.74164}^{2})

Y_{PSQI} \sim N (64.78584, {20.94855}^{2})

Y_{ISQ} \sim N (36.02666, {12.3607}^{2})

Correlations

Weighted sum to get E from X resulted in marginal deviation from perfect correlation between X and E. However, $r_{X Y} \approx r_{X E}$ due to linear transformations used to get Y from E. Empirically, $r_{X E} > 0.98$ and $r_{X Y} > 0.99$ for ISI, PSQI and ISQ. High value of $r_{X Y}$ resulted in high correlation between a pair of scales and similar factor structure as shown in Table 5.

Table 5.

Correlation matrix between scales and number of independent factors.

	ISI	PSQI	ISQ	Independent factors (from PCA)
	ISI	PSQI	ISQ	No.	Cumulative variance explained (%)
ISI	1	0.752886 (0.737328)	0.946784 (0.932666)	2 (2)	48.839 (46.975)
PSQI		1	0.662935 (0.650895)	5 (5)	59.066 (56.941)
ISQ			1	3 (3)	61.081 (60.627)

Note: Figures without brackets and within brackets are respectively for X-scores and Y-scores.

None of ISI, PSQI and ISQ was uni-dimensional. Thus, test reliability by Cronbach alpha may not be suitable.

Classification

The sample was classified into four classes (Quartiles) based on X and Y separately for each scale. Results are given in Table 6.

Table 6.

Quartiles and boundary values.

	ISI		PSQI		ISQ
	X	Y	X	Y	X	Y
Q1	14	25.5947	25	47.38857	16	27.05627
Q2	18	33.59466	31	65.03262	21	37.83854
Q3	21	41.86345	37	78.63943	24	44.81802
Q4	32	63.97828	54	117.8745	33	62.13217
IQR= Q3–Q1	7	16.26875	11	31.25086	8	17.76175

Observations

Higher value of IQR for Y is due to higher spread of the Y.

Maximum IQR for Y of PSQI is due to maximum SD of $20.94855$

Quartiles of a scale considering X differed from the same with respect to Y. For example, for ISI, the numbers of persons under $Q_{i}$ were 32, 20, 24 and 25 for X and 26, 25, 25 and 25 for Y for i = 1, 2, 3 and 4 respectively. Thus, normally distributed Y-score assigned the same probability to each quartile, unlike X-score.

Unique ranks to individuals in Y-score.

For ISI, cut-off score of X and corresponding cut-off score of Y were as follows:

X	14	18	21	32
Y	25.5947	33.59466	41.86345	63.97828

Equivalent scores

For a given value of $Y_{0 (ISI)}$ , the equivalent score $Y_{0 (PSQI)}$ and $Y_{0 (ISQ)}$ can be found using the Normal Probability Table to solve condition (1) so that $Y_{0 (ISI)} \Leftrightarrow Y_{0 (PSQI)} \Leftrightarrow Y_{0 (ISQ)}$

$Y_{PSQI}$ -score equivalent to $Y_{ISI}$ -score of 25.5947 was found by solving (3) for $Y_{0 PSQI (Q 1)}$

\int_{- \infty}^{25.5947} \frac{1}{10.74164 \sqrt{2 π}} e^{- \frac{{(Y - 33.75803)}^{2}}{2 {(10.74164)}^{2}}} d Y = \int_{- \infty}^{Y_{0 PSQI (Q 1)}} \frac{1}{20.94855 \sqrt{2 π}} e^{- \frac{{(Y - 64.78584)}^{2}}{2 {(20.94855)}^{2}}} d Y

(3)
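Condition (3) can also be solved numerically instead of with a probability table. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library and the parameter estimates reported above; the helper name `equivalent_score` is illustrative and not from the paper.

```python
from statistics import NormalDist

# Estimated distributions of the transformed test scores (from the text above).
Y_ISI = NormalDist(mu=33.75803, sigma=10.74164)
Y_PSQI = NormalDist(mu=64.78584, sigma=20.94855)
Y_ISQ = NormalDist(mu=36.02666, sigma=12.3607)


def equivalent_score(score, source, target):
    """Score on `target` with the same cumulative area as `score` on `source`."""
    return target.inv_cdf(source.cdf(score))


q1_isi = 25.5947  # first quartile of the ISI Y-score
psqi_q1 = equivalent_score(q1_isi, Y_ISI, Y_PSQI)
isq_q1 = equivalent_score(q1_isi, Y_ISI, Y_ISQ)
print(round(Y_ISI.cdf(q1_isi), 4), round(psqi_q1, 2), round(isq_q1, 2))
```

Mapping back with `equivalent_score(psqi_q1, Y_PSQI, Y_ISI)` returns the original ISI score, which reflects the symmetry of equating noted in the literature survey.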

Following similar procedure, equivalent scores for quartiles were found and shown in Table 7.

Table 7.

Equivalent test scores for quartiles.

Quartile $Y_{ISI (Q_{i})}$ and Z-value	Area under $f (Y_{ISI})$ up to $Y_{ISI (Q_{i})}$ (= area under N (0, 1) up to Z)	Equivalent $Y_{PSQI (Q_{i})}$	Equivalent $Y_{ISQ (Q_{i})}$
Q1 = 25.5947, Z = $\frac{(25.5947 - 33.75803)}{10.74164}$ = $-$ 0.75997	0.5 − 0.2764 = 0.2236	$64.78584 + (- 0.75997) (20.94855) =$ 48.87	$36.02666 + (- 0.75997) (12.3607) =$ 26.63
Q2 = 33.5947, Z = $\frac{(33.5947 - 33.75803)}{10.74164}$ = $-$ 0.01521	0.5 − 0.0061 = 0.4939	$64.78584 + (- 0.01521) (20.94855) =$ 64.47	$36.02666 + (- 0.01521) (12.3607) =$ 35.84
Q3 = 41.8635, Z = $\frac{(41.8635 - 33.75803)}{10.74164}$ = 0.754579	0.5 + 0.2747 = 0.7747	$64.78584 + (0.754579) (20.94855) =$ 80.59	$36.02666 + (0.754579) (12.3607) =$ 45.35
Q4 = 63.9783, Z = $\frac{(63.9783 - 33.75803)}{10.74164}$ = 2.813374	0.5 + 0.4975 = 0.9975	$64.78584 + (2.813374) (20.94855) =$ 123.72	$36.02666 + (2.813374) (12.3607) =$ 70.80

Clearly, equivalent quartile score of $Y_{0 (PSQI)}$ and $Y_{0 (ISQ)}$ corresponding to $Y_{0 (ISI)}$ using the Normal probability table indicate $Y_{0 (ISI)} < Y_{0 (ISQ)} < Y_{0 (PSQI)} .$

The above table showing equivalent scores of the three scales is illustrative. Each Y-score of a scale following normal distribution has one-to-one correspondence with equivalent scores of other scales following Normal distribution with different parameters. In other words, for the i-th individual, $Y_{i (ISI)}$ corresponds uniquely to $Y_{i (PSQI)}$ and also to $Y_{i (ISQ)}$.

Y-scores (with zero ties) of each scale were arranged in increasing order, as per the alternate easier approach avoiding integration and the probability table of N (0, 1). Scores in the i-th row of the scales, i.e. $Y_{i (ISI)}$, $Y_{i (PSQI)}$ and $Y_{i (ISQ)}$, are equivalent since each represents the same number of persons up to the score in the i-th row, i.e. the same area up to that score under the normal density function of the respective Y. Correlation of equivalent scores exceeded 0.99.

Conclusions

The paper proposes a method of transforming discrete raw scores ( $X_{i}$ ) of the i-th item of each of the three scales of insomnia to $Y_{i}$, where $1 \leq Y_{i} \leq 10$ is continuous, equidistant, monotonic and normally distributed, without outliers or tied scores. The transformation procedure, involving a single sample, takes care of different numbers of items and different numbers of response categories of the three scales. The transformed test-score for ISI ( $Y_{ISI}$ ) is the sum of transformed item-scores of the ISI scale and follows normal distribution; $Y_{PSQI}$ and $Y_{ISQ}$ were similarly obtained. Different parameters of the distribution of Y-score of the scales were estimated from the data. Equivalent score combinations of two scales ( $Y_{0 (ISI)}, Y_{0 (PSQI)})$ were found so that area under $f (Y_{ISI})$ up to $Y_{0 (ISI)}$ = area under $g (Y_{PSQI})$ up to $Y_{0 (PSQI)}$, where $f (Y_{ISI})$ and $g (Y_{PSQI})$ denote respectively the normal density functions of $Y_{ISI}$ and $Y_{PSQI}$ .

Alternate easier approach of arranging Y-scores of each scale in increasing order to find quickly equivalent scores, established $Y_{i (ISI)} < Y_{i (ISQ)} < Y_{i (PSQI)} \forall i$ due to $\bar{Y_{ISI}} < \bar{Y_{ISQ}} < \bar{Y_{PSQI}}$ .

Correlation between a pair of such equivalent scores exceeded 0.99.

$Y_{i (ISI)}$ corresponding to cut-off score $X_{i}$ implying “No Insomnia” can be used for dichotomous outcomes of ISI and can be extended to PSQI and ISQ by finding $Y_{i (PSQI)}$ and $Y_{i (ISQ)}$ which are equivalent to $Y_{i (ISI)}$. Thus, equivalent scores of the scales can be used for screening and also assessing insomnia severity without any assumptions of distribution of raw scores of observed/underlying variables. However, such cut-off scores may be validated against clinical interviews. Similar interpretation may be derived for each cut-off score for sub-threshold insomnia, moderate insomnia and severe insomnia, along with computation of Prevalence Index as percentage of individuals above $Y_{i (ISI)} \Leftrightarrow Y_{i (PSQI)} \Leftrightarrow Y_{i (ISQ)}$ .

Y-score can be used for better ranking and classifying the individuals in the sample with respect to insomnia severity. Moreover, normality helps to undertake statistical analysis under parametric set up. The proposed approach of integrating Likert items with different numbers of response categories for different insomnia scales is relevant to practitioners and researchers in social sciences in general and epidemiological research in particular. Use of such methods of integrating several insomnia scales using Likert items is recommended for clear theoretical advantages and ease of calculation. Future studies may be undertaken with multiple data sets, along with issues relating to reliability and validity of the proposed transformation.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Satyendra Nath Chakrabartty


Integration of various scales for measurement of insomnia
