Abstract
Background. The impact of age on rehabilitation outcome after traumatic brain injury (TBI) as measured by changes in the Functional Independence Measure (FIM) has been addressed in several seemingly conflicting reports. Differences may be explained by different study populations and different ways of analyzing data. Objective. To investigate the role of data analysis in the interpretation of the age effect on rehabilitation outcome after TBI by comparing classical analyses of the total FIM score with a new item-wise analysis that unfolds the comprehensive amount of information contained in the FIM measurement otherwise concealed by the total score. Methods. We analyzed admission and discharge FIM data from 411 consecutive TBI patients admitted to inpatient rehabilitation during 1998-2011 by both methods. Results. The classical analysis indicated similar rehabilitation outcomes in the 18 to 39, 40 to 64, and 65+ years age groups, which could be explained by selection of strong elderly patients and/or methodological problems with classical data analyses, whereas the item-wise analysis demonstrated a profound age effect on most FIM items throughout the age interval covered. Conclusions. The item-wise analysis meets requirements of proper data analysis, avoids concealing diversity in rehabilitation outcome behind the total FIM score, and provides a flexible, informative, and clinically relevant data analysis.
Introduction
Several studies have demonstrated that age is a strong predictor of mortality in acute care and for long-term morbidity after traumatic brain injury (TBI).1-5 Furthermore, multivariable analysis has identified clinical severity at admission, as measured by the Glasgow coma score (GCS), hypoxia, and hypotension, as factors of importance for predicting acute outcomes in individual TBI patients. 4 For patients who survive and proceed to rehabilitation, age may influence inpatient rehabilitation, and Cifu et al 1 found that older persons (>55 years) averaged a significantly longer rehabilitation length of stay, higher total rehabilitation charges, and a lower rate of change on the Functional Independence Measure (FIM). However, there is conflicting evidence about the impact of age on functional outcome during rehabilitation. Reeder et al 6 found in 365 TBI patients that age (range = 15-86 years; mean = 34 years) had no predictive value for functional outcome of inpatient rehabilitation after controlling for injury etiology, injury severity, and demographic factors. Functional outcome was measured by FIM and the Disability Rating Scale (DRS). In contrast, Graham et al 2 found in a large prospective case series (n = 18 413) that higher age (study population 65-95 years old) was associated with shorter lengths of stay, lower discharge FIM scores, and higher odds of home health services at discharge. Age also seems to influence the long-term outcome of rehabilitation after TBI. Marquez de la Plata et al 7 found in a prospective cohort study that age (3 age groups 16-26, 27-39, and 40+ years) had a negative influence on functional outcome (FIM and DRS) 5 years post-TBI.
Possible explanations for the differences in results regarding age could be variability in outcome measures, selection bias, study populations, and confounders but also differences in data analysis methods. For example, Cifu et al 1 and Graham et al 2 considered quite different study populations with respect to age (mean 50 vs 79 years), and whereas Graham et al 2 studied all patients admitted for rehabilitation after TBI, Cifu et al 1 considered a sample where each elderly patient (>55 years) was matched with a younger patient on injury severity (number of days in coma, category of GCS, and abnormality of intracranial pressure) and gender, thereby counteracting the natural selection of strong elderly TBI survivors. Moreover, the 2 studies treated age quite differently in the data analysis: whereas Graham et al 2 treated the effect of age as a trend over the age range of 65 to 95 years, Cifu et al 1 treated the effect of age as a difference between age groups ≤55 years and >55 years. The nonsignificant effect of age reported by Reeder et al 6 may also have been influenced by the data analysis. Reeder et al 6 analyzed each outcome measure by a prioritized multiple regression, where each explanatory variable entered the regression model in a prioritized order after forced entry of all variables with higher priority. In these analyses, age was prioritized as number 10, and hence had a relatively small chance of appearing as statistically significant, with only 365 patients.
In the present study, we focus on the role of data analysis when assessing the effect of age on rehabilitation outcome after TBI as measured by FIM. The FIM instrument is a widely used rehabilitation outcome measure covering several functional domains but typically being reported as a total score or subdivided into a motor and a cognitive subscore. However, previous studies utilizing Rasch analysis8,9 question several items in the FIM instrument, thereby indicating limitations in the use of the total score. Moreover, the level of measurement 10 of the total FIM score is unclear. To be amenable for nonparametric statistical rank analysis, it has to be at least an ordinal measurement, and for standard multiple regression analysis, it has to be a normally distributed interval measurement. The total FIM score is, however, the sum of 18 ordinal measurements and, therefore, only an ordinal measurement itself under the assumption of interchangeability of the 18 items. By interchangeability, we mean that improvement of one FIM item and a corresponding aggravation on any other FIM item implies no change in the overall functional independence of the patient; but this is often not the case for the FIM score instrument, as clinicians using the FIM instrument will know. Likewise, the total FIM score is not an interval measurement. Indeed, besides identifying misfitting items, the statistical purpose of performing Rasch analysis 11 of the FIM score is to transform the raw sum score into an interval value to facilitate statistical analysis, hence fulfilling the requirements of standard multiple regression analysis, but this is usually not done. Rather, the raw sum score is analyzed by nonparametric methods or even parametric methods, assuming that it follows a normal distribution.
We present an item-wise analysis that applies standard statistical methods to FIM data in a way that does not violate the level of measurement and does not require a Rasch transformation of the total FIM score into the interval level of measurement. It is important to note that the item-wise analysis also facilitates valid adjustment for the effect of confounding factors. Moreover, the item-wise analysis unfolds detailed information about functional independence contained in FIM data, which may enhance clinical appeal of the data analysis.
Methods
Participants
Data from 411 consecutive TBI patients admitted to Hammel Neurocentre (HNC), Denmark, in 1998-2011 were analyzed. Admission and discharge information were collected from medical records, and FIM scores were obtained during rehabilitation. Demographic details and clinical characteristics at admission and discharge are presented in Table 1.
Table 1. Demographic Details and Clinical Characteristics at Admission and Discharge in the 3 Age Groups.a
Abbreviations: ICU, intensive care unit; FIM, Functional Independence Measure; FIMin, total FIM score at admission; FIMout, total FIM score at discharge; ΔFIM, increase in total FIM score during rehabilitation; ΔFIMday, increase in total FIM score during rehabilitation per rehabilitation day; GCS, Glasgow Coma Scale; HS/MOD, highly specialized and intensive rehabilitation/moderate, less-specialized and less-intensive rehabilitation.
a GCS at admission to acute care was only available for 210 patients (No. GCS). Data are presented as number (%) for categorical variables and compared between age groups by Fisher’s exact test. The remaining variables are presented as median/mean (range) and compared between age groups by the nonparametric Kruskal-Wallis test.
HNC is a rehabilitation hospital treating patients with acquired brain injury and has a background population of 2.9 million individuals. Patients with severe acquired brain injury, including TBI, are referred for inpatient rehabilitation in continuation of stays at intensive care units (ICUs) or departments of neurology or neurosurgery. Depending on clinical severity at admission, 2 different admission types were offered: (1) a highly specialized and intensive rehabilitation, including physiotherapy and occupational therapy during all waking hours, for the severely affected patients and (2) a moderate, less-specialized and less-intensive rehabilitation, including physiotherapy and occupational therapy during daytime hours, for the moderately affected patients.
Rehabilitation was delivered as a multidisciplinary inpatient program based on a team approach developed in the framework of the International Classification of Function (ICF) of the World Health Organization. The multidisciplinary approach included nursing care, occupational therapy, physiotherapy, and speech therapy as well as evaluations by neuropsychologists and neurosurgeons, with a neurologist or rehabilitation physician serving as team leader.
Inclusion and Exclusion Criteria
The inclusion criteria were (1) age >18 and (2) admission to HNC for inpatient rehabilitation after TBI in the period January 1998 to June 2011. Exclusion criteria were (1) duration from injury onset to admission of more than 365 days; (2) discharge to a nursing home or community dwelling, that is, referral to HNC after discharge from hospital treatment; and (3) rehabilitation interrupted for more than 4 weeks (as a result of severe complications).
Outcome Measures
Clinical evaluation was obtained using the FIM instrument, which contains 18 items covering 6 domains of functioning: activities of daily living, sphincter control, transfers, locomotion, communication, and social cognition. The score on each item ranges from 1 (dependent on total assistance) to 7 (complete independence). Clinically, the grading on each item is further categorized into 3 levels: complete dependence (item score 1-2), modified dependence (item score 3-5), and independence (item score 6-7). Patients included in the present study were evaluated with the FIM instrument by certified and trained team members at admission, every fourth week, and at discharge. No follow-up data were available.
Statistical Analysis
The effect of age on rehabilitation outcome measured by FIM was analyzed in 2 fundamentally different ways: (1) traditional nonparametric and parametric statistical analyses of the total FIM and (2) the new item-wise analysis. In both cases, we used a 5% statistical significance level for testing the effect of age.
For the traditional analyses (1), patients were divided into 3 age groups (18-39, 40-64, and 65+ years), and rehabilitation outcome was quantified by the total FIM score at discharge (FIMout); the increase in total FIM score during rehabilitation (ΔFIM), that is, ΔFIM = FIMout − FIMin, where FIMin denotes the total FIM score at admission; and the increase in total FIM score during rehabilitation per rehabilitation day (ΔFIMday), that is, ΔFIMday = ΔFIM divided by the number of rehabilitation days. The 3 age groups were then compared on these outcome measures by the nonparametric Kruskal-Wallis test, which does not allow adjustment for confounding factors. Hence, multiple regression was considered for comparing FIMout between age groups, with adjustment for the clinical severity at admission (FIMin) and other confounding factors.
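As a minimal sketch of the three total-score outcome measures defined above, the following Python fragment computes ΔFIM and ΔFIMday for a single stay. The class name `Stay`, its field names, and the example values are hypothetical and chosen only for illustration; they are not taken from the study data.

```python
from dataclasses import dataclass

FIM_MIN, FIM_MAX = 18, 126  # 18 items, each scored 1-7


@dataclass
class Stay:
    """One rehabilitation stay (hypothetical container, not from the study)."""
    fim_in: int      # total FIM score at admission (FIMin)
    fim_out: int     # total FIM score at discharge (FIMout)
    rehab_days: int  # number of rehabilitation days

    def __post_init__(self):
        for total in (self.fim_in, self.fim_out):
            if not FIM_MIN <= total <= FIM_MAX:
                raise ValueError("total FIM score must lie in 18-126")
        if self.rehab_days <= 0:
            raise ValueError("rehab_days must be positive")


def delta_fim(stay: Stay) -> int:
    """Increase in total FIM score during rehabilitation."""
    return stay.fim_out - stay.fim_in


def delta_fim_per_day(stay: Stay) -> float:
    """Increase in total FIM score per rehabilitation day."""
    return delta_fim(stay) / stay.rehab_days


# Invented example values
example = Stay(fim_in=40, fim_out=100, rehab_days=120)
print(delta_fim(example), delta_fim_per_day(example))
```

Note that both derived measures inherit the measurement-level concerns of the total score itself, because they are simple arithmetic on FIMin and FIMout.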
Standard multiple regression requires that the dependent variable, FIMout, is an interval measurement and, in particular, that the residual variance is constant across all values of the independent variables and the model predictor. 12 Therefore, the assumption of variance homogeneity was investigated by plotting the estimated residuals (observed minus predicted FIMout) against the independent variable FIMin and model-predicted FIMout values to assess if the variability of the residuals was indeed independent of these quantities.
In the new item-wise analysis (2), we considered dichotomous outcome measures based on the 3 FIM item-wise grading levels previously mentioned in the Methods section (complete dependence, modified dependence, and independence). To enhance clinical appeal of the data analysis and further exploit the large amount of detailed information of functioning concealed behind the total FIM score of 18 items, we decided to adopt the item-wise dichotomous outcome measures defined as item-wise independence (IWI, item score 6 or 7) or item-wise dependence (IWD, item scores 1-5) as suggested by Sendroy-Terrill et al. 13 The IWI outcome measures can be analyzed by logistic regression, thus in particular facilitating adjustment for the effect of influential confounding factors. Overall, 3 item-wise logistic regressions of IWI at discharge were performed, with a stepwise increasing degree of correction for confounders: (1) unadjusted comparison of age groups, (2) comparison of age groups with adjustment for independence status at admission and other confounding factors, and (3) analysis of age as an interval level variable with adjustment for independence status at admission and other confounding factors. The unadjusted logistic regressions (1) were considered for comparison with the Kruskal-Wallis analysis of the total FIM score. In the adjusted item-wise logistic regressions (2) and (3), the following factors were considered: age and the natural logarithm of the number of days in ICU as continuous variables, admission type and gender as categorical variables, categorical variables representing the item-specific status at admission, and counting variables representing the status at admission of the other 17 items. The item-specific status at admission was categorized in order to adhere to the ordinal level of measurement of the 18 FIM items.
Specifically, we constructed for each item a categorical variable with categories “complete dependence” (item score = 1-2), “modified dependence” (item score = 3-5), and “independence” (item score = 6-7) representing the admission status of the item. Two additional counting variables representing the status of the other 17 items at admission were, respectively, the number of items with modified dependence and the number of items with independence at admission. The natural logarithm of the days in ICU was used to prevent extreme values from being overly influential. The fit of the item-wise logistic regression models was assessed by the Hosmer-Lemeshow goodness-of-fit tests, 14 the C index, 15 and the generalized coefficient of determination, 16 R2. The Nagelkerke R2 is a generalization of the usual R2 known from standard linear regression, 12 where it represents the percentage of the variation in the dependent variable that is explained by variation in the independent variable through the linear model. The C index can be interpreted as the probability that a patient discharged IWI has a higher model-predicted probability of being IWI at discharge than a patient discharged IWD. 17
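The construction of the item-wise outcome and the FIM-based admission covariates described above can be sketched as follows. This is an illustrative reading of the text, not the authors' code: the function names, the dictionary layout (item letter mapped to score), and the example scores are assumptions made for this sketch.

```python
ITEMS = "ABCDEFGHIJKLMNOPQR"  # the 18 FIM items, labelled A-R as in the figures


def admission_level(score: int) -> str:
    """Map an item score (1-7) to the 3 clinical grading levels."""
    if not 1 <= score <= 7:
        raise ValueError("FIM item scores range from 1 to 7")
    if score <= 2:
        return "complete dependence"
    if score <= 5:
        return "modified dependence"
    return "independence"


def is_iwi(score: int) -> bool:
    """Item-wise independence (IWI): item score 6 or 7; otherwise IWD."""
    return admission_level(score) == "independence"


def item_covariates(admission_scores: dict, item: str) -> dict:
    """FIM-based covariates for the regression of one item.

    Returns the item's own admission status plus two counts over the
    other 17 items (modified dependence and independence at admission).
    """
    if set(admission_scores) != set(ITEMS):
        raise ValueError("scores for all 18 items are required")
    other_levels = [admission_level(s)
                    for key, s in admission_scores.items() if key != item]
    return {
        "own_status": admission_level(admission_scores[item]),
        "n_other_modified": other_levels.count("modified dependence"),
        "n_other_independent": other_levels.count("independence"),
    }


# Invented admission profile: fully dependent except items A and B
scores = {letter: 1 for letter in ITEMS}
scores["A"] = 6
scores["B"] = 4
print(item_covariates(scores, "C"))
```

Only category membership and counts are used, so no arithmetic is performed on the ordinal item scores themselves, which is the point of the construction.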
An important difference between the traditional analysis (1) and the new item-wise analysis (2) was the larger risk of false significances (statistical type I errors) with the latter because each traditional test of an age effect was replaced by 18 item-wise tests. Therefore, for each test of an age effect, the number of statistical significances obtained by the item-wise analysis was compared with the associated expected number of false significances (5% of 18—ie, 0.9).
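The expected number of false significances can be checked with a few lines of Python. The sketch below additionally computes how likely a given count of false significances would be if the 18 item-wise tests were independent; that independence is an assumption introduced here for illustration only (tests on the same patients are likely correlated), and it is not part of the comparison described above.

```python
from math import comb


def expected_false_significances(n_tests: int = 18, alpha: float = 0.05) -> float:
    """Expected number of type I errors when no true effect exists."""
    return n_tests * alpha


def prob_at_least(k: int, n_tests: int = 18, alpha: float = 0.05) -> float:
    """P(at least k false significances), ASSUMING independent tests.

    Binomial tail probability; the independence assumption is illustrative.
    """
    return sum(
        comb(n_tests, j) * alpha**j * (1 - alpha) ** (n_tests - j)
        for j in range(k, n_tests + 1)
    )


print(expected_false_significances())  # 5% of 18 tests
print(prob_at_least(2))                # chance of 2 or more by chance alone
print(prob_at_least(9))                # chance of 9 or more by chance alone
```

Under this simplified model, 2 significant items would not be unusual, whereas 9 or more would be very unlikely to arise from type I errors alone.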
Results
The 3 age groups did not differ significantly on total FIM at admission or discharge, nor on rehabilitation outcome as measured by ΔFIM and ΔFIMday (Table 1; Figure 1).

Figure 1. Total FIM scores at admission (FIMin) and discharge (FIMout) in the 3 age groups.
Moreover, they did not differ significantly in terms of gender, days in ICU, or rehabilitation days. Comparisons of admission type and GCS in acute care did, however, indicate some selection in the group of elderly people (65+ years old). The GCS at trauma admission was only available for 210 of 411 patients (51%) and biased in availability between age groups (P = .0210). The available GCS scores at trauma admission indicated a clinical selection in the group of elderly patients, with GCS in the 65+ age group being higher compared with that in the remaining patients (P < .0001)—namely, the 18 to 39 age group (P < .0001) and the 40 to 64 age group (P < .0001)—which indicated less-severe brain injuries and probably better prognosis among the elderly patients who were admitted for rehabilitation. The multiple regression analysis of FIMout against age group, FIMin, and other confounding factors (gender, days in ICU, and admission type) revealed severe violations of the assumption of variance homogeneity (Figure 2).

Figure 2. Observed minus predicted values of FIMout from the multiple regression analysis of FIMout against age group, FIMin, and confounding factors.
The residual variance in standard multiple regression is assumed to be independent of all independent variables but clearly decreased with increasing FIMin and predicted FIMout (Figure 2). The reason for this variance heterogeneity is essentially that the sample space of FIMout (the possible values) decreases with increasing FIMin (Figure 1), which is not modeled by standard multiple regression. This issue can also be observed in the predicted values of FIMout for large values of FIMin. The maximal value of FIMout is 126, but the model predicted values of FIMout exceed this maximum for large values of FIMin (Figure 2; 5% of the predicted FIMout values are larger than 126). Hence, the assumptions of the multiple regression analysis are compromised, and the results are, therefore, invalid and not reported here.
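The ceiling mechanism behind this variance heterogeneity can be illustrated with a deliberately simple toy model. The sketch assumes, purely for illustration, that each patient's potential gain is equally likely to be any integer from 0 to 60 points and that FIMout is capped at 126; neither the gain distribution nor the function name comes from the study.

```python
from statistics import pstdev

FIM_MAX = 126  # maximal total FIM score


def discharge_spread(fim_in: int, max_gain: int = 60) -> float:
    """Standard deviation of FIMout for a fixed FIMin in the toy model.

    Hypothetical model: potential gains 0..max_gain are equally likely,
    and the discharge score is truncated at the scale maximum.
    """
    outcomes = [min(FIM_MAX, fim_in + gain) for gain in range(max_gain + 1)]
    return pstdev(outcomes)


for fim_in in (30, 70, 110, 126):
    print(fim_in, round(discharge_spread(fim_in), 1))
```

In this toy model the spread of FIMout is unaffected by the ceiling at low FIMin and shrinks to zero as FIMin approaches 126, which is the qualitative pattern a constant-variance regression cannot represent.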
The unadjusted item-wise logistic regressions of IWI at discharge (see the Methods section) showed acceptable goodness of fit for all 18 items (Hosmer-Lemeshow P > .05 for all items) and a statistically significant difference between age groups for 2 items (“G. Bladder Management,” P = .0232 and “H. Bowel Management,” P = .0233; Figure 3). Post hoc comparison of age groups for these 2 items (Figure 3) showed that the primary difference was a better prognosis for the youngest patients compared with the oldest (“G. Bladder Management,” P = .0063; “H. Bowel Management,” P = .0071).

Figure 3. Item-wise between-age-group odds ratios for IWI at discharge. Two items exhibited a statistically significant effect of age (G. Bladder Management, P = .0232; H. Bowel Management, P = .0233).
However, the C index and the generalized coefficient of determination, R2, indicated considerable unexplained variation within the age groups,12,15 which may be further explained by including confounding factors in the analysis. Adjusting the item-wise logistic regressions of IWI at discharge for the effect of confounding factors (see the Methods section) maintained the acceptable goodness of fit for all 18 items and much improved the values of the C index and R2. Moreover, an additional 7 items showed statistically significant differences between age groups (Figure 4). These were 6 motor items (A. Eating, P = .0296; B. Grooming, P = .0109; C. Bathing, P = .0126; D. Dressing - Upper Body, P = .0397; E. Dressing - Lower Body, P = .0177; F. Toileting, P = .0030) and a single cognitive item (P. Social Interaction, P = .0453).
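For readers unfamiliar with the C index used in these comparisons, the following sketch computes it directly from its interpretation in the Methods section, as the proportion of (IWI, IWD) patient pairs in which the IWI patient has the higher predicted probability. Counting tied predictions as one half is a common convention that is assumed here; the function name and the example values are invented for illustration.

```python
def c_index(predicted, discharged_iwi):
    """Concordance (C) index for a binary outcome.

    predicted: model-predicted probabilities of IWI at discharge
    discharged_iwi: True/1 if the patient was discharged IWI, else False/0
    Ties in predicted probability count as 0.5 (assumed convention).
    """
    iwi = [p for p, y in zip(predicted, discharged_iwi) if y]
    iwd = [p for p, y in zip(predicted, discharged_iwi) if not y]
    if not iwi or not iwd:
        raise ValueError("both IWI and IWD patients are required")
    concordant = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in iwi
        for q in iwd
    )
    return concordant / (len(iwi) * len(iwd))


# Invented predictions for 5 patients, 3 of whom were discharged IWI
print(c_index([0.9, 0.7, 0.4, 0.6, 0.2], [1, 1, 1, 0, 0]))
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect separation, so an increase after adjustment indicates that the added covariates explain part of the within-group heterogeneity.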

Figure 4. Item-wise between-age-group odds ratios for IWI at discharge from the adjusted item-wise logistic regressions; 9 items exhibited a statistically significant effect of age. These were 8 motor items (A. Eating, P = .0296; B. Grooming, P = .0109; C. Bathing, P = .0126; D. Dressing - Upper Body, P = .0397; E. Dressing - Lower Body, P = .0177; F. Toileting, P = .0030; G. Bladder Management, P = .0232; H. Bowel Management, P = .0233) and a single cognitive item (P. Social Interaction, P = .0453).
Overall, the post hoc comparisons of age groups for the 9 statistically significant items (Figure 4) suggested that the primary difference between the age groups was a better prognosis for the youngest patients. Although the adjusted analysis explains much of the within-age-group heterogeneity by means of confounding factors, patients were still age heterogeneous within age groups. This could be resolved by using age as an interval level variable in the item-wise logistic regressions, thus allowing age effects within age groups to be expressed (see the Methods section). Age could be used as a linear covariate (diagnostics not reported here) in the item-wise logistic regressions, and the analyses showed acceptable goodness of fit for all 18 items and similar values of the C index and R2. Testing the effect of age was now no longer a comparison of 3 groups but rather a test for an age trend throughout the covered age range. Nevertheless, a statistically significant effect of age was obtained for the same 9 items as above as well as 3 additional motor items (I. Bed, Chair, Wheelchair, P = .0116; J. Toilet, P = .0072; K. Tub, Shower, P = .0096) and 2 additional cognitive items (N. Comprehension, P = .0377; R. Memory, P = .0179). The analyses suggested a considerable effect of age within the age groups (Figure 5).

Figure 5. Item-wise odds ratios for IWI at discharge per 10 years of age (younger) derived from adjusted item-wise logistic regressions with age as an interval-level explanatory variable. Age was statistically significant for all but 2 motor (L. Walk/Wheelchair; M. Stairs) and 2 cognitive items (O. Expression; Q. Problem Solving).
For items with statistically significant effect of age, the medians across items of the odds ratio corresponding to the maximal age difference within each of the former age groups—that is, 21 years for the youngest, 24 years for the middle-aged, and 15 years for the oldest patients—were, respectively, 1.62 (range = 1.39-2.17), 1.74 (range = 1.45-2.42), and 1.41 (range = 1.26-1.74). For instance, within the group of middle-aged patients, the youngest (40 years old) had 45% to 142% higher odds for IWI at discharge relative to the oldest patients (64 years old) in the group.
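Because age enters the logistic regression linearly on the log-odds scale, an odds ratio per 10 years can be rescaled to any other age difference by exponentiation. The sketch below shows this rescaling; the per-10-year value is hypothetical and was back-calculated here so that the 24-year odds ratio equals 1.74, and the function name is an assumption of this example.

```python
def odds_ratio_for_age_difference(or_per_10_years: float, years: float) -> float:
    """Rescale an odds ratio per 10 years (younger) to another age difference.

    With age as a linear covariate, log-odds change proportionally with the
    age difference, so OR(d years) = OR(10 years) ** (d / 10).
    """
    if or_per_10_years <= 0:
        raise ValueError("odds ratios must be positive")
    return or_per_10_years ** (years / 10)


# Hypothetical per-10-year odds ratio chosen so that 24 years gives 1.74
or_10 = 1.74 ** (10 / 24)

for span in (21, 24, 15):  # maximal age differences within the 3 age groups
    print(span, round(odds_ratio_for_age_difference(or_10, span), 2))
```

The same relation explains why the oldest group, which spans the fewest years, shows the smallest within-group odds ratio even though the per-10-year effect is constant across the age range.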
Discussion
The Effect of Age
In the present study, the effect of age on rehabilitation outcome after TBI was analyzed using different methodologies. In the traditional analysis, patients were grouped according to age and compared between groups with changes in the total FIM score as the outcome measure. This analysis showed no statistical difference in rehabilitation outcome between age groups. However, data indicated that patients in the 65+ age group were subject to selection because they had a significantly higher GCS in acute care and a different distribution of admission types. The selection could reflect prehospital or acute clinical decisions regarding elderly patients with severe TBI and low GCS. It could also reflect the fact that more patients in the 65+ population succumb to a very severe TBI compared with younger patients, as reported by others. 4 Furthermore, the statistical power is influenced by the fact that the 65+ group is considerably smaller than the other age groups. Nevertheless, application of the traditional methodology demonstrates that TBI patients above the age of 65 years also benefit from highly specialized rehabilitation, and it fails to demonstrate inferiority with regard to the younger age groups.
In the new item-wise methodology, the concealed information from the individual FIM items is revealed, enabling a more detailed statistical analysis and potentially an enhancement of the clinical application of the FIM score. When using the item-wise methodology, the negative impact on rehabilitation outcome of increasing age is profound.
The Role of Data Analysis
The classical analysis of the total FIM score by nonparametric methods assumes that it is an ordinal measurement, which is questionable because it implies the assumption of interchangeability of the 18 items and thereby concealed diversity. For instance, a simultaneous change in score from 7 to 1 on item “H. Bowel Management” and conversely from 1 to 7 on item “M. Stairs” (or any other item) implies no change on the total FIM scale. Most total FIM scores actually represent a huge number of very different configurations on the 18 items. More than 10^15 (a million billion) different configurations of the 18 FIM item scores are represented by only 109 possible values of the total FIM score. For example, a total FIM score of 18 implies an item score of 1 on all 18 items, but a total FIM score of 19 can be obtained in 18 different ways; many higher total FIM scores each represent millions of different configurations. Patient shifts between these equivalent configurations are nondetectable by the total FIM score and, therefore, also in the traditional nonparametric analysis of rehabilitation outcome measures based on the total FIM score. Rasch analysis assumes the same. It may transform the total FIM score into an interval measurement, but the analysis assumes that the total FIM score is sufficient for participants’ functional independence 11; that is, those with the same total FIM score will receive the same interval measurement of functional independence.
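The counting argument above can be verified with a short dynamic-programming sketch that tallies, for every possible total FIM score, how many configurations of the 18 item scores produce it. The function name is hypothetical; the only inputs are the facts stated in the text (18 items, each scored 1 to 7).

```python
def total_score_counts(n_items: int = 18, low: int = 1, high: int = 7) -> dict:
    """Number of item-score configurations behind each total score.

    Builds the distribution one item at a time: every way of reaching a
    partial total can be extended by any score from low to high.
    """
    counts = {0: 1}  # one way to have a total of 0 with no items
    for _ in range(n_items):
        extended = {}
        for total, ways in counts.items():
            for score in range(low, high + 1):
                extended[total + score] = extended.get(total + score, 0) + ways
        counts = extended
    return counts


counts = total_score_counts()
print(len(counts))           # number of distinct total scores (18-126)
print(sum(counts.values()))  # number of configurations, 7 ** 18
print(counts[18], counts[19])
```

The totals near the middle of the scale are each shared by the largest numbers of configurations, so the information lost by summing is greatest exactly where most patients are scored.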
Assuming that the total FIM score is an ordinal measurement, the Kruskal-Wallis comparison of the 3 age groups indicated no effect of age on rehabilitation outcome. This could be explained by selection of the admitted patients or perhaps by concealed diversity if age has an effect on rehabilitation outcome for some items but not for others, but it can also be explained by large differences between rehabilitation outcomes within age groups. Such within-group differences are usually dealt with in the statistical analysis by adjusting the comparison of age groups for the effect of confounding factors, for example, using a multiple regression analysis. A standard multiple regression analysis of FIMout requires additional assumptions about the total FIM score; that is, it must be an interval measurement and normally distributed. Achieving this is the fundamental purpose of the Rasch analysis, but the transformation is usually not performed; rather, the untransformed total FIM score is analyzed by standard multiple regression. Even if this violation of the statistical assumptions of standard multiple regression is accepted, the analysis still severely violates another fundamental assumption, namely, that of constant residual variance across all values of the independent variables (Figure 2). This is a result of the fact that the total FIM score has a limited sample space (integer values 18-126), implying that the range of attainable FIMout values narrows with increasing values of FIMin because most patients improve during rehabilitation (Figure 1). Hence, standard multiple regression analysis is invalid for the analysis of the total FIM score, and an alternative multiple regression analysis is needed. Moreover, a new analysis should avoid the assumption of interchangeability of the 18 FIM items and the implied concealed diversity.
The introduced item-wise analysis is a suggestion for such an analysis. By construction of both the IWI status and the FIM-based confounders, it respects the level of measurement of the FIM methodology, and it is a multiple regression technique that facilitates adjustment for the effect of confounding factors. Moreover, by analyzing the FIM scores item-wise, it
(1) avoids the problem known from Rasch analysis 11 of the total FIM score that misfitting items compromise the analysis of the other items;
(2) avoids the concealed diversity that arises with the total FIM score; and
(3) enhances clinical appeal of the analysis because information is provided separately for each item.
The second and third issues above could first be observed in the unadjusted item-wise analysis (Figure 3). Overall, the unadjusted item-wise analysis did not detect much effect of age (only 2 items with a statistically significant effect of age, which is only slightly in excess of the expected number of false significances of 0.9) but was able to answer independently for the 18 items, suggesting that there might be an age effect on 2 items concerning bowel and bladder management. Evidently, it is possible that age has an effect on rehabilitation outcome on only some FIM items, and the item-wise analysis facilitates the expression of that. A high degree of heterogeneity between patients within age groups could be detected and quantified by the C index and R2, emphasizing the need for an analysis that can adjust the comparison for the effect of confounding factors. This was provided in the second item-wise analysis, where the comparison of age groups was adjusted for the effect of confounding factors (Figure 4). Including confounders could explain much of the difference between patients within groups regarding IWI at discharge, as expressed by the considerably improved values of the C index and R2, and the analysis facilitated a much clearer expression of the effect of age on the individual items compared with the unadjusted analysis (9 items with a statistically significant effect of age, which was clearly in excess of the expected number of false significances of 0.9). In the third and final item-wise analysis, which used patients’ ages rather than age groups (Figure 5) and is probably the most appropriate of the performed analyses, a strong effect of age on rehabilitation outcome could be demonstrated throughout the covered age range (18-80 years) and with statistical significance for 14 items, which was convincingly in excess of the expected 0.9 false significances.
The suggested item-wise analysis is a valid and flexible tool for the analysis of FIM scores and, potentially, other multiple item scores. Besides providing detailed information on each of the 18 items, the item-wise analysis provides information about the interplay between the FIM items by using all items as explanatory variables (confounders) in the analysis, which could be exploited further in a larger data set. Compared with traditional analyses based on the total FIM score, the item-wise analysis has an increased risk of false significances because of the larger number of statistical tests. Item-wise testing of an effect on all 18 FIM items implies that approximately 1 false significance can be expected, which must be considered in the interpretation of results. With larger data sets, additional dichotomous outcome measures could be analyzed, for example, by combining information from more than 1 item. Moreover, rather than using dichotomous outcome measures (IWI or IWD), the complete ordinal item-wise FIM scale could be used (proportional odds regression 12 ), and more detailed cross-item information could be included in the explanatory variables.
Footnotes
Acknowledgements
We thank the clinical staff at HNC for collecting data during 1998-2011.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
