Abstract

The papers by Meares, Stevenson and Comerford in the Journal [1,2] together with the associated comments [3,4] address an important topic: effective care for borderline patients.
The evaluation of the effectiveness of care requires rigour in the design, conduct and reporting of studies. My concerns are about the methodology and reports of the studies presented in the Journal [1,2].
The most important question to ask of such an analytic study is ‘Does the study reject the null hypothesis?’, namely that there is no difference in outcome between active treatment (long-term psychotherapy) and the alternative. Is the methodology sufficiently robust to conclude that the reported treatment could be used with my patients with similar effectiveness and achieve the same result?
The gold standard for therapeutic trials is the randomised, controlled trial; none was done. In its absence, how did the authors guard against possible biases when an analytic controlled trial was undertaken? The authors said that random assignment to a waiting list was not possible.
Waiting lists exist. Is it ethical to offer an efficacious treatment if it cannot be provided and instead ask patients to remain on a 12-month waiting list? More importantly, the trial is attempting to prove that the treatment offered is efficacious. Others might argue that to do less than an optimal study is not ethical. Still, if the authors knew that a waiting list would exist, and that some might remain on it, then at least random allocation to a waiting list might be fairer than the alternative used. However, the study suggests another problem: the actively treated group and the control group appear not to have been contemporaneous. If they were not, and the control group was referred to the clinic at a later time than those offered active treatment, were the cohorts similar at baseline?
The authors show that they were different; their mean DSM scores differed. When randomisation cannot be done, authors reporting results usually attempt to show in the analytic phase that the groups would have had similar trajectories in the natural history of their borderline personality disorder (BPD). Apart from the fact that the groups met the criteria for diagnosis, how may they have differed qualitatively? For example, was their degree of self-harming behaviour similar? Baseline data such as age, sex and marital status were said to be no different. However, twice as many in the waiting list group were employed, and stability at work and in relationships may be important prognostically. What the authors have not presented are the differences, or their absence, in prognostic factors at baseline that may have affected patient outcome.
It is also possible that there was significant measurement bias. The authors state that a psychiatrist and a research psychologist initially assessed the psychotherapy group; was the control group similarly measured by the same team? I believe that classification systems using a 27-point scale may be reliable. The authors, however, present no evidence for their scale's reliability and reproducibility. Significantly, no mention is made of whether assessors were blind to patients' membership of the treatment or control group when baseline and follow-up measurements were done. Could the researchers' knowledge of, and involvement in, the treatment program have affected their objectivity? Might the patients treated enthusiastically by the trainees not want to disappoint their doctors (assessors)?
In randomised, controlled trials, outcome analysis is on an intention-to-treat basis, lest losses after randomisation systematically bias the outcome in favour of treatment. In the authors' papers only 30 of 48 (63%) selected to receive this form of psychotherapy were included in the analysis. Of the remaining 18, eight dropped out, seven continued treatment and three were lost to follow-up. If it is assumed that all 18 remained borderline, then only 19% of those offered such therapy had lost their diagnostic category. At worst, efficacy was thus halved; at best, it would reduce the clinical relevance of treatment. In this study, their technique would appear not to be useful for up to a third of subjects.
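The intention-to-treat arithmetic above can be sketched in a few lines of Python. The counts 48, 30, 8, 7 and 3 are taken from the paragraph. The count of nine patients who lost the diagnosis is an assumption, chosen as the whole number closest to 19% of 48; it is not stated in this letter. The function and variable names are illustrative only.

```python
def response_rates(offered, analysed, responders):
    """Return (per-protocol rate, intention-to-treat rate) as fractions.

    Per-protocol divides by those analysed; intention-to-treat divides by
    everyone offered treatment, counting all non-analysed patients as
    treatment failures.
    """
    if not 0 <= responders <= analysed <= offered:
        raise ValueError("expected 0 <= responders <= analysed <= offered")
    return responders / analysed, responders / offered


OFFERED = 48             # selected to receive psychotherapy (from the letter)
ANALYSED = 30            # included in the analysis (from the letter)
DROPPED_OUT = 8          # from the letter
STILL_IN_TREATMENT = 7   # from the letter
LOST_TO_FOLLOW_UP = 3    # from the letter
ASSUMED_RESPONDERS = 9   # ASSUMPTION: inferred from "19%" of 48, not stated

# The three groups excluded from analysis should account for the shortfall.
not_analysed = DROPPED_OUT + STILL_IN_TREATMENT + LOST_TO_FOLLOW_UP
assert not_analysed == OFFERED - ANALYSED

per_protocol, intention_to_treat = response_rates(
    OFFERED, ANALYSED, ASSUMED_RESPONDERS
)
print(f"analysed:           {ANALYSED / OFFERED:.1%} of those offered therapy")
print(f"per-protocol rate:  {per_protocol:.0%}")
print(f"intention-to-treat: {intention_to_treat:.0%}")
```

Under that assumed count, the same number of responders is divided by 30 in the per-protocol view and by 48 in the intention-to-treat view, which is the whole of the difference between the two rates.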
From an analytical perspective, the choice of a multiple regression analysis raises the question, ‘Did the dependent variable (change in score), the predicted scores and the errors of prediction (the residuals) satisfy the assumptions of normality, linearity and homoscedasticity?’ If not, a non-parametric regression analysis would have been required. No information is provided on this point.
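As a rough illustration of what checking these assumptions involves, the sketch below fits a one-predictor least-squares line and computes two crude residual screens. It is not the authors' multiple regression. The data are synthetic and invented purely for illustration, and all function names are hypothetical. A real analysis would normally use formal tests (for example Shapiro-Wilk for normality or Breusch-Pagan for heteroscedasticity) together with residual plots.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    num = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    den = (sum((ai - ma) ** 2 for ai in a)
           * sum((bi - mb) ** 2 for bi in b)) ** 0.5
    return num / den


def residual_checks(x, y):
    """Return the residuals plus two crude assumption screens."""
    intercept, slope = fit_line(x, y)
    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    # Skewness near 0 is consistent with (not proof of) symmetric residuals.
    m = mean(residuals)
    m2 = mean([(r - m) ** 2 for r in residuals])
    m3 = mean([(r - m) ** 3 for r in residuals])
    skewness = m3 / m2 ** 1.5
    # If |residual| rises with the fitted value, the spread is not constant,
    # which points towards heteroscedasticity.
    spread_trend = pearson(fitted, [abs(r) for r in residuals])
    return {"residuals": residuals, "skewness": skewness,
            "spread_trend": spread_trend}


# SYNTHETIC data (an assumption, for illustration only): the noise grows
# with x, so the constant-variance assumption is deliberately violated.
x = [1, 2, 3, 4, 5, 6, 7, 8]
noise = [0.1, -0.1, 0.2, -0.2, 0.4, -0.4, 0.8, -0.8]
y = [xi + ei for xi, ei in zip(x, noise)]

checks = residual_checks(x, y)
print(f"skewness of residuals:   {checks['skewness']:.2f}")
print(f"|residual| vs fitted, r: {checks['spread_trend']:.2f}")
```

A strongly positive correlation between the absolute residuals and the fitted values, as in this constructed example, is the kind of pattern that would call the parametric analysis into question.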
In summary, I am unconvinced that the authors' choice of methods or presentation of data has proved their case, and I wish it were otherwise.
The second paper rather more straightforwardly looks at cost data. However, the conclusion reached by Allen [3] that ‘the… Group have admirably demonstrated an effective (and cost-effective) treatment’ is not shared by me. By the authors' own admission, their cohort consisted of two groups of utilisers: high and low. The majority, 17 (57%), were low utilisers. Their average cost was $3051 prior to therapy, reduced to $286 afterwards. In this group of patients the cost of treatment, once savings are deducted, was approximately $174 000; therapy did not save, but instead cost the taxpayer handsomely. It also means that the cost data in this sample are highly skewed. The savings in the high utiliser group needed to make up for this shortfall must have been $564 000. In a single year, then, a group of young BPD persons consumed a very large amount of public funding. Therefore, if the efficacy study results were true, a concentration of the Westmead group's resources on the high utilisers would reduce their workload, and thus the waiting list, by half and double their cost-effectiveness.
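The low-utiliser figures quoted above can be cross-checked with a short calculation. This is a sketch only: it assumes that $3051 and $286 are per-patient averages, that the approximate $174 000 is the group's therapy cost net of savings, and that the cohort is the 30 patients analysed in the first paper. The implied therapy cost is a back-calculation from those figures and does not appear in the letter. Variable names are illustrative.

```python
LOW_UTILISERS = 17            # from the letter
COHORT = 30                   # ASSUMPTION: the 30 analysed patients
COST_BEFORE = 3051            # average cost before therapy, assumed per patient
COST_AFTER = 286              # average cost after therapy, assumed per patient
NET_COST_LOW_GROUP = 174_000  # approximate net cost of treatment (from the letter)

# Share of the cohort classed as low utilisers (the letter quotes 57%).
low_share = LOW_UTILISERS / COHORT

# Savings generated by the low utilisers.
saving_per_patient = COST_BEFORE - COST_AFTER
group_saving = LOW_UTILISERS * saving_per_patient

# Back-calculated (NOT stated in the letter): the gross therapy cost that
# would leave a net cost of about $174 000 after deducting those savings.
implied_therapy_cost = NET_COST_LOW_GROUP + group_saving
implied_cost_per_patient = implied_therapy_cost / LOW_UTILISERS

print(f"low utilisers:             {low_share:.0%} of cohort")
print(f"saving per patient:        ${saving_per_patient:,}")
print(f"saving for the group:      ${group_saving:,}")
print(f"implied therapy cost:      ${implied_therapy_cost:,}")
print(f"implied cost per patient:  ${implied_cost_per_patient:,.0f}")
```

On these assumptions the savings among low utilisers are small relative to the cost of treating them, which is the point the paragraph makes about where the group's cost-effectiveness must come from.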
