General points
Give results quantitatively, preferably with 95% confidence intervals (CIs). Standard statistical tests need not be referenced, but the statistical software used should be. Statistical methods should be described in sufficient detail to permit the analyses to be reproduced.
Presentation points
Unnecessary precision (for example, more than 2 significant figures), particularly in tables, is best avoided. Rounded figures are easier to compare and extra decimal places are rarely necessary.
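As an illustration of the rounding described above, a small helper for rounding to a chosen number of significant figures can be sketched in Python (the function name and values are illustrative, not part of the guidance):

```python
import math

# Hypothetical helper: round a value to a given number of significant
# figures, as suggested for tables (2 significant figures here).
def round_sig(x, sig=2):
    if x == 0:
        return 0.0
    return round(x, sig - 1 - int(math.floor(math.log10(abs(x)))))

print(round_sig(123.456))   # 120.0
print(round_sig(0.04567))   # 0.046
```

Rounded figures such as these are easier to scan down a table column than values carried to four or five decimal places.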
Confidence intervals are preferred to P values, particularly for non-significant results. P values should be reported in full to 1 or 2 significant figures, unless results are highly significant, in which case they may be reported as P < 0.001. Describing P values only as >0.05 or NS (not significant) is best avoided.
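A 95% confidence interval for a mean can be sketched as follows (the data are hypothetical, and the normal-approximation critical value 1.96 is used; for small samples a t critical value would be more appropriate):

```python
import statistics

# Illustrative data only: a 95% CI for a mean using mean +/- 1.96 * SE.
data = [5.1, 4.8, 5.6, 5.0, 4.9, 5.3, 5.2, 4.7, 5.4, 5.0]
mean = statistics.mean(data)
se = statistics.stdev(data) / len(data) ** 0.5   # standard error of the mean
lo, hi = mean - 1.96 * se, mean + 1.96 * se
print(f"mean {mean:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

Reporting the interval in this form conveys both the estimate and its precision, which a bare P value does not.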
For Gaussian (normally) distributed data, report the mean and SD; for other data, report the median and the 10th-90th centiles. If data are transformed, the statistical analysis and the calculation of summary statistics should be carried out on the transformed data; where possible, the summary statistics should then be back-transformed to the original scale for presentation.
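The transform-then-back-transform approach can be sketched with a log transformation (the data are hypothetical); the back-transformed mean of the logs is the geometric mean:

```python
import math
import statistics

# Illustrative positively skewed data: analyse on the log scale,
# then back-transform the mean for presentation.
values = [1.2, 3.4, 2.1, 8.9, 4.4, 2.7]
logs = [math.log(v) for v in values]
log_mean = statistics.mean(logs)       # summary statistics computed on
log_sd = statistics.stdev(logs)        # the transformed scale
geometric_mean = math.exp(log_mean)    # back-transformed for presentation
print(f"geometric mean = {geometric_mean:.2f}")
```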
If different graphs are to be compared, it is helpful to plot them on the same scales where possible. Care should be taken in selecting which variable is plotted on which axis (the dependent variable on the vertical axis and the independent variable on the horizontal axis).
Analysis points
Fisher's exact test is recommended for 2 × 2 contingency tables when the expected number of events in any cell is less than five (the expected number of events in a cell is calculated as the row total multiplied by the column total, divided by the grand total of the table).
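The expected-count rule and the test itself can be sketched from first principles using the hypergeometric distribution (the table and its counts are illustrative; in practice a statistical package would be used and cited, as noted above):

```python
from math import comb

# Illustrative 2 x 2 table; one expected cell count falls below 5,
# so Fisher's exact test is appropriate.
table = [[3, 7],
         [12, 2]]
row = [sum(r) for r in table]
col = [table[0][j] + table[1][j] for j in range(2)]
n = sum(row)

# Expected count = row total * column total / grand total
expected = [[row[i] * col[j] / n for j in range(2)] for i in range(2)]

def hypergeom_p(a):
    # Probability of a table with top-left cell = a, margins fixed
    return comb(row[0], a) * comb(row[1], col[0] - a) / comb(n, col[0])

p_obs = hypergeom_p(table[0][0])
a_min = max(0, col[0] - row[1])
a_max = min(row[0], col[0])
# Two-sided P: sum the probabilities of all tables as or more extreme
p_value = sum(p for p in (hypergeom_p(a) for a in range(a_min, a_max + 1))
              if p <= p_obs + 1e-12)
print(expected)
print(round(p_value, 4))
```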
If there are repeated measurements from several different patients, the correlated nature of these data needs to be taken into account. Using the average measurement for each patient is often the simplest solution. Care is needed to prevent outliers from overly influencing the results.
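Collapsing to one average per patient can be sketched as follows (patient labels and readings are hypothetical); the analysis then proceeds on one value per patient rather than treating correlated readings as independent observations:

```python
import statistics

# Hypothetical repeated readings per patient (e.g. blood pressure)
readings = {
    "patient_A": [140, 138, 142],
    "patient_B": [150, 155],
    "patient_C": [130, 132, 128, 131],
}
# One summary value per patient: the effective sample size is 3, not 9
per_patient = {p: statistics.mean(v) for p, v in readings.items()}
print(per_patient)
```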
If two different methods of measuring the same thing are compared, the results would be expected to be highly correlated, so correlation coefficients are not sensitive to the degree of disagreement between the two methods. Plotting the differences between the two methods against the average of the two methods is more informative, and regression analysis of these measures can be used to assess the degree of agreement between the methods. 1
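The differences-against-averages approach can be sketched numerically (the paired measurements are hypothetical); the mean difference estimates the bias between methods, and mean ± 1.96 SD gives approximate 95% limits of agreement:

```python
import statistics

# Hypothetical paired measurements of the same quantity by two methods
method_a = [10.2, 11.5, 9.8, 12.1, 10.9, 11.0]
method_b = [10.0, 11.9, 9.5, 12.4, 11.2, 10.8]

diffs = [a - b for a, b in zip(method_a, method_b)]
means = [(a + b) / 2 for a, b in zip(method_a, method_b)]  # x-axis of the plot

bias = statistics.mean(diffs)             # systematic difference between methods
sd = statistics.stdev(diffs)
loa = (bias - 1.96 * sd, bias + 1.96 * sd)  # 95% limits of agreement
print(f"bias {bias:.3f}, limits of agreement {loa[0]:.3f} to {loa[1]:.3f}")
```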
Multiple comparisons can produce spuriously significant P values. Whenever possible, the primary comparison should be pre-specified. If this is not possible, first apply a single statistical test comparing all the groups, and examine the pairwise comparisons only if this overall test is significant; this helps to avoid spuriously significant P values.
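One simple guard against spurious significance from multiple pairwise comparisons is a Bonferroni adjustment, sketched below (the P values are illustrative):

```python
# Hypothetical P values from four pairwise comparisons
p_values = [0.04, 0.01, 0.20, 0.03]
m = len(p_values)

# Bonferroni: multiply each P value by the number of comparisons, capped at 1
adjusted = [min(1.0, p * m) for p in p_values]
significant = [p_adj < 0.05 for p_adj in adjusted]
print(adjusted)     # only the second comparison survives adjustment
print(significant)
```

Note that three of the four unadjusted P values fall below 0.05, but only one remains significant after allowing for the number of comparisons made.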
If the paper reports a diagnostic accuracy study, the standards for the reporting of diagnostic accuracy studies (STARD) statement 2 may be helpful; if it reports a clinical trial, the Consolidated Standards of Reporting Trials (CONSORT) statement 3 may be helpful.
Formal meta-analysis methods incorporating tests for heterogeneity are recommended when summarizing results from several studies.
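A minimal fixed-effect (inverse-variance) pooling with Cochran's Q as a heterogeneity test can be sketched as follows (the study estimates and standard errors are illustrative):

```python
import math

# Hypothetical effect estimates (e.g. log odds ratios) and their SEs
effects = [0.30, 0.45, 0.25, 0.55]
ses = [0.10, 0.15, 0.12, 0.20]

# Inverse-variance weights: more precise studies count for more
weights = [1 / se ** 2 for se in ses]
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
pooled_se = 1 / math.sqrt(sum(weights))

# Cochran's Q: weighted squared deviations from the pooled estimate
q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
df = len(effects) - 1
i_squared = max(0.0, (q - df) / q) * 100   # percent of variation from heterogeneity
print(f"pooled {pooled:.3f} (SE {pooled_se:.3f}), Q = {q:.2f}, I2 = {i_squared:.0f}%")
```

If Q is large relative to its degrees of freedom (high I²), a fixed-effect summary may be inappropriate and the sources of heterogeneity should be explored.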
