Abstract
Clinical research analyses must balance the desire to 'learn all that is learnable' from the database with the observation that sample-based data commonly lead to conclusions that are perfectly correct for the sample, but wholly incorrect for the population from which the data were drawn.
Investigators who defend exploratory analyses as reliable misuse important tools that have taken over three hundred years to develop. Statistical estimators in clinical trials function appropriately when they incorporate random data gathered in response to a fixed research question. Their predictive ability degrades rapidly when the selection of the research question is itself random, that is, left to the data. Operating like blind guides, these estimators mislead the medical community about what it would see in the population, based on sample observations. The result is a wavering research focus, leaping from one provocative but misleading finding to the next on the powerful waves of sampling error. Therefore, a primary purpose of the prospective design is to fix the research questions prospectively, thereby anchoring the analysis plan.
Prospective statements of the research questions and rejection of tempting data-based changes to the protocol preserve the best estimates of effect sizes, standard errors, confidence intervals and p-values. Embracing these principles promotes the prosecution of a successful research program, that is, the construction and protection of a research environment that permits an objective assessment of the therapy or exposure being studied. If there is any fixed star in the research constellation, it is that sample-based research must be hypothesis-driven and concordantly executed to have real meaning for both the scientific community and the patient populations that we serve.
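The claim that estimators degrade when the data choose the research question can be made concrete with a small simulation. The sketch below is not taken from the article. It assumes a hypothetical two-arm trial with no true treatment effect on any of 20 independent endpoints, unit-variance normal outcomes, and a two-sided z-test at the 0.05 level. The function name `simulate` and all parameter values are illustrative assumptions. It compares an endpoint fixed before the data are seen with the endpoint chosen afterward because it shows the largest apparent effect.

```python
import math
import random
import statistics


def simulate(n_trials=500, n_endpoints=20, n_per_arm=30, seed=0):
    """Compare a prespecified endpoint with a data-selected one.

    Assumption: the true treatment effect is zero on every endpoint,
    so any 'significant' result is a false positive.
    """
    rng = random.Random(seed)
    # Standard error of a difference in means with known unit variance.
    se = math.sqrt(2.0 / n_per_arm)
    z_crit = 1.96  # two-sided 0.05 level

    fixed_hits = 0
    selected_hits = 0
    fixed_abs = []
    selected_abs = []

    for _ in range(n_trials):
        diffs = []
        for _ in range(n_endpoints):
            treated = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
            control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
            diffs.append(statistics.fmean(treated) - statistics.fmean(control))

        # Prospective choice: the first endpoint, named before seeing data.
        fixed = diffs[0]
        # Data-driven choice: whichever endpoint looks most impressive.
        selected = max(diffs, key=abs)

        fixed_hits += abs(fixed) / se > z_crit
        selected_hits += abs(selected) / se > z_crit
        fixed_abs.append(abs(fixed))
        selected_abs.append(abs(selected))

    return {
        "fixed_false_positive_rate": fixed_hits / n_trials,
        "selected_false_positive_rate": selected_hits / n_trials,
        "fixed_mean_abs_effect": statistics.fmean(fixed_abs),
        "selected_mean_abs_effect": statistics.fmean(selected_abs),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, the chance that at least one of 20 independent endpoints crosses the 0.05 threshold is 1 - 0.95^20, about 64%, so the data-selected endpoint should show a false-positive rate near that value while the prespecified endpoint stays near 5%. The selected endpoint's estimated effect size is also inflated, although the same estimator and the same data are used in both cases. Only the timing of the question differs.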
