Abstract
Observational studies are the basis for much of our knowledge of veterinary pathology and are highly relevant to the daily practice of pathology. However, recommendations for conducting pathology-based observational studies are not readily available. In part 1 of this series, we offer advice on planning and conducting an observational study with examples from the veterinary pathology literature. Investigators should recognize the importance of creativity, insight, and innovation in devising studies that solve problems and fill important gaps in knowledge. Studies should focus on specific and testable hypotheses, questions, or objectives. The methodology is developed to support these goals. We consider the merits and limitations of different types of analytic and descriptive studies, as well as of prospective vs retrospective enrollment. Investigators should define clear inclusion and exclusion criteria and select adequate numbers of study subjects, including careful selection of the most appropriate controls. Studies of causality must consider the temporal relationships between variables and the advantages of measuring incident cases rather than prevalent cases. Investigators must consider unique aspects of studies based on archived laboratory case material and take particular care to consider and mitigate the potential for selection bias and information bias. We close by discussing approaches to adding value and impact to observational studies. Part 2 of the series focuses on methodology and validation of methods.
Keywords
Observational studies are the foundation for most of the current knowledge that veterinary pathologists apply to their daily practice. The published literature contains considerable advice on designing and reporting observational studies, including the recent STROBE-Vet guidelines. 31,39 However, these publications are oriented to epidemiology and often focus on studies of causation, whereas pathology-based studies more often investigate mechanisms or consequences of disease. Moreover, investigations based on archived laboratory case material have unique caveats and limitations that should be recognized in the early phases of study design.
Here, editors and editorial board members of Veterinary Pathology and our colleagues present the sequential steps in devising and conducting observational studies in veterinary pathology. We also provide examples from published articles for clarity. This article is not intended as a list of requirements to publish in Veterinary Pathology because application of these principles will depend on the study context. Instead, the article describes principles intended to stimulate thinking on effective study design.
This article—the first of a 2-part series—focuses on design and development of observational studies. We discuss devising the study, developing the rationale, and forming a specific hypothesis, question, or objective. Next, we consider the details of study design: choosing between descriptive and analytic studies, types of analytic studies, prospective vs retrospective enrollment, study design considerations that pertain to causal inferences, selection and numbers of subjects for the study, and issues of bias, confounding, and chance associations. Finally, we consider the need for careful critique of the study design and approaches to adding value and rigor. The second article of the series 8 addresses methodology and validation of methods.
We should clarify a few terms. Study subjects are the individuals being studied, such as the cases and controls. Studies of causal association measure an exposure and an outcome. The exposure (independent variable) is presumed to precede the outcome (dependent variable). Depending on the study design, the disease could be the exposure or the outcome. For example, a virus infection could be the exposure and pneumonia the outcome, or pneumonia could be the exposure and serum fibrinogen level the outcome.
Various study types, as defined in Fig. 1, can be considered when investigating the hypothesis that panleukopenia virus causes restrictive cardiomyopathy in cats. 29 Panleukopenia virus infection is the exposure, and development of restrictive cardiomyopathy is the outcome. In an experimental study, the exposure is manipulated: cats could be challenged with virus or saline control to determine the effect on development of restrictive cardiomyopathy. In contrast, an observational study would investigate a population of cats without controlling the exposure. Observational studies come in 2 flavors: descriptive and analytic. A descriptive study could report 1 or more cases of restrictive cardiomyopathy and indicate how many had evidence of panleukopenia virus infection. Or, a descriptive study could report on cats with natural panleukopenia virus infection, mentioning the number that had concurrent restrictive cardiomyopathy. In contrast, an analytic study compares 2 groups, such as reporting the frequency of panleukopenia virus infection in cats with restrictive cardiomyopathy and in cats without restrictive cardiomyopathy.

Figure 1. In an experimental study, the exposure (independent variable) is controlled and manipulated by the investigator. The 3 classic observational study designs differ in whether exposure or outcome defines how study subjects are selected. In cross-sectional studies, study subjects are selected without regard for either the exposure or the outcome, and the outcome and exposure are measured at the same time. In case-control studies, study subjects are selected based on the outcome, and the exposure is compared between groups with differing outcomes. In cohort studies, study subjects known to be free of the outcome are selected based on their exposure to the putative causal factor, then followed over time; development of the outcome is compared in study subjects with differing exposures. Examples of analytic studies are provided in Table 3. It is notable that comparisons of diseased and healthy animals (often termed cases and controls by veterinary pathologists) are case-control studies only if subjects are selected based on their disease status and compared with respect to their exposure to a putative causal factor.
Experimental studies sit proudly atop the hierarchy of evidence because exposures can be precisely controlled. But, let us not abandon our respect for observational studies! Observational studies investigate the very animals that make up pathologists’ routine caseloads and are therefore highly relevant to daily practice. Observational studies are essential when experimental studies are impossible or undesirable. They are often easier and less expensive to carry out because study subjects and data may already be available or more easily obtained, and they are well suited to the analysis of conditions that develop over a long period of time. Many risk factors or outcomes can be investigated simultaneously, including interactions among variables. Observational studies usually contribute an early foundation of knowledge before it becomes possible (if ever) to study the disease experimentally. Thus, observational studies are the most frequent type published within the pages of Veterinary Pathology (Fig. 2), so it is prudent to optimize the design of these studies, as we continue to welcome them as a key basis for knowledge in veterinary pathology.

Figure 2. Numbers of observational studies (analytic and descriptive) and experimental studies published in Veterinary Pathology. Most published articles are observational studies, and most of these are descriptive.
Devising an Observational Study
This earliest step in the study, choosing a topic, shapes its eventual impact. We suggest the following formula for devising observational studies that will have value:
• Identify important problems and gaps in knowledge, and work toward solutions for them.
• Have an innovative mind-set, being open to and actively searching for new possibilities.
• Consider observations that don’t fit with existing knowledge and what they might mean for alternative understanding.
• Consider alternative interpretations of existing observations and what might be done to evaluate differing explanations.
• Use the scientific method: observations, experiences, knowledge → clearly formulate a question or identify a problem → create a hypothesis → design and conduct an observational study → critically analyze the results, their inferences, and implications → (communicate findings) → refine questions/hypotheses and repeat.
• Apply novel methods to existing problems if they open new areas of investigation. Novel methods are not enough by themselves; they must lead to new and meaningful knowledge. But innovative methodologies can offer new ways of probing old problems, a key that opens a previously locked door.
Throughout this process, recognize the essential role of creativity. A study is dull and meaningless without the imaginative insights and ideas that have been termed the creative, aha, or eureka moments; the happy thought; or the art of discovery. When unexpected but seemingly valid results emerge, resist the tendency to force them into the mold of prior thinking. Exciting advances in knowledge are based on troublesome and unanticipated findings. Let the data speak.
Most studies take unexpected twists and turns as investigators encounter and overcome challenges and as surprising findings emerge. The initial plan will be modified accordingly: research is an iterative process that requires reflection and critical analysis at each stage of the study (Table 1).
Table 1. Questions to Revisit at Each Stage of the Study.
Creating the Hypothesis, Question, or Objective
The hypothesis, question, or objective is the central pillar of the study that determines the appropriate methodology and frames the anticipated findings (Fig. 3). In crafting the manuscript, the Introduction, Methods, Results, and Discussion are all built around the hypothesis or question. Studies with a strong hypothesis, question, or objective are likely to yield specific findings of interest and can be clearly presented to readers. Studies that are focused on applying a new method or those in which the hypothesis, question, or objective was developed as the manuscript was being written often lack clear findings of value and do not have a strong narrative.

Figure 3. Interrelationships of the various elements of study design. Studies are based on a clear, precisely worded, and specifically testable/answerable hypothesis, question, or objective. The hypothesis, question, or objective is supported by a clear rationale that identifies the problem or the gap in current knowledge. The study design and methods are developed to serve the hypothesis, question, or objectives of the study. The methods are expected to lead to an outcome that clearly confirms or refutes the hypothesis, answers the question, or fulfills the study objectives. In so doing, the anticipated results of the study fill the abovementioned gap in knowledge and thus address the rationale of the study.
The hypothesis, question, or objective must be precise and specific. The aim—if the study proceeds according to plan—is for the results to definitively confirm or refute the hypothesis, conclusively answer the question, or completely satisfy the objectives. The objectives need not be grandiose or world-changing but must be precisely achievable: vague or unattainable objectives are not of value as a solid basis for a study. Recent studies provide examples of effective, specific, and testable hypotheses: “the histologic diagnosis of pectinate ligament dysplasia (PLD) [does] not correlate with the gonioscopic diagnosis of PLD, and PLD cannot be diagnosed solely by routine histological examination in canine globes affected with chronic glaucoma,” 3 and “myocardial CPV-2 infection is…associated with cardiac damage in dogs less than 2 years old.” 16
Hypotheses must be specified before the study is conducted. If hypotheses are formed after observation of the data, then the study is merely exploratory, and testing the hypothesis in a new population of study subjects would be needed to confirm the hypothesis. When hypotheses are formed as the paper is being written, this simply fits the “hypothesis” to the observed data. This is the reverse sequence—the tail now wags the dog—and thus invalidates the merits of hypothesis testing.
The methodology is not part of the hypothesis, question, or objective. The methodology is subservient and developed subsequently (Fig. 3). Too often we think of cool methods and only later create a study objective, but this is the reverse of effective study design. Investigations that are not built upon specific objectives can become an exercise in data collection with the hope of discovering an unexpected association. This may yield interesting data but is highly exploratory, and a confirmatory study would be necessary to validate such an association. In the same way, studies that measure a myriad of parameters generate heaps of information but can become unfocused and lack statistical power to make valid inferences.
Descriptive vs Analytic Studies
What study design is most appropriate and practical to address the hypothesis, question, or objective of the study? Here, we consider the gritty details of study design: descriptive vs analytic studies, the merits of various types of analytic studies, retrospective vs prospective enrollment, the number of study subjects, validation of study subjects, considerations of causal inferences, and the thorny topics of bias, confounding, and chance associations.
Descriptive studies are sometimes dismissed as the poor cousins in the family of study designs, providing only weak evidence because unmeasured variables are not controlled and have an unknown impact on the findings. Furthermore, cases represented in laboratory archives are a highly selected population that may differ in important ways from those cases of the same disease that were never sampled. For instance, those dogs whose tumors were biopsied and subsequently archived may have a substantially different clinical outcome from those dogs whose owners did not pursue advanced diagnostic tests. Finally, the lack of a control group leaves readers wondering whether the observed findings might also be seen in some normal animals, particularly for species or tissues not often examined. Microscopic observations of normal anatomic structures in marine invertebrates, inclusion bodies in the ganglia of normal coatis, and the variety of age-related lesions in older animals provide examples of “background” findings that might be incorrectly attributed to a disease if controls were not also examined. 2,13,24,32 These issues are particularly pronounced for single-animal case reports, where the relationship between 2 findings might be explained by a host of unmeasured factors.
Despite these limitations, descriptive studies provide undeniable value to the daily practice of veterinary pathology. They focus on communicating objective factual observations, relatively free of inference. As keepers of the archive, pathologists have unique access to a nearly unlimited collection of laboratory samples. For some questions, descriptive studies may be the best approach. For example, in a descriptive cohort study, a single defined population of animals initially free of the outcome is followed over time to determine the incidence of a disease or an outcome of the disease. 38 Examples include the incidence of uterine decidual reaction in mice subjected to a superovulation protocol 34 and the incidence of recurrence after excision of feline epitheliotropic mastocytic conjunctivitis. 5 Finally, the process of marshalling these cases for a study may identify patterns and generate hypotheses not considered during the routine processing of case material. Much of our knowledge in veterinary pathology is rooted in descriptive studies, and some of our most-downloaded and most-cited articles are descriptive studies of new disease conditions. Veterinary pathologists should not be apologetic about the position that descriptive studies occupy on those evidence hierarchies that were designed for evaluating human medical treatments. 11
Analytic studies offer important advantages over descriptive studies because they formally compare results between 2 groups that differ with respect to the exposure or the outcome (Table 2). Descriptive studies have no control group, so it is impossible to determine if certain findings are true features of the disease or if they are alternatively due to an unrelated characteristic of the population or the method of acquiring the study subjects. When it is relevant to the study objectives, including a meaningful control group can add considerable value and impact to observational studies (Fig. 4). If the objective of your study is to “describe” or “characterize”, try changing it to “compare” for a more powerful study design.
Table 2. Some Analytic Observational Studies in Veterinary Pathology, 2016–2017.
Note that disease may represent either the exposure or the outcome, depending on whether the study investigates the causes or consequences of disease.

Figure 4. Citations and usage of observational studies (analytic and descriptive) and experimental studies published in Veterinary Pathology. The data show the number of citations (a) and number of downloads (b) per article based on year of publication (mean with 95% confidence interval). Analytic studies tend to be cited and downloaded more often than descriptive studies (*P < .05). Furthermore, analytic observational studies have numbers of downloads and citations similar to or higher than those of experimental studies, even though the latter are classically considered a more robust approach to knowledge discovery.
An overview of the classic types of observational studies is provided in Fig. 1 and detailed elsewhere. 14,37 The merits and limitations of different analytic study designs are outlined in Table 3.
Table 3. The Classic Analytic Observational Study Designs.
Prospective vs Retrospective Enrollment
Retrospective enrollment makes use of existing materials and data, which is easier, faster, and less expensive and generally allows increased numbers of study subjects for greater statistical power. Most studies published in Veterinary Pathology involve retrospective enrollment because veterinary pathologists have such easy access to marvelous archives of case material.
Conversely, prospective enrollment allows a standardized approach to sampling and analysis, and the scope of data collection is intentionally designed. Thus, prospective enrollment may avoid bias and reduce variability by minimizing unintentional differences among samples. Furthermore, prospective sampling may be necessary for specialized analyses, such as flow cytometry or analysis of gene expression. For these reasons, use of prospective studies is one of the main recommendations for improving studies in pathology and laboratory medicine. 28 However, prospective studies are far more costly and time-consuming, and it may be impossible to acquire a sufficient number of cases within a reasonable time frame. It is an unstudied marvel of biology how even common diseases seemingly disappear once a prospective study is under way.
Study Design and Causal Inferences
Observational studies that focus on causality or pathogenesis require particular attention to study design. In experimental studies, the subjects may be more uniform and there is controlled manipulation of the exposure (ie, the causative agent or the earlier event in the pathogenesis). In contrast, these factors are uncontrolled in observational studies, making it inherently difficult to show causality. When an observational study reveals an association between 2 factors, Hill’s criteria 21 (Table 4) provide a framework for considering whether the relationship is causal.
Table 4. Hill’s Criteria for Evaluating the Strength of Evidence That an Observed Association Is Causal. 21
The fourth of Hill’s criteria—the temporal relationship of cause and effect—can be problematic for studies using single biopsies or samples obtained after death. Specifically, it may be impossible to determine the causal sequence if the 2 variables are measured at a single point in time. For example, a landmark study 46 identified the association of equine multinodular pulmonary fibrosis (EMPF) and equine herpesvirus 5 (EHV-5) infection. However, case-control or cross-sectional study designs cannot confirm the sequence of causation: does EHV-5 infection cause EMPF, or does the abnormal tissue environment in EMPF favor infection with or replication of EHV-5? In this example, objective identification of the causal sequence was later supported by an experimental study 47 (Hill’s eighth criterion in Table 4) and by comparative studies (Hill’s ninth criterion); 45 a cohort study would be an alternative approach in other contexts.
Sometimes, the direction of causality is obvious. In a cross-sectional study of zebrafish that identified an association between the genetic mutation “smoothened” and the occurrence of endocardiosis, it is not plausible that endocardiosis caused the genetic mutation, but it is plausible that the mutation caused endocardiosis. 12 Similarly, the causal sequence is self-evident when death is the outcome, for example, that canine mammary carcinosarcoma confers a poor survival time compared to other types of mammary carcinoma. 33 In other studies, it might be reasonable—based on existing knowledge—to infer a causal sequence, for example, that systemic hypertension in cats with chronic renal failure led to vasa vasorum arteriopathy rather than the converse. 23 Nonetheless, the sequence of causality is not always clear in cross-sectional and case-control studies: pancreatic islets of diabetic cats more frequently contain T and B lymphocytes compared to pancreatic islets of control cats, but we cannot be sure if the lymphocytes are responding to the pathologic process in the islets or if they caused the loss of islet cells. 48
Longitudinal sampling of initially outcome-free animals in a cohort study (or exposure of animals known to be free of the disease, in an experimental study) may be needed to show that exposure precedes outcome. For example, the Golden Retriever Lifetime Study follows dogs that are initially cancer-free over their lifetime and is expected to identify risk factors for later development of 4 types of cancer. 19 Studies that make use of longitudinal sampling are rare in Veterinary Pathology.
Consider also if the study measures new occurrences of a disease (ie, incident cases) or existing cases in a population (ie, prevalent cases). For prevalent cases, it may be impossible to determine if the cause (the exposure) preceded development of disease (the outcome). Furthermore, since prevalence is a function of both incidence and duration of disease, case-control and cross-sectional study designs may not discern whether an exposure causes development of new cases or increased survival of existing cases. For example, consider a cross-sectional study with the valid observation of a higher prevalence of amyloidosis in captive compared to free-ranging island foxes. 17 It is plausible that factors related to captivity increase the likelihood that foxes develop amyloidosis, but an alternative explanation is that foxes with amyloidosis survive longer in captivity than in the wild. Cohort studies can be logistically difficult because of the need to identify animals initially free of the outcome and then follow them over time to determine development of the outcome. Nonetheless, cohort studies are considered a stronger study design than case-control and cross-sectional studies because they measure development of new cases rather than existing cases and confirm that the proposed cause preceded development of the outcome.
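The dependence of prevalence on both incidence and duration can be illustrated with a small numerical sketch. It assumes a steady-state population, in which the prevalence odds equal the incidence rate multiplied by the mean duration of disease. The function name and all numbers are hypothetical and are not taken from the island fox study or any other cited work.

```python
def steady_state_prevalence(incidence_rate, mean_duration):
    """Prevalence implied by an incidence rate and a mean disease duration.

    Assumption: steady-state population, where
    prevalence odds = incidence rate x mean duration.
    """
    if incidence_rate < 0 or mean_duration < 0:
        raise ValueError("incidence rate and duration must be non-negative")
    odds = incidence_rate * mean_duration
    return odds / (1 + odds)


# Hypothetical values (cases per animal-year, and years of survival with disease).
baseline = steady_state_prevalence(incidence_rate=0.02, mean_duration=1.0)
more_new_cases = steady_state_prevalence(incidence_rate=0.10, mean_duration=1.0)
longer_survival = steady_state_prevalence(incidence_rate=0.02, mean_duration=5.0)

print(f"baseline prevalence:          {baseline:.3f}")
print(f"higher incidence, same length: {more_new_cases:.3f}")
print(f"same incidence, longer course: {longer_survival:.3f}")
```

In this sketch, a 5-fold increase in incidence and a 5-fold increase in survival time produce the same rise in prevalence, so a survey of prevalent cases alone could not tell the two explanations apart.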
Selecting Study Subjects: Ethics and Permissions
All research involving live animals or samples obtained for the purpose of the study requires approval by an institutional animal care and use committee, which ensures that the study is conducted in accordance with relevant legislation. Permits may be required to obtain or possess samples obtained from threatened species or from free-ranging wildlife. Permission may be necessary to publish findings based on case material owned by other individuals or by an institution. Written informed consent is required if samples are obtained from client-owned animals for the purpose of the study. The situation is less consistent for studies conducted on archived laboratory materials sampled for the purpose of diagnosis. In many jurisdictions, these samples may be considered the property of the laboratory depending on agreements at the time of sample submission, and written informed owner consent is not required. However, these laws vary among jurisdictions and may change over time, and we expect this could become an emerging issue in the future.
Selecting Study Subjects: Unbiased Sampling, Effective Controls, and Inclusion and Exclusion Criteria
When selecting animals to include in the study, choose a contiguous series of subjects in each study group or a randomly selected subset. It would introduce considerable bias if we included only those cases that were the most interesting, had the most solid diagnosis, or were most memorable. This is an important critique of single-animal case reports—the reported cases are highly selected and thus may not be representative—but the situation is only improved in an analytic study if the subjects are appropriately selected. Many observational studies use all of the available cases, whereas our archives contain far more controls than are necessary for the study. How do we select which controls to include? In general, selection of a subset of study subjects from the larger population should be done by refining the inclusion and exclusion criteria or by a formal randomization method. Other approaches—purposive, convenience, or haphazard sampling—are likely to bias the outcome.
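A formal randomization method for choosing a subset of controls can be as simple as a seeded random draw from the list of all eligible accessions. The following is a minimal sketch under stated assumptions: the accession numbers, the number of controls, and the seed are all hypothetical, and the function name is ours.

```python
import random


def select_random_controls(eligible_ids, n_controls, seed):
    """Draw a reproducible simple random sample of control accessions."""
    eligible = sorted(set(eligible_ids))  # fixed order, duplicates removed
    if n_controls > len(eligible):
        raise ValueError("more controls requested than are eligible")
    rng = random.Random(seed)  # dedicated generator so the seed can be reported
    return sorted(rng.sample(eligible, n_controls))


# Hypothetical accession numbers for every archived case meeting the control criteria.
eligible_controls = [
    f"N{year}-{number:04d}" for year in (2012, 2013, 2014) for number in range(1, 201)
]

controls = select_random_controls(eligible_controls, n_controls=60, seed=20180101)
print(len(controls), "controls selected; first five:", controls[:5])
```

Recording the seed and the full list of eligible accessions lets the selection be repeated exactly, which is not possible with convenience or haphazard sampling.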
Selecting controls is key to the study design, not an afterthought. Choose controls that offer the best comparison to the population being studied, in the context of the study objectives. Often, the best controls are not normal individuals but ones with an alternative disease. For example, in a study using calretinin immunohistochemistry to identify the neural tracts affected by equine degenerative myeloencephalopathy, 2 groups of controls were included: normal horses to validate the use of calretinin immunohistochemistry for tracing neural tracts and horses with “other spinal disease” to show that calretinin-positive spheroids were unique to equine degenerative myeloencephalopathy and not found in other spinal diseases. 15 Similarly, in a study that determined the sensitivity and specificity of histologically visible cilia-adherent bacteria for diagnosis of Bordetella bronchiseptica pneumonia compared to the gold standard of bacterial culture, other forms of bacterial pneumonia were considered to be a more appropriate control instead of normal lung. 41 To measure the specificity of surfactant protein A for diagnosis of pulmonary carcinomas, 113 nonpulmonary neoplasms were used as controls. 4 Finally, unaffected marine invertebrates were important controls, to demonstrate that the histologic findings in those with either spontaneous or experimentally induced copper toxicosis were not simply normal findings in these little-studied species. 24 Choosing the most appropriate controls is a fundamental basis for any analytic study and is completely dependent on the details of the hypothesis, question, or objective of the study.
Inclusion and exclusion criteria must be defined for both study groups, that is, for the cases as well as the controls. Inclusion and exclusion criteria are a precisely detailed description of how study subjects were selected from the population and the reasons that some subjects were omitted from the study. The importance of clear inclusion and exclusion criteria is not simply to allow replication of the experimental approach. More important, these criteria allow readers (and indeed investigators) to understand potential sources of selection bias that could influence the study outcomes. Effective descriptions of inclusion and exclusion criteria read as poetry to discerning journal editors:
“A search of the archives between June 2007 and November 2014 was performed [ie, the method of selection of a contiguous series of cases and controls], and cases limited to cats at least 1 year of age were identified using the keywords feline or cat and endomyocardial fibrosis, endocardial fibrosis, endocardial scar, endomyocarditis, or restrictive cardiomyopathy [ie, the inclusion criteria for cases]. We excluded cases having keywords hypertrophic and dilated [ie, the exclusion criteria for cases]. Control cases were identified using keywords describing acute trauma, neoplasia, or other noncardiac causes of sudden death [ie, the inclusion criteria for controls]. A similar age distribution of control cases was selected from the same time period and source [ie, the method of matching controls and cases].” 29
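Criteria stated this precisely can be applied to an archive search mechanically, with the reason for each omission recorded. The sketch below encodes the case criteria quoted above; it is not the method used in the cited study. The record fields, the exact start and end days of the search window, and the example records are assumptions made for illustration.

```python
from datetime import date

CASE_KEYWORDS = (
    "endomyocardial fibrosis",
    "endocardial fibrosis",
    "endocardial scar",
    "endomyocarditis",
    "restrictive cardiomyopathy",
)
EXCLUSION_KEYWORDS = ("hypertrophic", "dilated")
# The source gives only months; the first and last days are assumed here.
SEARCH_START, SEARCH_END = date(2007, 6, 1), date(2014, 11, 30)


def screen_case(record):
    """Return (included, reason) for one archive record (a hypothetical dict)."""
    text = record["diagnosis"].lower()
    if record["species"].lower() not in ("feline", "cat"):
        return False, "species is not feline"
    if not SEARCH_START <= record["accession_date"] <= SEARCH_END:
        return False, "outside search period"
    if record["age_years"] < 1:
        return False, "younger than 1 year"
    if not any(keyword in text for keyword in CASE_KEYWORDS):
        return False, "no inclusion keyword"
    found = [keyword for keyword in EXCLUSION_KEYWORDS if keyword in text]
    if found:
        return False, "exclusion keyword: " + ", ".join(found)
    return True, "meets all criteria"


# Hypothetical archive records.
records = [
    {"species": "Feline", "age_years": 6, "accession_date": date(2010, 3, 14),
     "diagnosis": "Endomyocardial fibrosis, left ventricle"},
    {"species": "Cat", "age_years": 9, "accession_date": date(2012, 8, 2),
     "diagnosis": "Hypertrophic cardiomyopathy with endocardial scar"},
    {"species": "Feline", "age_years": 0.5, "accession_date": date(2013, 1, 20),
     "diagnosis": "Restrictive cardiomyopathy"},
]
results = [screen_case(record) for record in records]
for record, (included, reason) in zip(records, results):
    print(record["diagnosis"], "->", "include" if included else "omit", f"({reason})")
```

Keeping the reason for every omitted record makes it straightforward to report how many subjects were excluded at each step and to revise the criteria if the first pass reveals problems.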
Selecting Study Subjects: Unique Aspects of Archived Laboratory Material
Consider the target population (eg, all dogs with lymphoma), the source population from which samples were drawn (all dogs with lymphoma samples in the laboratory archive) and the study population (the dogs entered into the study because they meet the inclusion and exclusion criteria), and how these populations might differ. For example, animals represented in laboratory archives may be more likely to have had a higher level of veterinary care, been treated with antibiotics, be affected by serious disease, and be affected by risk factors for other diseases. How will these factors affect the findings and the external validity of the study—the relevance of the findings to the general population of interest?
The external validity of the study represents a balance between 2 opposing approaches. The first is to focus on a narrowly defined study population in order to minimize variability of the data and avoid systematic differences between study groups. A pitfall with this approach is that the results may not be relevant to the general population. The second approach is to use a sufficiently diverse study population that the results are applicable to other populations of animals and to routine practice. However, this may increase the variability of the results.
Both study groups should be sampled from the same population, but this is troublesome for laboratory-based studies where the archived material is of diverse and ill-defined provenance. The detailed life circumstances of these animals are usually unknown and not often considered when selecting study subjects, particularly for the controls. Thus, there is considerable risk that study groups will differ with respect to unmeasured variables such as those shown in Table 5.
Table 5. Factors to Consider When Evaluating the Suitability of Control or Comparison Groups.
Comparison groups should be similar, except for the factor of interest. Other factors that differ between groups may cause bias or confounding if their frequency or distribution is not similar between study groups and they are not accounted for by analysis.
Uneven distribution of these variables between the different study groups can introduce bias or confounding. This problem—the possibility that clustering of unmeasured variables might create the false appearance of an association between the exposure and outcome being studied—is perhaps the major limitation of observational analytic studies based on archived laboratory samples. Bias and confounding are considered in more detail below.
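A small numerical sketch shows how an unevenly distributed variable can create the false appearance of an association. The counts are invented for illustration: within each submission source the exposure is unrelated to the outcome (odds ratio of 1), yet pooling the two sources yields an apparent association, because source is linked to both the exposure and the likelihood of being a case.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio from the four cells of a 2x2 case-control table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Hypothetical counts per submission source:
# (exposed cases, unexposed cases, exposed controls, unexposed controls)
strata = {
    "referral hospital": (80, 20, 40, 10),
    "general practice": (10, 40, 20, 80),
}

stratum_ors = {source: odds_ratio(*counts) for source, counts in strata.items()}

# Pool the strata cell by cell, ignoring the submission source.
crude_counts = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*crude_counts)

for source, value in stratum_ors.items():
    print(f"{source}: odds ratio = {value:.2f}")
print(f"pooled (source ignored): odds ratio = {crude_or:.2f}")
```

If the submission source had not been recorded, only the pooled odds ratio would be available and the spurious association could not be recognized, which is the situation with unmeasured variables in archived material.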
When working with archival samples, the process of selecting study subjects is often iterative. Reviewing the details of the initially selected cases and controls usually identifies problems, and it is typical to revise and clarify the inclusion and exclusion criteria, then restart the selection process. Repeating this process is tedious, but it is far better to solidify the study population at the beginning than to make changes after collecting the data.
Selecting Study Subjects: Numbers of Study Subjects
A formal sample size calculation should be conducted before carrying out the study, to determine the number of study subjects required to detect a statistically significant difference between study groups. Online tools are available (eg, Statulator, http://statulator.com/SampleSize/ss1P.html; StatCalc-EpiInfo, https://www.cdc.gov/epiinfo/index.html). If the outcome of interest is a proportion (binary scale), the calculation requires desired values for the level of confidence (typically 0.95) and statistical power (typically 0.8), as well as an estimate of the effect size. For binary variables, the effect size can be the odds ratio or risk ratio that the investigator considers to be meaningful, and this is estimated based on the anticipated proportion with the outcome in the exposure-positive and exposure-negative groups. If the outcome of interest is measured on a continuous scale, the calculation requires that investigators estimate a meaningful difference in the outcome between the exposure groups, as well as the estimated variability in the outcome, and the desired levels for confidence and power. Thus, although some inputs to the sample size calculation must be estimated unless a pilot study is done, the calculation provides an informative estimate of the number of subjects needed and indicates whether it is feasible to find a meaningful difference in the outcome between the exposure groups.
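The calculation described above can be sketched in a few lines of Python. This is a minimal illustration that assumes 2 equal-sized groups, a 2-sided test, and the usual normal approximation without continuity correction; the function names and the example values are hypothetical and are not taken from the online tools cited above, which may use slightly different formulas.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group_proportions(p_exposed, p_unexposed, confidence=0.95, power=0.80):
    """Subjects needed in EACH of two equal-sized groups to detect a
    difference between two anticipated proportions (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_exposed + p_unexposed) / 2  # average of the two proportions
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_exposed * (1 - p_exposed)
                        + p_unexposed * (1 - p_unexposed))
    ) ** 2
    return ceil(numerator / (p_exposed - p_unexposed) ** 2)


def n_per_group_means(difference, sd, confidence=0.95, power=0.80):
    """Subjects needed in EACH group to detect a given difference in means,
    assuming the same standard deviation (sd) in both groups."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / difference) ** 2)


# Hypothetical inputs: outcome expected in 40% of exposed and 20% of
# unexposed animals; or a meaningful difference of 5 units with an SD of 10.
print(n_per_group_proportions(0.40, 0.20))  # 82 per group
print(n_per_group_means(difference=5, sd=10))  # 63 per group
```

Because the result is sensitive to the anticipated proportions and variability, it is worth rerunning the calculation over a plausible range of inputs rather than relying on a single estimate.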
Inadequate number of study subjects is a common limitation of studies in pathology and laboratory medicine 28 and is a frequent critique of manuscripts submitted to Veterinary Pathology. Conversely, studies with large numbers of study subjects are admired by readers and reviewers. However, even if overall case numbers are large, the tendency for pathologists to be “splitters” rather than “lumpers” leads to low numbers in some categories. This was addressed in studies of canine pulmonary carcinoma and mammary carcinoma by including sufficiently large numbers of cases—67 and 229, respectively—to permit meaningful analysis of tumor subtypes. 4,33
Investigators have control over the number of study subjects. Studies of archived cases could cover a broader time period. It may be possible to relax the inclusion criteria and limit the exclusion criteria and still fulfill the study objectives. Collaboration among institutions is the most effective way to increase case numbers and brings added benefits of increasing the external validity, establishing professional relationships, adding expert insights, and fomenting discussion of the study material. For example, an investigation of oxalate nephrosis in cheetahs included cases from Southern Africa, North America, and France, and considered geographic origin in the statistical analysis. 30 Finally, we should ensure that our laboratory information management systems can be effectively queried, so that a contiguous series of cases can be retrieved in a standardized manner.
Refining the number of study subjects in each group can optimize statistical power. If cases are frequent, aim for a 1:1 ratio of cases and controls. If cases are rare enough that it will be difficult to achieve statistically significant results, increasing the number of controls will increase the statistical power of the study. However, using more than 3 or 4 controls for each case increases the cost of the study without much increase in statistical power. Conversely, having fewer controls than cases would rarely be justified.
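The diminishing return from adding controls can be illustrated with an approximate power calculation. The sketch below assumes a 2-sided comparison of exposure frequency between cases and controls using the normal approximation; the function name and the scenario (30 cases, exposure in 50% of cases and 25% of controls) are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_case_control(n_cases, controls_per_case, p_exp_cases, p_exp_controls,
                       confidence=0.95):
    """Approximate power to detect a difference in exposure frequency
    between cases and controls (two-sided test, normal approximation)."""
    n_controls = n_cases * controls_per_case
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Pooled exposure proportion gives the standard error under the null.
    p_pooled = ((n_cases * p_exp_cases + n_controls * p_exp_controls)
                / (n_cases + n_controls))
    se_null = sqrt(p_pooled * (1 - p_pooled) * (1 / n_cases + 1 / n_controls))
    # Standard error under the alternative uses each group's own proportion.
    se_alt = sqrt(p_exp_cases * (1 - p_exp_cases) / n_cases
                  + p_exp_controls * (1 - p_exp_controls) / n_controls)
    difference = abs(p_exp_cases - p_exp_controls)
    return NormalDist().cdf((difference - z_alpha * se_null) / se_alt)


# Hypothetical scenario: 30 cases, 1 to 6 controls per case.
powers = [power_case_control(30, k, 0.50, 0.25) for k in range(1, 7)]
for k, value in enumerate(powers, start=1):
    print(f"{k} control(s) per case: power = {value:.2f}")
```

Under these assumed values, power rises from roughly 0.5 with 1 control per case to about 0.75 with 4 controls per case, and each additional control contributes less than the one before it.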
Bias, Confounding, and Chance Associations
Take a deep breath, intrepid pathologist, as we plumb the final depths of epidemiology. This road is a hard one but leads to a truth that we all must know.
A statistically significant association between an exposure (eg, presence of a virus in tissues) and an outcome (eg, lesions of a particular disease) is a welcome finding in any observational study and cause for celebration. But before considering that the relationship is causal—that the virus did indeed induce the lesion—some critical analysis is in order. Observational studies are susceptible to spurious associations that are not easy to detect, so investigators must carefully search for alternative explanations of their data.
Consider what factors might differ between the study groups and how these differences might poison the findings of the study. The study groups obviously differ in ways defined by the inclusion and exclusion criteria, but they might be dissimilar in other ways, as listed in Table 5. If the frequency or distribution of 1 of these factors differs between the 2 study groups, this could bias the association between the exposure and the outcome. For example, this might give a false appearance that the exposure was associated with the outcome, or it might lessen or obscure a true association between exposure and outcome.
These factors may be particularly problematic for laboratory data. In designing a clinical study with prospective enrollment, one would never select cases from a referral hospital and controls from a humane society practice, or process and analyze case samples with one method and control samples with another. But these and other factors are surely variable and largely occult for archived laboratory case material, increasing the likelihood of spurious conclusions as a result of random or systematic differences between study groups. Furthermore, those clinicians, pathologists, and laboratorians who originally managed and investigated the cases (and the controls) did so with full knowledge of the clinical details. Consider how this knowledge might have affected the case management or the laboratory investigation and how these differences between study groups might affect the findings of the study.
Finally, note the importance of the “independence of study subjects.” Using study subjects that are not independent of each other violates the assumptions of many statistical analyses and may introduce bias. For example, if an otherwise heterogeneous study population contained several individuals from the same herd or household, these subjects may not be independent. At a broader level, clustering of data is common within animal populations because of their population structure and may involve the exposure variable, the outcome variable, or both. In addition to affecting the statistical analysis, clustering of data may lead to bias if it affects both the exposure and the outcome. Furthermore, statistical methods to control for clustering may reduce the power of the study, thus requiring larger sample sizes.
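One common way to gauge how much clustering erodes the information in a sample is the design effect, a standard approximation that is not described in this article and is introduced here only as an illustration. It assumes clusters (eg, herds) of similar size and an intracluster correlation coefficient (ICC) that the investigator must estimate; the function names and numeric values below are hypothetical.

```python
from math import ceil


def design_effect(mean_cluster_size, icc):
    """Variance inflation due to clustering: 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc


def effective_sample_size(n_subjects, mean_cluster_size, icc):
    """Number of independent subjects carrying the same information."""
    return n_subjects / design_effect(mean_cluster_size, icc)


def inflate_sample_size(n_independent, mean_cluster_size, icc):
    """Subjects needed once clustering is accounted for."""
    return ceil(n_independent * design_effect(mean_cluster_size, icc))


# Hypothetical example: 120 animals sampled as 10 per herd, assumed ICC of 0.2.
print(design_effect(10, 0.2))               # about 2.8
print(effective_sample_size(120, 10, 0.2))  # about 43 independent animals
print(inflate_sample_size(82, 10, 0.2))     # 230 animals instead of 82
```

With 1 animal per herd the design effect is 1 and no adjustment is needed, which is one argument for limiting the number of study subjects drawn from any single herd or household.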
Mitigation of Bias, Confounding, and Chance Associations
It is important to recognize potential bias and confounding factors because their effects can be minimized by measurement, exclusion, statistical analysis, or matching.
Exclusion. Eliminate the effects of confounding by excluding a subset of the study subjects. In the example of selection bias from Table 6, exclude study subjects from primary care clinics if they are few and if they complicate the association of nodal metastasis and survival.
Measurement. As the study is being conducted, collect data on potential sources of bias and confounding, and then compare their frequency in a data table. For example, compare the study groups with respect to factors, including those listed in Table 5. Is the distribution of ages the same in cases and controls? Does the proportion of large vs small dog breeds differ between the study groups? If so, consider how the differences might affect the findings of the study. As an example, physeal lesions were studied in bulls raised in the same geographic area with similar husbandry practices. The similar ages and body weights of cases and controls suggested that these were not confounding factors. 26
Analysis. Multivariable analysis or stratified analysis is frequently used to analyze and mitigate the effects of confounding. For example, multivariable analysis was used to control for the effect of age and sex in comparing the prevalence of bacterial infection in St Lawrence belugas in 1983–2002 vs 2003–2012, 25 and would be effective for analysis of the sources of bias shown in Tables 5 and 6.
Matching. If potentially confounding variables can be identified at the time the study is designed, study groups could be intentionally matched when study subjects are selected. For example, in a study of X-linked hereditary nephropathy in Navasota dogs, cases and controls were matched during the selection process with respect to their sex. 6 Similarly, in an investigation of the relationship between squamous cell carcinoma and papillomavirus infection, case and control samples were matched with respect to sheep breed and anatomic site. 44 However, factors that are matched cannot be analyzed as risk factors: if subjects are matched based on age, age cannot be analyzed as a risk factor for the outcome. Thus, multivariable statistical analysis may be advantageous in controlling for differences between groups while allowing for assessment of the factors of interest.
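As an illustration of stratified analysis, the sketch below compares a crude odds ratio with a Mantel-Haenszel odds ratio pooled across age strata. The counts are invented for demonstration and do not come from any cited study; they are constructed so that older animals are both more often exposed and more often cases, which creates a crude association that disappears within each age stratum.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for one 2x2 table.
    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio across several 2x2 tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts (a, b, c, d) for each age stratum.
strata = {
    "old": (40, 20, 10, 5),    # exposure in 80% of cases and 80% of controls
    "young": (5, 20, 10, 40),  # exposure in 33% of cases and 33% of controls
}

# Collapsing the strata cell by cell gives the crude (unadjusted) table.
crude_table = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*crude_table)
adjusted_or = mantel_haenszel_or(strata.values())

print(f"Crude odds ratio: {crude_or:.2f}")            # 2.53
print(f"Age-adjusted odds ratio: {adjusted_or:.2f}")  # 1.00
```

A marked difference between the crude and adjusted estimates, as in this constructed example, suggests that the stratifying variable is confounding the association. Multivariable regression extends the same idea to several variables at once.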
Table 6. Reasons for Spurious Associations in Pathology-Based Observational Studies.
Critique the Study Design
Before starting data collection, it is recommended to write a study proposal and seek peer review. The act of writing forces appraisal of the relevant literature, planning, and critical analysis. It tests the coherence of the various elements: the rationale, the hypothesis/question/objective, the study design and methodology, the expected findings, and the anticipated impact (Fig. 3). What is our current understanding, and what is the gap in knowledge that the study aims to correct? What is the important problem that the study addresses? Is the hypothesis, question, or objective based on a clear rationale, and is it sufficiently specific? Are the study design and methodology expected to yield results that definitively test the hypothesis or answer the question? Are there conceptual flaws with respect to showing causality? Might unmeasured factors cause bias or confounding? Will the expected findings have the anticipated impact and address the problem or gap in knowledge that was described in the rationale? Revisit the questions posed in Table 1, as an approach to refining the study design and methodology. If there is doubt that the study results will be definitive or valuable, now is the time to refine the methods or revise the hypothesis, question, or objective. Writing a clear and detailed description of the rationale, anticipated findings, and significance of the study might seem tedious work, but it allows effective critique of the study design, ensures that the study is solidly guided by a strong and specific hypothesis or question, and forms a guide for the decisions that must be made as the study is conducted (Fig. 3). Moreover, writing the ensuing manuscript will be a breeze if this structure is in place from the beginning.
Value Added
Adopt a discovery mind-set during the various phases of the study. The goal of an observational study is not usually to confirm what is known but to discover something new. Critically analyze the emerging data: consider alternative interpretations and what might be done to evaluate the differing possibilities. After analysis of the initial results, consider elements that could be added to give the study more value or impact. Discovery is iterative, and it is a mistake to anticipate a simple progression from planning to execution to publication. Initial results beget additional investigations that greatly strengthen the overall work with limited increased effort.
Use insights from a single case as the starting point for a more comprehensive study. A study of Bordetella bronchiseptica pneumonia in dogs was initiated by the microscopic observation of bacteria adherent to cilia, but the analytic study yielded information well beyond that of the index case. 41 A novel herpesvirus was identified in a single bottlenose dolphin with benign genital plaques, which stimulated development of a case series and eventually made use of banked sera from the same animals to show that seroconversion to the virus occurred at the age of onset of sexual behavior. 43 A single case report of a pig with amyloidosis was transformed by bioinformatic analysis of the amyloid amino acid sequences and in vitro testing of amyloid fibril formation to substantially advance the understanding of pathogenesis. 22 Thus, useful observational studies often arise from but go far beyond the observations on a single case.
Finally, consider value-added outcomes that give the study a broader impact. Mechanistic studies may have greater application if the pathologic findings can be related to clinical outcomes. For example, evaluating the survival of dogs with mast cell tumor was essential to the impact of studies on receptor tyrosine kinase expression 42 and cytologic grading. 7 Similarly, morphologic analysis of feline chronic kidney disease was given added clinical relevance by analyzing the relationship to measures of renal function. 9 Alternatively, consider whether an analysis of causes or risk factors could be added to a descriptive study by including an appropriate comparison group. For example, a study of endocardiosis in aging zebrafish described the pathologic findings but also identified associations with recirculating water systems, commercial diets, and a mutant smoothened gene. 12 Likewise, a description of amyloidosis in island foxes identified increased lesion severity in older, female, and captive foxes as well as between subspecies. 17 Creativity and a discovery mind-set are the keys to identifying such opportunities for added insights. Further examples include adding genetic analysis to a study of age-related spontaneous lesions in mice, 20 comparing young and old animals to increase the value of a study of background lesions and clinical pathology parameters in laboratory Beagle dogs, 2 using quantitative analysis to validate the concurrence of cardiac fibrosis and chronic renal lesions in aged chimpanzees, 10 and comparing findings in wild and laboratory rats with respect to understanding the pathogenesis of cardiomyopathy in this species. 36
These ideas are summarized in Fig. 5. We hope that veterinary pathologists can apply these principles and use imagination, insight, collaboration, and laboratory archives bursting with samples to transform their daily work into focused observational studies that provide value and impact for advancing our knowledge of animal disease.

Figure 5. Considerations for the effective design of observational studies in veterinary pathology.
Footnotes
Acknowledgements
We thank Lauren Sergejewich, Siobhan O’Sullivan, and David Pearl for their contributions.
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Some authors of this commentary are editors of Veterinary Pathology. This editorial commentary was not peer-reviewed.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Aspects of this article were supported by a grant from the Natural Sciences and Engineering Research Council of Canada (RGPIN-2017-03872, J. Caswell).
