Abstract
Osteoarthritis (OA) is unquestionably one of the most important chronic health issues in humans, affecting millions of individuals and costing billions of dollars annually. Despite widespread awareness of this disease and its devastating impact, the pathogenesis of early OA is not completely understood, hampering the development of effective tools for early diagnosis and disease-modifying therapeutics. Most human tissue available for study is obtained at the time of joint replacement, when OA lesions are end stage and little can be concluded about the factors that played a role in disease development. To overcome this limitation, over the past 50 years, numerous induced and spontaneous animal models have been utilized to study disease onset and progression, as well as to test novel therapeutic interventions. Reflecting the heterogeneity of OA itself, no single “gold standard” animal model for OA exists; thus, a challenge for researchers lies in selecting the most appropriate model to answer a particular scientific question of interest. This review provides general considerations for model selection, as well as important features of species such as mouse, rat, guinea pig, sheep, goat, and horse, which researchers should be mindful of when choosing the “best” animal model for their intended purpose. Special consideration is given to key variations in pathology among species as well as recommended guidelines for reporting the histologic features of each model.
The importance of osteoarthritis (OA) as a chronic health issue, especially among older adults, is unquestionable. An estimated 27 million individuals are affected with OA in the United States alone, 78,106 and the high prevalence of this disease results in a significant economic burden for the nation’s health care industry, costing an estimated $185 billion annually. 70 Yet, the pathogenesis of disease, particularly early disease, is still not completely understood. As a result, development of effective tools for early diagnosis and disease-modifying therapeutics has been hampered.
The major challenge in managing—and studying—naturally occurring OA in human patients is that the course of disease is often slow and unpredictable, 93 and clinical symptoms (ie, joint-related pain) occur late in the disease process and may not accurately reflect molecular events and structural changes within the joint. 58,147 The utility of studying animal models to overcome some of these issues has long been recognized, and as a result, numerous animal models for OA have been developed over the past 50 years. Researchers have used these models to gain insights into disease onset and progression, as well as to aid in the development and evaluation of new diagnostic tools and therapeutics. With the variety of models now available, a major challenge lies in selecting the “best” model when designing a study, keeping in mind that no single “gold standard” animal model of OA exists that accurately reflects all aspects of human disease. 36,136
A search of the PubMed database using the search phrase “animal model osteoarthritis” returns more than 2100 articles (1963–June 2014), with nearly 80% of these published in the past 10 years (Fig. 1). Descriptions of naturally occurring OA (or degenerative joint disease) in animals can be found even earlier. 61–63 The earliest reports utilized inbred laboratory strains of mice and rats, but by the early 1990s, in vivo models had been reported in a variety of species, including rabbits, dogs, guinea pigs, sheep, horses, and goats (Fig. 1). Small animal models (rodents and rabbits) remain the most popular; of the large animal species, dogs are by far the most commonly used (Fig. 2).

Figure 1. Number of published reports utilizing animal models of osteoarthritis, from 1963 to June 2014. The earliest report for each major species highlighted in this review is indicated.

Figure 2. Published reports using animal models of osteoarthritis: relative use of (a) each major species highlighted in this review and (b) small animal models compared to large animal models.
The argument has been made that as OA is a heterogeneous disease in humans, no single animal model should be expected to be relevant for studying all aspects of the disease process. 3,29,81 Thus, deciding which animal model should be used in a particular study is first and foremost dependent on defining the specific question that needs to be answered. Only then can the pertinent benefits and drawbacks of individual models be considered and a decision made. The purpose of this review is to provide some general considerations for model selection, as well as some important specific features of individual species that researchers should be mindful of when choosing the “best” animal model for their intended purpose. Special consideration is given to key variations in anatomy (Table 1) and pathology among species as well as recommended guidelines for reporting the histologic features of each model.
Table 1. Selected Articular Cartilage (Stifle/Knee) Characteristics.
a Critical defect size: unknown.
b Critical defect size: 3 mm (4–5 mm recommended).
c Critical defect size: 4 mm (≥5 mm recommended).
d Critical defect size: 6–7 mm (for both sheep and goat).
e Critical defect size: 9 mm.
General Considerations: Spontaneous vs Induced Models of Disease
Animal models of OA can be broadly divided into spontaneous (including naturally occurring and genetic models of disease) and induced (primarily by surgical manipulation or intra-articular chemical injection). An advantage of the slowly progressive spontaneous models of disease is that they likely more closely mimic the course of primary OA in humans (particularly naturally occurring disease in the dog and horse) without the need for intervention. However, these models take more time to develop and tend to be more variable in their outcomes, resulting in more costly studies in terms of the length of time that animals must be housed and the potentially larger number of animals required to achieve an appropriately powered study design. 75,136,145 Genetically modified strains of mice as models of OA have become tremendously popular over the past few years, with 135 distinct strains reported in a 2013 review. 79 The advantage of these models is that the effects of a single gene in OA pathogenesis can be investigated. 75,81 However, as naturally occurring OA is almost certainly polygenic in nature (ie, relying on the effects of many genes), these models likely oversimplify the disease process; thus, results of therapeutic interventions may not always translate, either to other animal models or to humans. 79 It is also of note that different models of disease can have opposite effects in the same genetically modified strain, 81 so care must be taken in study design and interpretation of results.
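The link between outcome variability and animal numbers noted above can be illustrated with a standard normal-approximation sample size calculation for comparing two group means. This is a minimal sketch and not a method taken from the cited studies: the function name `animals_per_group`, the default significance level and power, and the example standard deviations are all assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def animals_per_group(sd: float, difference: float,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate animals needed per group for a two-sided, two-group comparison.

    Uses the textbook normal approximation
        n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / difference) ** 2
    where sd is the (assumed common) standard deviation of the outcome and
    difference is the smallest between-group difference worth detecting.
    """
    if sd <= 0 or difference <= 0:
        raise ValueError("sd and difference must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / difference) ** 2)


# Hypothetical values: the same 1-point difference in a histologic score,
# with a less variable model (sd = 1) and a more variable model (sd = 2).
less_variable = animals_per_group(sd=1.0, difference=1.0)
more_variable = animals_per_group(sd=2.0, difference=1.0)
print(less_variable, more_variable)
```

Under this approximation, doubling the standard deviation roughly quadruples the required group size, which is one reason more variable spontaneous models tend to need larger cohorts. A formal power analysis for ordinal histologic scores would need a method suited to that outcome type.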
A variety of surgically induced models have been reported, including partial or total meniscectomy, destabilization of the medial meniscus, meniscal tear, anterior cruciate ligament (ACL) or posterior cruciate ligament transection, medial and/or lateral collateral ligament transection, creation of articular groove(s), osteotomy, transarticular impact, and intra-articular osteochondral fragmentation. 75,81,136 Each model relies on a combination of joint instability, altered joint mechanics (ie, changes in load bearing or joint congruence), and inflammation to induce OA lesions. 75,136 It is of note that the majority of these procedures are performed in the stifle (the equivalent of the human knee), with the exception of osteochondral chip models in the horse, which are performed in either the carpus (see review 96) or the metacarpophalangeal joint. 20 Surgical models have the advantage of repeatability as well as rapid onset and progression, 75 but the rapid course of disease makes them less ideal models of spontaneous OA. 136 These models are considered to be more appropriate reflections of posttraumatic OA in humans, although the progression of disease is much more rapid, perhaps because use of the destabilized limb is generally not decreased. 12,79,136 It has been argued that the rapid onset of lesions after surgery may make them less responsive to interventions; in fact, for some models, “pretreatment” with the therapeutic compound of interest is recommended. 12 Additionally, there is some evidence for different responses to therapy based on the mechanism of injury. For example, intra-articular injection of hyaluronan was found to have a beneficial effect in models of meniscal injury, a lesser but still beneficial effect in models of ACL injury, and little to no effect in models of direct cartilage damage. 37 It must also be remembered that translation of therapeutic outcomes from surgically induced models to naturally occurring disease (or vice versa) cannot be assumed. 122
Current chemically induced models of disease primarily use monosodium iodoacetate (MIA), although historically, papain was widely used. 15 Other agents have been reported, including collagenase, carrageenan, and Freund adjuvant. 75,81 These models are primarily used for studying OA pain-related behaviors, 54,89 but their validity as clinical models for OA has been questioned. 122,136 This is largely due to the widespread cell death and rapid joint destruction that occur, particularly secondary to MIA injection, which would not be considered typical for either spontaneous or posttraumatic OA. 81 It is interesting to note that very few studies investigating structural changes of disease, even therapeutic interventions for these structural changes, have evaluated pain. Indeed, since pain occurs late in the disease process, it is considered to not be a feature of most of these models; however, given the evidence suggesting that different mechanisms of injury may result in varied molecular mechanisms of pain, this seems to be a potentially serious oversight. 81,89
General Considerations: Methods of Outcome Assessment
Histopathology has been and will likely continue to be the gold standard for outcome assessment in animal models of OA. However, there has increasingly been a call for development and standardization of less invasive measures of disease onset and progression, as well as response to treatment. These include imaging modalities (eg, magnetic resonance imaging [MRI], computed tomography [CT]) and biochemical biomarkers measured either systemically (ie, serum or urine) or locally (synovial fluid). 122,136
Although histopathology findings are nearly universally reported, the variety of scoring systems reported in the published literature makes it difficult to compare results across studies. This in turn has led to a recent concerted effort to standardize histologic scoring recommendations for human disease and animal models of OA. The first macroscopic grading system for OA cartilage, proposed by Collins and McElligott, 30 was succeeded by the Mankin score (or Histological-Histochemical Grading System), which was based on microscopic analysis of late-stage hip OA. 90 The Mankin scoring system (and many modifications thereof) has been widely used, but it has been reported to be less useful for early-stage disease and to be subject to significant intra- and interobserver variability. 125,128 As a result, the Osteoarthritis Research Society International (OARSI) established a working group in 1998 to develop a new grading system that would meet the guiding principles of “simplicity, utility, scalability, extendibility, and comparability.” 125 The resulting system was published in 2006 and borrows concepts from cancer pathology to address the severity (“grading”) and extent (“staging”) of OA lesions within a joint. 125 This system has subsequently been validated in animal 33 and human 115 articular cartilage; however, a follow-up initiative by the OARSI was aimed at standardizing scoring systems for each major species used as an animal model of OA. 3 The results of this effort were published in a special supplemental issue of Osteoarthritis & Cartilage. 3
This report also contained recommended guidelines for processing histologic samples 132 as well as histomorphometry of joint tissues. 112 Additionally, unifying terminology for reporting results was proposed 124 as well as a review of statistical analysis methods. 114 A more detailed overview of this work is beyond the scope of this review; however, pertinent points are highlighted under the following individual species.
In contrast to the concerted effort that has been made to standardize reporting of histologic results, there is as yet no consensus on the gold standard for imaging modalities. Although radiographs have long been the standard for diagnosis and evaluation of progression of human OA, it is widely recognized that these provide an insensitive measure of disease. 110,136 MRI is of particular interest as a potential modality to evaluate longitudinal changes in both naturally occurring and experimentally induced disease, and new techniques are being developed and refined (see recent review 26 ). Incorporation of these advanced imaging techniques into study designs for animal models of OA has been recommended 122,136 and, indeed, is being increasingly reported in various species. 41,52,74,99
There is also still much debate regarding appropriate biochemical biomarkers to measure and report for studies of OA. A consensus report regarding the classification and utility of various biomarkers for the purposes of drug development was published by OARSI in 2011. 71 However, nearly all of the commercially available markers listed on the “recommended panel” in this report are measured in the urine and/or serum and, as a result, could be influenced by systemic factors not related to the joint of interest. Synovial fluid biomarkers have the advantage of reflecting the local joint environment, but there is as yet no consensus on which biomarkers should be measured. In addition, due to their size, small animal models have a limited amount of synovial fluid present, which is difficult to collect except by lavage. There is agreement that (1) skeletally immature animals are especially inappropriate for studies involving biochemical biomarkers, as growth can significantly affect biomarker levels, and (2) no single biomarker is likely to become the gold standard for diagnosing disease or assessing progression. 122 Instead, a combination of markers reflecting catabolism and anabolism, as well as those reflecting inflammation (ie, synovitis), is likely to provide the most accurate information about disease state. 122,136
Small Animal Models: Mouse, Rat, Guinea Pig, and Rabbit
Advantages to small animal models of OA include relatively low cost, ease of handling, and availability of housing. As a result, they have been particularly popular for drug-screening studies as well as investigations of highly targeted aspects of disease pathology (ie, related to alterations in single genes). 12,54,136,145 The primary disadvantages of these models are related to dissimilarities in tissue structure and joint mechanics between these species and humans as well as their small size, which limits the amount of tissue available for biochemical studies. 2,116,127
Mouse
Differences in skeletal maturation and aging among genetically distinct strains of mice have long been recognized. 133,134 Particular strains, notably STR/ort and C57BL/6 (among others), are considered predisposed to developing spontaneous idiopathic OA, 49 while other strains, notably CBA, are considered to be resistant to the development of spontaneous disease. 123 Additionally, induced OA models can have widely varying results in different strains of mice. For example, more severe lesions of OA were reported after surgical destabilization of the medial meniscus in 129/SvEv mice than in DBA/1 mice. 49 Similarly, a much greater proportion of C57BL/10 mice (100%) developed OA after intra-articular collagenase injection than did C57BL/6 mice (25%). 143 Due to these differences, genetic background should always be reported along with other important factors, such as sex, age, and housing (including artificial light/dark cycles, if used), 145 and direct comparisons among outcomes in mice of different genetic backgrounds must be performed with caution.
Although mice prone to spontaneous development of OA—either idiopathic or resulting from known spontaneous mutations (eg, Del1 mice with a short deletion in type II collagen)—have been widely used as disease models, there has been a marked increase in recent years in the availability and use of genetically modified mice, particularly conditional knockouts. As expected, genetically modified mice are most useful for studying the effects of a particular gene on the development of disease; the genes selected for manipulation are typically involved in cartilage matrix degradation, chondrocyte differentiation or apoptosis, bone turnover, and/or inflammation. 79 Genetically modified mice may be compared to their wild-type littermates to evaluate spontaneous development of disease or may be used in combination with induced OA models (surgical or chemical). 49 In cases in which the genetically modified group is protected against cartilage degradation, the specific genes or gene products involved may be considered potential disease-modifying drug targets. Therapeutics developed on the basis of such studies have been successfully tested in mice; however, replication of results in additional species is required, and to date, no target identified by this method has successfully translated to a clinically relevant therapeutic agent. 79
Ideally, as for any species, only skeletally mature mice should be used in models of OA. It can be difficult to judge skeletal maturity in mice since it is normal for them to have growth plates that do not close completely, 133 but it has been suggested that 10 weeks is the minimum age at which this species should be entered into any study. 122 Genetically modified mice being compared to wild-type littermates should not be evaluated before the time that spontaneous lesions would be expected to occur, generally between 9 and 12 months of age, 122 although this can be strain dependent as noted above. It is of note that older mice (12 months of age) have been found to develop more severe lesions than younger mice (12 weeks of age) after destabilization of the medial meniscus surgery. 83 In addition to surgically related lesions, sham- and nonoperated control joints were reported to have mild OA lesions in this study, and gene expression was markedly different between the older and younger mice. 83 This suggests that it may be preferable to use older mice in OA models, as findings reported in young mice may not be directly translatable to spontaneous human disease, which overwhelmingly occurs in older adults.
It cannot be ignored that the small size of the mouse results in marked differences in biomechanical loading of their joints when compared to humans. There are also notable differences between the cartilage anatomy of mice and humans, including decreased overall cartilage thickness (on average, 30 μm; 50-fold thinner than humans), a thick layer of calcified cartilage (equivalent to or thicker than the noncalcified cartilage layer), and lack of distinct superficial, transitional, and radial zones of chondrocytes (Figs. 3, 4). 50 Additionally, chondrocyte apoptosis has been reported as a prominent feature of early OA in this species. 98 Macroscopic lesions are difficult to stage, and partial-thickness lesions are rarely recognizable, possibly because cartilage clefts tend to develop at the tidemark, resulting in full-thickness lesions. 98 However, the small joint size does have the advantage that the entire joint can fit into a single histologic section, and this is the recommended method for microscopic evaluation of lesions. 50 Sectioning along the frontal plane is preferred so that the medial and lateral sides of the joint can be evaluated concurrently (Fig. 5). 50

Stifle, normal adult mouse. Frontal plane section: the entire joint can be visualized on a single slide. Patella (P) is centered over femoral trochlea, and menisci (arrows) appear as elongated triangular structures below the femoral condyles (C). Hyaline cartilage of open growth plates (physes) is sectioned longitudinally (*) and tangentially (**). Also featured are insertion sites of cruciate ligaments (CL), medial collateral ligament (arrowhead), medial meniscus (short arrow), and central focus of ossification in lateral meniscus (long arrow). Note that the open growth plates in the femur and tibia are expected in this species. Separation of articular cartilage of lateral tibial plateau and calcified cartilage of lateral tibial plateau is shown (dashed line); note that the thickness of these layers is approximately equal. Hematoxylin and eosin.
A number of different histologic scoring schemes have been proposed for the mouse (see summary 68 ). Of these, the modified Mankin score has been most widely utilized, 5,28,39 but its appropriateness has been questioned due to the differences in cartilage architecture between humans and mice. 50 A modified version of an alternative scoring system originally reported by Chambers et al 25 has also been used by various investigators, 19,123 and it is this system that was recommended by the OARSI mouse working group in its 2010 report. 50 This subjective semiquantitative score ranges from 0 to 6 for structural cartilage damage in each of the 4 quadrants of the joint (medial/lateral femoral condyle and medial/lateral tibial plateau), with additional 0–3 scoring parameters for the presence of osteophytes, subchondral bone changes, and synovitis (Suppl. Table 1). 50 Multiple sections should be evaluated; subsequently, the scores can be summed across the joint (or quadrant), and/or the maximal score can be reported. Other pathologic features relating to menisci, ligaments, and so on should be noted. A semiquantitative scoring system for proteoglycan depletion (0–5) was also proposed as an adjunct to the structural scoring system (Suppl. Table 1). 50 This system was reported to have excellent intraobserver reproducibility, even among novice users, and was considered to be both rapid and sensitive enough to be an effective screening tool. 50 For cases in which a more detailed histologic analysis is required, an expanded grading system was recently proposed by McNulty et al. 98 This system uses a combination of quantitative and semiquantitative measures to evaluate 15 histologic parameters in a single histologic section of the joint, representing the most severely affected location for a given model; these parameters can then be combined via principal components analysis into factor scores. 
In the data set reported by the authors, the factor scores reflected articular cartilage integrity, chondrocyte viability, subchondral bone, meniscus, and periarticular bone 98 ; it should be noted that these could vary if a different model was used. This system performed well for evaluating lesions in both surgically induced and naturally occurring OA in mice and lends itself well to standardized comparisons across studies. 98 However, the benefits related to this system’s precision and comprehensiveness may be somewhat offset by its technical complexity, thereby restricting its general use.
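The reporting options for the OARSI-recommended mouse score described above (a 0–6 structural cartilage score for each of the 4 joint quadrants, evaluated on multiple sections and then summed and/or reported as the maximal score) can be sketched as follows. This is an illustrative sketch only: the function name, the quadrant keys, and the example scores are assumptions, and the adjunct 0–3 and 0–5 parameters are omitted for brevity.

```python
from typing import Dict, List

# The 4 quadrants named in the text; the key spellings are hypothetical.
QUADRANTS = (
    "medial_femoral_condyle",
    "lateral_femoral_condyle",
    "medial_tibial_plateau",
    "lateral_tibial_plateau",
)


def summarize_cartilage_scores(
    sections: List[Dict[str, int]],
) -> Dict[str, Dict[str, int]]:
    """Return the summed and maximal 0-6 score per quadrant and for the joint.

    Each element of `sections` holds one histologic section's scores,
    keyed by quadrant name.
    """
    if not sections:
        raise ValueError("at least one scored section is required")
    summary: Dict[str, Dict[str, int]] = {}
    for quadrant in QUADRANTS:
        scores = [section[quadrant] for section in sections]
        for value in scores:
            if not 0 <= value <= 6:
                raise ValueError(f"{quadrant} score {value} is outside 0-6")
        summary[quadrant] = {"sum": sum(scores), "max": max(scores)}
    # Whole-joint values are derived from the per-quadrant results.
    summary["whole_joint"] = {
        "sum": sum(summary[q]["sum"] for q in QUADRANTS),
        "max": max(summary[q]["max"] for q in QUADRANTS),
    }
    return summary


# Hypothetical scores for three sections of one joint.
example_sections = [
    {"medial_femoral_condyle": 2, "lateral_femoral_condyle": 0,
     "medial_tibial_plateau": 3, "lateral_tibial_plateau": 1},
    {"medial_femoral_condyle": 3, "lateral_femoral_condyle": 0,
     "medial_tibial_plateau": 4, "lateral_tibial_plateau": 1},
    {"medial_femoral_condyle": 1, "lateral_femoral_condyle": 1,
     "medial_tibial_plateau": 2, "lateral_tibial_plateau": 0},
]
example_summary = summarize_cartilage_scores(example_sections)
```

Reporting both the summed and the maximal score keeps the extent of damage separate from its worst severity, which may matter when comparing models that produce focal rather than diffuse lesions.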
Rat
The rat has generally been found to be relatively free of spontaneous OA, 47 making it a less ideal choice for the study of naturally occurring disease. However, surgically and chemically induced models of OA are widely reported. Similarly to mice, the growth plates in rats naturally remain open into adulthood, 31 making it difficult to assess skeletal maturity in this species; however, 3 months is the recommended minimum age at which this species should be enrolled in any study. 122 It should be noted that older rats have been reported to demonstrate more severe OA after surgical intervention than younger rats and to have mild OA changes in both sham- and nonoperated control joints as well. 42 Additionally, strain-specific responses to surgery have been observed (eg, between Lewis and Sprague-Dawley rats), although different strains tend to be favored in the literature for different models, 47 making direct comparisons difficult.
Although less widely used in other species, intra-articular injection of MIA remains a popular OA induction model in rats, particularly for studying pain. 54,89 Despite the fact that, as previously mentioned, the relevance of MIA injection as a model of human OA has been questioned due to the rapid and extensive joint destruction that is induced, 81,122,138 the model is currently favored for screening drugs because of its rapidity and reproducibility (including recognized dose-dependent effects). It has been suggested that MIA may be appropriate for screening symptom-modifying OA drugs but that its utility is limited for disease-modifying OA drugs because its pathophysiology is so distinct from naturally occurring disease. 75,138
Rat articular cartilage is thicker (∼0.1 mm) than in the mouse 31 (Table 1), allowing for the creation of full- and partial-thickness defects for studying therapeutic interventions related to cartilage repair. 54 However, the biomechanics of the rat stifle joint are markedly different from humans, resulting in different patterns of cartilage loading. 31 Additionally, spontaneous intrinsic healing of cartilage lesions has been reported in this species, 31 which may make evaluation of the efficacy of interventions problematic. Despite these disadvantages, rats are commonly used for screening of potential therapeutics prior to definitive evaluation in a large animal model. 31
When OA model outcomes are evaluated in the rat, modified Mankin scores have been most commonly reported. 1,48,111 However, the modifications are not always consistent, 47 making it difficult to compare results across studies. The grading system recommended by the OARSI rat working group in its 2010 report is somewhat more complex than either the modified Mankin or human OARSI (Pritzker) scores, but it still has good intraobserver correlation and has the advantage of being extremely comprehensive. 47 It incorporates quantitative measurements for cartilage matrix loss width, total cartilage degeneration width, significant cartilage degeneration width, zonal depth ratio of lesions, growth plate thickness, and medial joint capsule repair thickness. 47 It further utilizes semiquantitative scores for cartilage degeneration, calcified cartilage and subchondral bone damage, osteophyte size, and synovial reaction (Suppl. Table 1). It is recommended that scoring be completed for the 3 most severely affected sections from the joint, prepared in the frontal plane. Scores for each parameter are then averaged across sections. 47
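The final averaging step described above (each parameter scored on the 3 most severely affected sections, then averaged across sections) can be sketched as follows. This is a minimal illustration: the function name, the parameter keys, and the example values are hypothetical, selection of the 3 sections is assumed to have been done already by the pathologist, and no score ranges are enforced because they are not stated here.

```python
from statistics import mean
from typing import Dict, List


def average_across_sections(sections: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each scored parameter across the 3 selected sections."""
    if len(sections) != 3:
        raise ValueError("expected scores for exactly 3 sections")
    parameters = sections[0].keys()
    for section in sections:
        if section.keys() != parameters:
            raise ValueError("every section must report the same parameters")
    return {name: mean(section[name] for section in sections)
            for name in parameters}


# Hypothetical scores for the 3 most severely affected frontal sections.
rat_sections = [
    {"cartilage_degeneration": 3, "osteophyte_size": 1, "synovial_reaction": 2},
    {"cartilage_degeneration": 4, "osteophyte_size": 2, "synovial_reaction": 2},
    {"cartilage_degeneration": 5, "osteophyte_size": 2, "synovial_reaction": 2},
]
rat_averages = average_across_sections(rat_sections)
```

The same helper would apply to the quantitative width and thickness measurements, since those are also reported per section.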
Alternative imaging techniques such as micro–computed tomography, 6,69,138 x-ray phase contrast imaging, 91 and MRI 51,141 have recently been developed for use in the rat. These techniques allow for high-resolution evaluation of joint structures in situ and have an advantage over histology in that they may allow for 3-dimensional reconstruction of relevant anatomy. 138 Micro–computed tomography has been shown to be sensitive for detecting induced cartilage lesions in an MIA model and a medial meniscal transection model in the rat, although the results could not be directly compared to histomorphometry. 138 MRI was able to demonstrate sequential changes in cartilage, meniscus, and subchondral bone marrow in an anterior cruciate ligament transection model that corresponded to histologically identified pathology. 141 In the future, this technique may allow for in vivo monitoring of disease-related changes over time.
Guinea Pig
Albino Dunkin-Hartley (or Hartley) guinea pigs are by far the most commonly used strain of this species for modeling OA, primarily due to the strong histologic similarities between spontaneous OA in this strain and human primary idiopathic OA. The guinea pig has a number of other advantages as an animal model, including the size of the joints, which allows collection of sufficient joint and body fluids to evaluate biomarkers; the amenability to evaluation by a number of imaging techniques; the more rapid time to skeletal maturity than larger spontaneous models of OA; and the ease of handling of the species. 54,72
The development of spontaneous cartilage degeneration in Dunkin-Hartley guinea pigs was first investigated after recognition of lesions in the contralateral limbs of animals that had undergone an intervention to induce OA, as well as in untreated control animals. 14 Mild bilateral cartilage degeneration was subsequently reported to occur in the central region of the medial tibial plateau of some animals as early as 3 months of age, with moderate to severe lesions present bilaterally on the medial tibial plateau and medial femoral condyle in nearly all animals by 12 to 18 months of age. 14,64 It is of note that the onset and severity of lesions are strongly dependent on weight; indeed, dietary restriction resulted in a 40% reduction in OA severity at 9 months of age and a 56% reduction in OA severity at 18 months of age when compared to ad libitum–fed control animals in one study. 13 As a result, the weight of guinea pigs at various time points in any study should be reported to aid in comparison of results across studies. Although guinea pigs tend to be quite sedentary (considered to be a disadvantage when comparing their disease to that of humans, in whom athletic activity is strongly correlated to OA severity), those housed in pairs have been shown to have more severe OA lesions than those housed alone, presumably due to increased activity level. 54,72
Despite the occurrence of spontaneous OA in the Dunkin-Hartley strain, it is still frequently used for surgical models of OA. Surgically induced lesions (ie, by partial medial meniscectomy) develop much more rapidly than spontaneous lesions, with loss of toluidine blue staining as early as 3 days postsurgery, superficial cartilage degeneration with measurable fibrillation by 4 weeks postsurgery, and lesions extending to the deep zone of the cartilage by 3 months postsurgery. 11,113 It must be remembered that spontaneous lesions will develop and progress in untreated or sham-operated limbs. This has been proposed as an advantage to this model, as the same intervention could be evaluated for its effects on the rapidly developing induced lesions as well as the more slowly developing naturally occurring disease in the same animal. 12 If this is to be done, surgery should ideally be performed and the therapeutic intervention initiated prior to development of significant spontaneous lesions 12 ; however, to address the issue of skeletal maturity, the earliest age at which it is recommended to include guinea pigs in any study is 6 months of age, 122 by which time spontaneous lesions may already be established.
The stifle joints of guinea pigs are small enough for a frontal section to be placed on a single slide, allowing evaluation of the entire joint in correct anatomic orientation. 12,72 Spontaneous disease tends to be consistently located in the central portion of the joint, while the location of surgically induced lesions can vary. Thus, initial macroscopic evaluation of joints with surgically induced lesions is recommended (see Supplemental Table 2) to make sure that the appropriate (most severely affected) region is selected for histologic scoring. 72 Although 5 semiquantitative histologic scoring systems have been used in the guinea pig, the modified Mankin method is the most commonly used, and this is the system recommended by the OARSI guinea pig working group in its 2010 report. 72 This system scores articular cartilage structure (0–8), proteoglycan content (0–6), cellularity (0–3), and tidemark integrity (0 or 1), with an optional osteophyte score (0–3; Suppl. Table 1). A separate scoring system for evaluating synovitis is also recommended for use with surgical models (Suppl. Table 1). Evaluation of the meniscus is not included in this scoring system, but it is of note that guinea pigs, like rats and mice, are known to develop significant ossification of this structure. 72 It should be recognized that MRI protocols have been reported allowing for longitudinal assessment of the progression of naturally occurring and surgically induced cartilage degradation 18,41,137 and are likely to gain in popularity; however, the accuracy of MRI decreases with advanced OA changes, 18 and histopathology remains the gold standard for characterizing joint lesions.
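The component ranges of the recommended guinea pig score given above can be captured in a small validating data structure. This is a sketch under stated assumptions: the class and field names are hypothetical, and summing the components into a single total follows the usual convention for Mankin-type scores rather than anything specified in this section.

```python
from dataclasses import dataclass
from typing import Optional

# Allowed range for each component, as listed in the text.
SCORE_RANGES = {
    "structure": (0, 8),     # articular cartilage structure
    "proteoglycan": (0, 6),  # proteoglycan content
    "cellularity": (0, 3),
    "tidemark": (0, 1),      # tidemark integrity
    "osteophyte": (0, 3),    # optional component
}


@dataclass
class ModifiedMankinScore:
    structure: int
    proteoglycan: int
    cellularity: int
    tidemark: int
    osteophyte: Optional[int] = None

    def __post_init__(self) -> None:
        for name, (low, high) in SCORE_RANGES.items():
            value = getattr(self, name)
            if value is None:
                if name == "osteophyte":
                    continue  # the osteophyte score may be left out
                raise ValueError(f"{name} score is required")
            if not low <= value <= high:
                raise ValueError(f"{name} score {value} is outside {low}-{high}")

    def total(self) -> int:
        """Sum of the components (assumed convention), osteophyte included if given."""
        core = self.structure + self.proteoglycan + self.cellularity + self.tidemark
        return core + (self.osteophyte or 0)


# Hypothetical scores for one medial tibial plateau.
plateau_score = ModifiedMankinScore(structure=4, proteoglycan=3, cellularity=1, tidemark=1)
plateau_total = plateau_score.total()
```

Rejecting out-of-range values at construction makes transcription errors visible before scores are pooled across animals. The separate synovitis score recommended for surgical models is not modeled here.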
Rabbit
Rabbits have been popular for chemically and surgically induced models of OA for decades (see early studies 15,38 ) despite the marked differences in joint biomechanics and gait when compared to humans. 31,54 Rabbit stifles have a significantly higher flexion angle than human knees 127 and, unlike other species (including humans), naturally load the lateral compartment of the femorotibial joint rather than the medial. 75,136 As a result, although surgical interventions in the medial compartment do result in medial lesions, more severe lesions are seen if the intervention is on the lateral side. 75 A recent retrospective study of client-owned rabbits demonstrated radiographic evidence of naturally occurring OA in ∼50% of animals older than 6 years and >70% of animals older than 9 years, but it did not specifically localize lesions medially or laterally within the affected joints. 9
In addition to biomechanical differences, important structural differences also exist between rabbit and human joint tissues. Rabbit cartilage is ∼10 times thinner than human cartilage (0.3–0.7 vs 2–3 mm) but has much higher chondrocyte density. 116,122 The distribution of cartilage zones is quite different between the 2 species, particularly because the thickness and cellularity of the transitional and radial zones are highly variable in the rabbit, even among sites within the same joint. 116 The rabbit meniscus is also more cellular and has less vascular penetration than the human meniscus, and it can heal rapidly. 27,54,75 Rabbit cartilage has been reported to exhibit spontaneous healing, particularly in young animals (up to 20 weeks of age). 2,31,54 This may be one reason why induced OA lesions are more severe and develop more rapidly in older rabbits. 117 The issue of spontaneous repair is of particular concern when using rabbits to evaluate methods of cartilage repair. 34,82 Critically sized defects in this species (ie, those that do not exhibit spontaneous healing) are officially reported to be 3 mm, but larger defects of 4 to 5 mm are recommended. 2,31 It is important to note that for any OA model, rabbits older than 8 to 9 months of age should be used to ensure skeletal maturity; ideally, this should be verified by radiographic examination. 9,122
For general histologic evaluation of rabbit OA models, the 2 most commonly used scoring systems are the modified Mankin and the OARSI (Pritzker) scheme. When these were directly compared by 2 groups, each using an ACL transection model in rabbits, both systems performed well, although in 1 study the OARSI score was preferred because of better performance separating control and osteoarthritic samples from each other. 34,126 The OARSI rabbit working group acknowledged the utility of both these systems in its 2010 report, but it did not specifically evaluate either of them. Its recommended system, specific to the ACL transection model, includes semiquantitative scores for 4 parameters: safranin O–fast green staining (0–6), cartilage structure (0–11), chondrocyte density (0–4), and cluster formation (0–3; Suppl. Table 1). 77 It was recommended that synovium and meniscal changes be evaluated separately and assigned scores (Suppl. Table 1). The rabbit stifle is large enough that macroscopic evaluation of lesions has utility; the working group recommended use of the Outerbridge classification for this purpose, with an additional grade assigned based on the length of fissures/erosions (1–7; Suppl. Table 2). 77 An entirely separate set of 5 histologic grading systems exists for evaluating cartilage repair models in the rabbit. A recent report evaluated these and found that they performed equivalently. 109 However, while the results of each were well correlated with cell number and total DNA content, they did not correlate well with proteoglycan content in the repair tissue. Thus, the investigators recommended using biochemical analysis in addition to histopathology when evaluating repair tissue in OA models. 109
Large Animal Models: Dog, Sheep/Goat, and Horse
Advantages to large animal models of OA include anatomic similarity to humans (particularly joint size and cartilage thickness), widespread occurrence of naturally occurring primary idiopathic and secondary (posttraumatic) OA, and amenability to diagnostic imaging, repeated synovial fluid collection, arthroscopic intervention, and postoperative management. 31,54,136 Thus, although small animal models are the most commonly used in initial or screening studies, large animal models are thought to generate more clinically relevant data and are generally required for regulatory approval of therapeutic interventions. 31,54 Disadvantages of large animal models are primarily related to cost, handling challenges, longer time to maturity, slower (and potentially more variable) progression of disease, and ethical considerations, particularly related to public perception. 4,136
Dog
Both surgically induced and naturally occurring models of OA have been widely studied in dogs, and this species is considered by some to be the closest to a gold standard model currently available in terms of anatomic similarity, disease progression, and translation of outcomes to humans. 54,92,103 One major advantage of dogs as models of OA, compared to other species, is their amenability to postoperative management strategies, including bandaging/splinting and various exercise regimens (ie, land or water treadmill). These interventions may be of primary interest in themselves as therapies for preventing OA development and/or progression after an insult, 53 or they may be used to accelerate degenerative changes after experimental intervention. 146 It is of note that although experimental models in this species generally utilize the stifle, 73,92,121 naturally occurring disease is also common in the hip and elbow. An estimated 20% of adult dogs are affected by OA, resulting in a large pool of client-owned animals that may be available for enrollment in therapeutic trials. 31,54 Although interindividual variability is unquestionably higher in a population of this type than in purpose-bred dogs (ie, of the same breed, sex, and age), translatability of results to human patients may be improved.
In part due to the widespread clinical incidence of OA in dogs, a variety of antemortem diagnostic monitoring protocols have been developed, several of which have been utilized (and often further refined) for experimental models. These include gait and kinematic analyses, 94,103 as well as imaging techniques. Of the latter, MRI in particular has long been of interest because of its ability to detect OA-related changes much earlier than radiography. An early study using thick (1 cm) slices and a very low-field magnet (0.15 T) reported pathology in the meniscus and joint capsule, as well as development of osteophytes, in an ACL transection model 8 weeks before changes were visible radiographically. 129 Modern studies using high-field magnets (ie, 7.0 T) may detect subtle changes in cartilage area that correlate with histologic results, 118 and these findings may eventually translate into clinical protocols for noninvasive monitoring of OA progression.
Despite progress in noninvasive longitudinal monitoring, histology remains the gold standard for evaluation of experimental outcomes in dog OA models. Historically, the Mankin scoring system (or modifications thereof) has been most commonly used, 73,92,146 but in response to perceived shortcomings of this system, the OARSI dog working group proposed and validated a new comprehensive scoring system in its 2010 report. 32 This system incorporates macroscopic (based on the Outerbridge system; see Supplemental Table 2) and microscopic scoring of cartilage, synovium, and meniscus, with consideration for both the severity and the extent of pathology in each structure. Microscopic grading of cartilage incorporates 6 categories: cartilage structure, chondrocyte pathology, proteoglycan staining, collagen integrity, tidemark, and subchondral bone plate. Each evaluated histologic section is given a category score reflecting severity; the area of the section affected at that category (none, focal, multifocal, or global) determines the numeric score assigned. If multiple focal areas of pathology exist within 1 section, scores are combined (see Supplemental Table 1). 32 The working group recommended that multiple, sagittally cut osteochondral sections from each joint compartment be evaluated, in addition to synovium from at least 3 locations and at least 9 sections of each meniscus (3 each from the caudal, middle, and cranial thirds). The reported advantages of this system were its versatility and comprehensive nature, but its complexity was reflected in lower consistency in scoring among novice users. 32
It should be noted that while there are many anatomic similarities between the dog and the human, important differences in biomechanics and gait do exist, particularly in the much greater flexion angle in the dog stifle. 127 Dog cartilage is slightly less than half the thickness of human cartilage, allowing for the creation of partial-thickness lesions, although this is rarely done in practice. 2 The critically sized cartilage defect for this species is reportedly 4 mm, although lesions ≥5 mm are recommended if studying cartilage repair. 31 Skeletal maturity can vary widely by breed and sex (9–18 months of age), and radiographic confirmation of growth plate closure is recommended prior to enrolling individuals in any study. 32
Sheep/Goat
Sheep and goats are favored as large animal preclinical models for OA in part due to ease of handling and relatively low cost (compared to other large animal species) and in part due to marked similarity between their stifle anatomy and that of the human knee. In particular, both the sheep and the goat are considered to be among the best models (from an anatomic perspective) for ACL repair, and goat menisci are anatomically most similar to humans among species used to study this structure. 127 Additionally, the cartilage thickness (up to 2 mm) in these species allows for the easy creation of either full- or partial-thickness chondral lesions, although significant variation in thickness exists between individuals and anatomic locations, and sheep have been reported to have much thinner cartilage than goats. 2,45 This property—along with the ability to create multiple, relatively large osteochondral defects (6- to 12-mm diameter) in a single joint—makes sheep and goats popular choices for testing cartilage repair strategies. 21,31,60,76,108,131 It should be noted that skeletal maturity may not be reached until after 2 years of age in these species and that inclusion of younger animals in any OA model (including those evaluating cartilage repair) is not recommended, as healing capacity may be greater in younger animals. 80,107,122 Additionally, follow-up of at least 6 months, preferably a year, is considered key for definitive studies of cartilage repair, 31 as promising short-term results are often not supported over the longer term (see example of this in a goat meniscectomy model 104 ).
Spontaneous OA is not commonly reported in sheep or goats, 80 although a recent study found histologic and MRI evidence of cartilage degeneration in the hips of young adult sheep (2.5–3 years of age) that were clinically and macroscopically normal. 148 Goats can develop profound joint degeneration secondary to infection with caprine arthritis–encephalitis virus and should test negative for this disease prior to study enrollment. The most frequently used surgical models in both species involve destabilization or removal of a meniscus; forced exercise after surgical intervention has been demonstrated to worsen OA lesions when compared to surgery alone. 7,8,22 In contrast to findings in other species, transection of the ACL alone results in only mild cartilage pathology, 80,87,102 and an attempt to mimic posterolateral instability via transection of the popliteus tendon, lateral collateral ligament, and lateral joint capsule in goats resulted in no macroscopic and minimal histologic lesions. 107 These findings emphasize that biomechanical and anatomic differences do exist between sheep/goats and humans that may limit translation of some experimental results.
Macroscopic and microscopic scoring of OA changes is recommended in sheep and goats. Although a variety of macroscopic scoring systems exist, the OARSI sheep/goat working group recommended use of a 4-point semiquantitative scale for cartilage lesions in its 2010 report. 80 A total joint score is generated by combining separate scores from the medial and lateral tibial plateaus and the medial and lateral femoral condyles. Similarly, osteophytes should be graded by quadrant (0–3; Suppl. Table 2). Synovium should be scored according to the scheme recommended for use in the dog (Suppl. Table 2). 32,80 For microscopic scoring of lesions on coronal sections of the tibial plateaus and femoral condyles, a modified Mankin method was recommended, in part because this is already the most commonly used system in these species. The recommended scoring algorithm for cartilage includes structure (0–10), chondrocyte density (0–4), cell cloning (0–4), toluidine blue staining (0–4), and the tidemark (0–3). 80 An additional grade of 0–5 is assigned for the percentage surface area of the tibial plateau or femoral condyle that is affected, which mimics the “stage” assigned in the OARSI (Pritzker) scoring system (Suppl. Table 1). A separate score is given to microscopic synovial changes (assessing intimal hyperplasia, inflammatory cell infiltration, subintimal fibrosis, and vascularity, each on a 0–3 scale; Suppl. Table 1). This system was shown to be reproducible, but it performed better when applied by experts rather than novices. 80 A recent study raised concerns about the current use of this system when comparing different OA models by demonstrating, for example, that the same global score can be achieved with focal deep cartilage damage and minor matrix degeneration, or by more modest structural damage but extensive cellular changes and proteoglycan loss. 102 As a result, the authors suggested that evaluation of the individual parameters of the modified Mankin system may yield more useful information about pathophysiology of disease than the global score alone. 102 It should be noted that macroscopic scoring systems exist specifically for evaluating cartilage repair in sheep and goats. Five of these were recently compared and were demonstrated to correlate well with MRI findings (9.4 T), although, of the 5, the system proposed by the authors of that report was found to be the most highly correlated to the clinically relevant MRI parameters of “total score” and “defect fill” and to have the best intra- and interobserver reliability. 52
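The concern that one global score can conceal very different lesions can be made concrete with a short Python sketch. The parameter ranges are those listed above for the sheep/goat modified Mankin system; the two lesion profiles are invented for illustration only, and treating the global score as a simple sum of the parameters is an assumption of this sketch rather than a statement of the working group's method.

```python
# Parameter ranges as listed in the text for the sheep/goat modified Mankin system.
SHEEP_GOAT_RANGES = {
    "structure": (0, 10),
    "chondrocyte_density": (0, 4),
    "cell_cloning": (0, 4),
    "toluidine_blue_staining": (0, 4),
    "tidemark": (0, 3),
}


def global_score(profile):
    """Validate a per-parameter profile and return its sum (assumed global score)."""
    if set(profile) != set(SHEEP_GOAT_RANGES):
        raise ValueError("profile must contain exactly the five listed parameters")
    for name, value in profile.items():
        low, high = SHEEP_GOAT_RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name} score {value} is outside {low}-{high}")
    return sum(profile.values())


# Hypothetical profile 1: focal deep structural damage, little cellular change.
focal_deep_damage = {
    "structure": 8, "chondrocyte_density": 1, "cell_cloning": 0,
    "toluidine_blue_staining": 1, "tidemark": 0,
}

# Hypothetical profile 2: modest structural damage, extensive cellular change
# and proteoglycan loss.
diffuse_cellular_change = {
    "structure": 3, "chondrocyte_density": 3, "cell_cloning": 1,
    "toluidine_blue_staining": 3, "tidemark": 0,
}

for label, profile in (("focal deep damage", focal_deep_damage),
                       ("diffuse cellular change", diffuse_cellular_change)):
    print(label, global_score(profile), profile)
```

Both invented profiles sum to the same value although their components differ, which is why reporting the individual parameters alongside any global score preserves information that the total discards.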
Horse
Interest in developing OA models in the horse is driven as much by the clinical importance of the disease in this species as by its utility as a translational model for human disease. Both idiopathic primary OA and posttraumatic OA related to athletic use occur in the horse, and the challenges and expectations that exist regarding early diagnosis and the development of effective treatments that allow return to full function are similar to humans. As with athletic humans, the location of naturally occurring OA in horses often depends on their use; for example, disease is most common in the carpus and metacarpophalangeal joints of racehorses, but it is most often recognized in the stifle joints in Western performance horses. 96
Articular cartilage in the stifle of the horse has been reported to be the most similar of any domestic species to the thickness of human knee cartilage and to have similar cellular structure, biochemical makeup, and biomechanical properties. 2,45,88 The calcified cartilage is easily identified at the time of surgery, allowing the decision to be made to remove or retain this layer depending on the scientific question being asked. 45,95 Along with the ability to perform second-look arthroscopy, serial synovial fluid sampling, and extensive imaging (MRI, CT, radiographs, etc), as well as to incorporate postoperative exercise regimes and rehabilitation protocols, this has made the horse a popular model for evaluating articular cartilage repair modalities 44,135 (see review 95 ). Critically sized defects in this species are reported to be 9 mm, although defects of 12 to 15 mm are most commonly used to closely mimic the large cartilage defects reported in human OA. 2,31,95 It is important to note that the duration of these studies should be at least 8 to 12 months, as failure of repair at long-term follow-up is not uncommon even if short-term results are promising. 95
Interestingly, although the cartilage of the stifle is most similar to that of the human knee, the most frequently published surgical model of OA in the horse involves creation of an osteochondral defect in the middle carpal joint. 43,46,65 A similar, though nonterminal, osteochondral fragment model utilizing the metacarpophalangeal joint was recently reported, 20 and an impact model of the medial femoral condyle has been described. 17 However, the preponderance of experimental data in the carpus led to recommendations for histologic evaluation specific to this joint/model by the OARSI horse working group in its 2010 report. 97 This modified Mankin score evaluates 5 parameters: chondrocyte necrosis (0–4), cluster formation (0–4), fibrillation/fissuring (0–4), focal cell loss (0–4), and safranin O–fast green stain uptake (0–4; Suppl. Table 1). Microscopic scoring of synovial membrane alterations was also recommended for this model, incorporating evaluation of cellular infiltration, vascularity, intimal hyperplasia, subintimal edema, and subintimal fibrosis (each on a 0–4 scale). A different grading scheme was recommended for evaluation of naturally occurring OA lesions in the metacarpophalangeal joint. This included evaluation of osteochondral lesions (0–4), subchondral bone remodeling (0–3), and osteochondral splitting (0–3; Suppl. Table 1). 97 Some concern has been raised about the applicability of a modified Mankin score to the horse in certain circumstances, especially for diffuse lesions; in these cases, the OARSI (Pritzker) scoring system may be preferred. 17 Macroscopic scoring systems for the carpal chip model and naturally occurring metacarpophalangeal joint disease have also been recommended (Supplemental Table 2). 97 It is worth reiterating that substantial effort has been placed into the development of advanced imaging modalities and identification of fluid-based biomarkers of disease in the horse, 43,65,99,140 largely because of the burden of naturally occurring disease in this species. It is likely that as these techniques are refined, they will become more widely used and may replace histologic evaluation, at least for intermediate study time points, to reduce the cost of horse models of OA.
Other Models: Zebrafish, Pig, Cow, and Nonhuman Primates
In addition to those described above, a number of other species have been utilized (or proposed) as animal models of OA. Specific histologic guidelines have not been published for these species to date, but the use of the Mankin and OARSI (Pritzker) scoring systems has been reported in recent studies 56 ; additionally, study-specific schemes have been used. 100 Macroscopic grading of lesions is also widely reported, either alone or in combination with microscopic scoring, typically utilizing a modification of the Outerbridge classification system. 86
Zebrafish
Although zebrafish do not develop synovial joints as mammals do, they have been widely used as models to study skeletogenesis, and it is in this capacity that they have recently been proposed as a potential model for OA. 101,142 Candidate genes associated with OA in human studies have recently been demonstrated to be expressed in developing zebrafish, and a transgenic col10a1 reporter line has been developed that allows easy identification of hypertrophic chondrocytes, which may play a role in the development of OA. 101 Zebrafish will certainly not replace mammalian models of OA but may complement the use of transgenic mice as a screening tool for certain targeted therapeutic compounds.
Pig
Both commercially bred and miniature pigs have been utilized as models of OA, with the former being popular for ex vivo studies because of the availability of abattoir specimens. 139,144 Surgical models in miniature pigs include ACL transection 105 and medial meniscectomy. 57 Anatomically, the stifle of the pig is similar to the human except that the cruciate ligaments are longer and the meniscus is wider (and stiffer). 127,130 Cartilage thickness is also similar, and partial- and full-thickness defects can be created, with a critical defect size of >6 mm. 2 Striking similarities in gastrointestinal physiology and the immune system between pigs and humans make this species an attractive option for testing pharmaceutical and biologic interventions. 57,136 Age at skeletal maturity for miniature pigs has been variably reported as 10 to 12 months 2 or 18 to 24 months. 136
Cow
As for domestic pigs (above), bovine cartilage is most commonly used in ex vivo studies due to the availability of abattoir samples. 35,40,56 Naturally occurring OA is a recognized entity in this species, and lameness is a major cause of culling. In bulls, lesions are noted particularly in the stifle and tarsus, 10,59,119 and the bovine patella has been proposed as a model for investigating structural and mechanical changes in early OA. 56 Cartilage thickness, cellularity, and zonal anatomy of the bovine patella are reportedly very similar to those of the human femoral condyles 56 ; in contrast, when the lateral tibial plateaus of both species are compared, bovine cartilage is thinner and more cellular than that of humans and is made up of thinner superficial and transitional zones but a much thicker deep zone. 116 This demonstrates the importance of species-specific site differences when characterizing changes observed in various OA models. Biomechanically, the bovine meniscus is similar to human but, anatomically and biochemically, is considered a less suitable model than that of the sheep. 130
Nonhuman Primates
A number of nonhuman primates—including cynomolgus macaques, rhesus macaques, and baboons—have been utilized as models of OA. Although experimental models of disease do exist, 85 a major advantage of these species is the development of naturally occurring OA in multiple joints that very closely mimics the onset and progression seen in aging humans. 23,24,55,86,120 Disease severity and age are closely correlated in this model, but the spectrum of disease across ages mimics that seen in humans, with some younger animals affected with moderate disease (particularly young males) and some elderly individuals spared entirely. 55,86 Although the joint biomechanics of nonhuman primates do differ from humans, they are more similar than those of strict quadrupeds. 55,120 Additionally, similarities in reproductive physiology allow for the investigation of the effects of hormones/reproductive status on the development of OA. 16,55,86 However, despite these advantages as a model of OA, significant disadvantages related to cost, availability, ethical issues, and public perception do exist and generally preclude the widespread use of these species.
Ethical Considerations and Reporting Guidelines
Each animal model of OA presented here has inherent advantages and disadvantages. These must be carefully weighed in the context of the specific scientific question being asked when selecting the most appropriate model for a given study. However, regardless of the animal model utilized, similar considerations for ethical animal use, as well as careful observation of reporting guidelines, should be followed. Ethical considerations center around the concept of the 3 R’s: reduce the number of animals used; refine protocols and procedures to minimize suffering and maximize the value of the outcomes; and replace animal studies where a viable alternative approach exists. 75,81,84 It is important to note that reducing the number of animals at the expense of an adequately powered study is not an appropriate application of these guidelines. A realistic estimation of the expected difference between groups is a critical component to producing a viable power calculation while designing an experiment; the use of standardized outcome measures (eg, the recommended histologic grading schemes described above) and reporting guidelines across studies will greatly aid this process. It should not be forgotten that the inclusion of appropriate controls is crucial. The use of a nonoperated contralateral limb as a control is common but not always appropriate given the potential for gait alterations as a result of the intervention in the other limb. A separate sham- or nonoperated control group is generally preferred, although there is some debate as to which of these is more appropriate. 136
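To show why a realistic estimate of the expected between-group difference matters so much, the following Python sketch applies the standard normal-approximation formula for comparing two group means, n per group = 2 (z for 1 − α/2 + z for power)² (SD / difference)². It is a rough planning aid under stated assumptions only: the function name and the example numbers are hypothetical, and histologic scores are ordinal, so a statistician may well recommend a different method for a real study.

```python
from math import ceil
from statistics import NormalDist


def animals_per_group(expected_difference, sd, alpha=0.05, power=0.80):
    """Approximate animals needed per group for a two-sided, two-group comparison.

    Uses the normal approximation for a difference in means, assuming equal
    group sizes and a common standard deviation (both are assumptions).
    """
    if expected_difference <= 0 or sd <= 0:
        raise ValueError("expected_difference and sd must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = NormalDist().inv_cdf(power)          # desired power
    n = 2 * ((z_alpha + z_power) * sd / expected_difference) ** 2
    return ceil(n)


# Hypothetical planning values: SD of 3 score points in both groups.
for difference in (4, 2):
    print(difference, animals_per_group(expected_difference=difference, sd=3))
```

Halving the expected difference roughly quadruples the number of animals required, so an optimistic effect-size estimate can leave a study badly underpowered even when the animal numbers appear to satisfy the goal of reduction.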
When an animal study is being reported, sufficient information should be provided such that an informed reader can understand what was done and can judge the appropriateness and biological relevance of the conclusions reached and so that the experiment can be independently replicated. Unfortunately, a recent survey demonstrated that <60% of published reports of animal research stated an objective and described the number and characteristics of the animal model used. 67 The ARRIVE (Animals in Research: Reporting In Vivo Experiments) guidelines were developed to improve completeness and transparency of reporting and are now recommended by leading journals for submitted manuscripts that involve animal research. 66,81 This checklist includes 20 items, among them a clearly stated objective (or objectives), detailed information about experimental animals (including number, species, strain/breed, sex, age, weight, and genetic background) and housing/husbandry conditions, and a complete description of statistical methods used in analysis. 66 Similar guidelines have also been published by the National Research Council’s Institute for Laboratory Animal Research (available at http://www.nap.edu/catalog.php?record_id=13241). Detailed description of the experimental animals is particularly important for OA research since, as mentioned above, disease outcomes can vary widely among species, age groups, sexes, and genetic backgrounds even when the same experimental approach is selected. 36 Housing/husbandry conditions can also significantly affect disease outcomes; special consideration should be given to reporting diet and exercise, the latter of which may be affected by flooring, individual vs group housing, and other factors, in addition to imposed regimens. Two items in the ARRIVE checklist specifically deal with justification of the selected animal model and recognition of its limitations; thus, consultation of these guidelines during the design phase of an experiment is highly recommended. 81
Conclusions
Animal models of OA have undoubtedly improved our understanding of the pathophysiology of this disease over the past 50 years and will continue to be a valuable tool in the future. Due to the heterogeneous nature of OA, it is not surprising that no single animal model of disease can perfectly recapitulate all aspects of human OA, but the wide variety of models available makes it likely that one or more models can be successfully applied to most relevant questions. In general, smaller animals may be favored for investigations of basic pathophysiology and for early screening of therapeutic interventions, but large animal models will be required for verification of findings prior to moving to human clinical trials. Naturally occurring disease is considered a better model of human primary idiopathic OA, while surgical models of disease may more closely recapitulate posttraumatic OA in humans; both have a valuable place in OA research. Considerations including cost, availability, housing, and length of the experiment need to be weighed against the outcome measures required to address the question being asked. This, along with mindfulness of ethical and reporting guidelines, will help researchers to select the animal model of OA that will maximize the impact of their results. Regardless of the model chosen, adoption of standardized scoring systems for histology and other imaging modalities (within each species, if not universally) should be strongly encouraged within the OA research community. Such standardization does not preclude the parallel use of specialized scoring systems as needed, but it will maximize the utility of each study by allowing for meaningful comparisons across studies.
Footnotes
Acknowledgement
Thanks to Dr Cathy Carlson for providing images used in Figures 3–, as well as for critical reading of the manuscript and helpful suggestions.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
References
Supplementary Material
