Abstract
The use of standard nomenclatures for describing the strains, genes, and proteins of species is vital for the interpretation, archiving, analysis, and recovery of experimental data on the laboratory mouse. At a time when sharing of data and meta-analysis of experimental results is becoming a dominant mode of scientific investigation, failure to respect formal nomenclatures can cause confusion and errors and, in some cases, contribute to poor science. Here, the authors present the basic nomenclature rules for laboratory mice and explain how these rules should be applied to complex genetic manipulations and crosses.
These ambiguities, redundancies, and deficiencies recall those attributed by Dr Franz Kuhn to a certain Chinese encyclopedia called the —
Pathologists are trained not only to use defined medical nomenclature but to use words carefully and concisely. This precept extends to the specific animal species under study: the signalment must be defined accurately, namely the species (and scientific name if appropriate), strain, age, sex, and any other pertinent information that would help the pathologist make an accurate interpretation (diagnosis/phenotype) of the samples under investigation. Many textbooks attempt to standardize pathology nomenclature, as do study groups that focus on organ systems of specific types or groups of diseases (e.g., ref. 5), and there are now many examples of disease and diagnostic glossaries, controlled vocabularies, and thesauri (e.g., refs. 3,15). The recent development of disease and phenotype ontologies 1,2,8–10 has provided a more structured framework for nomenclature that, as well as being amenable to sophisticated computation, requires strict semantic discipline. Linking such ontologies to data capture systems forces pathologists to use fixed nomenclature, which automatically ensures accurate spelling and coding. 12,14
The laboratory mouse has, without question, become the premier animal model used in biomedical research today. Genetic engineering technology, 13 combined with advanced genetic resources such as the Collaborative Cross, 4 has expanded the value of the mouse far beyond its traditional role in toxicology and other basic and applied research applications. Despite this, analyses of these mice remain fundamentally the same as for all other species.
A major problem with virtually all biomedical journals, including
Another more common problem is the use of abbreviations as if they are representative of all allelic mutations available for a particular gene. Many image Web sites illustrate lesions in mice that carry the

Explanation of the various components of the gene symbol for the targeted mutation (tm, also commonly called knockout or null mutation) for the apolipoprotein A-1 gene.
Examples of Different Types of Mutations in the Transformation-Related Protein 53 Gene to Illustrate Various Approaches to Gene Symbols
There are currently 222 allelic mutations reported for this gene in mice (http://www.informatics.jax.org; March 15, 2010). Human gene symbols are all in capital letters. Note in this case the mouse and human symbols are not quite the same. Protein symbols are in capital letters for both species but not in italics. In humans, allele designations are limited to an optimum of 3 characters using only capital letters or Arabic numerals. The allele designation is written on the same line as the gene symbol, separated by an asterisk (e.g.,
Committees to Standardize Nomenclature
There are specific committees set up to standardize the nomenclature for mice (International Committee on Standardized Genetic Nomenclature for Mice and Rats, http://www.informatics.jax.org/mgihome/nomen/gene.shtml#genenom). In 2001, the International Committee on Standardized Nomenclature for Mice and the Rat Genome and Nomenclature Committee agreed to establish a joint set of rules for strain nomenclature, applicable to strains of both species. If this discussion is difficult to follow, there is an online tutorial that provides more detailed information on this subject (http://jaxmice.jax.org/support/nomenclature/tutorial.html). Although many think these are new ideas and rules, they have actually been evolving for a very long time. Earlier versions of the rules, particularly for mice, have been published since 1941, 11 and a history can be found on the Mouse Genome Informatics Web site listed above. One of the major goals initially was to prevent renaming or misnaming of strains and spontaneous mouse mutant locus designations. As mammalian genetics evolved as a discipline, the nomenclature evolved to accommodate the ever increasing complexity of these animals as well as the biotechnology that made these advances possible.
Sources for Assistance for Nomenclature Issues
Nomenclature, both the current gene name and the symbol for the specific allelic mutation under investigation, can be obtained from the vendor’s Web site if mice were purchased (The Jackson Laboratory, http://jaxmice.jax.org/query; Taconic, http://www.taconic.com; Charles River, http://www.criver.com). The nomenclature can be checked against the most current nomenclature standards online (go to http://www.informatics.jax.org, search Genes, then Access Data: Genes and Markers Query, and enter the full name or gene symbol for the mutation). Last, for a potentially novel gene, strain, allele, or construct, the investigator can apply for a proposed symbol by contacting the nomenclature committee directly. If in doubt, the scientists who maintain the Mouse Genome Informatics Web site are available to assist with nomenclature questions (click “User Support” at the bottom of the Mouse Genome Informatics Home Page or go to the page directly: http://www.informatics.jax.org/mgihome/support/mgi_inbox.shtml).
Mouse and rat nomenclature Web sites are listed above. Human gene nomenclature is governed by the Human Genome Organisation Nomenclature Committee. Rules and names are available online (http://www.genenames.org/). These should be used to ensure that the most accurate and current gene/protein names and symbols are being used.
Inbred Strain Designations
For laboratory mice, inbred strain names are all in capital letters. After the strain name, and equally important, are the laboratory (investigator) and institutional codes that designate substrains. For example, NOD/ShiLtJ is a strain designation that reveals that the nonobese diabetic (NOD) inbred mice originated from inbreeding the Cataract Shionogi (CTS) strain (originally outbred Jcl:ICR mice) selected for on the basis of an elevated fasting blood glucose level in cataract-free mice. The holder of the mice at that time, Shionogi, is designated in the nomenclature of the current strain as Shi. NOD substrains, originally available in Japan, were distributed in the early 1980s to Australia and the United States, and eventually breeder pairs were sent to Dr E. Leiter (investigator’s laboratory symbol: Lt) at The Jackson Laboratory (J; Fig. 1; http://jaxmice.jax.org/strain/001976.html, 9 Feb 2010).
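The structure of a designation such as NOD/ShiLtJ can be illustrated with a short Python sketch. It is only a heuristic under stated assumptions: the function name `split_strain_designation` and the regular expression are hypothetical, and the pattern assumes that everything after the slash is an optional lowercase or numeric substrain marker followed by laboratory codes that each begin with a capital letter. Real designations should always be verified against the Mouse Genome Informatics resources cited in this article.

```python
import re

# Heuristic pattern (an assumption for illustration): an optional
# lowercase/digit substrain marker such as "c" or "6", followed by
# laboratory codes that each start with an uppercase letter
# (for example "Shi", "Lt", "J").
_SUBSTRAIN = re.compile(r"^([0-9a-z]*)((?:[A-Z][a-z]*)*)$")


def split_strain_designation(designation):
    """Split 'STRAIN/substrain' into strain, substrain marker, and lab codes."""
    strain, slash, substrain = designation.partition("/")
    if not slash:
        # No slash: only a strain name was given.
        return {"strain": strain, "substrain": "", "lab_codes": []}
    match = _SUBSTRAIN.match(substrain)
    if match is None:
        raise ValueError(f"Unrecognized substrain format: {substrain!r}")
    marker, codes = match.groups()
    return {
        "strain": strain,
        "substrain": marker,
        # Each laboratory code is one capital letter plus trailing lowercase.
        "lab_codes": re.findall(r"[A-Z][a-z]*", codes),
    }


if __name__ == "__main__":
    for name in ("NOD/ShiLtJ", "BALB/cByJ", "C57BL/6J"):
        print(name, split_strain_designation(name))
```

For NOD/ShiLtJ the sketch separates the strain name NOD from the codes Shi, Lt, and J, which mirrors the history of the strain described above.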
Strain Abbreviations
Abbreviations for commonly used strains are also standardized (see http://www.informatics.jax.org/mgihome/nomen/strains.shtml#hybrids) (Table 2). B6 specifically refers to the C57BL/6J strain, although the abbreviation is commonly used in the literature to refer to all of the C57BL/6 substrains, many of which carry unique mutations and therefore can differ from each other. By contrast, BALB/cJ mice are abbreviated C, and BALB/cByJ mice are CBy. Hybrids or incipient congenics, where a mutated gene is being transferred to another strain, are designated with a semicolon between the strain abbreviations, such as B6;129. This segregating background is in sharp contrast to a congenic strain, where the semicolon is replaced by a period to indicate that the congenic procedure has been completed (10 backcrosses, N10, onto the new strain). Such mice are designated B6.129. Although 6 backcrosses (N6; an incipient congenic) are commonly accepted by many journals as adequate, speed congenic technology has clearly shown that they are not. 6
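The difference between N6 and N10 can be made concrete with a simple calculation. The sketch below rests on a textbook assumption that is not taken from this article: the F1 generation is counted as N1, and each further backcross to the recipient strain halves the expected share of unlinked donor genome, giving an expected fraction of (1/2)^N. The function name `expected_donor_fraction` is hypothetical, and the result is only an average; it ignores donor DNA linked to the transferred mutation and the animal-to-animal variation that marker-assisted (speed congenic) screening measures directly.

```python
def expected_donor_fraction(backcross_generation):
    """Expected fraction of unlinked donor genome at generation N.

    Assumes the F1 generation is counted as N1 (50% donor genome) and that
    every additional backcross halves the expected donor contribution.
    """
    if backcross_generation < 1:
        raise ValueError("backcross_generation must be 1 or greater")
    return 0.5 ** backcross_generation


if __name__ == "__main__":
    for n in (1, 6, 10):
        fraction = expected_donor_fraction(n)
        print(f"N{n}: expected donor genome = {fraction:.4%}")
```

Under these assumptions, roughly 1.6% of the unlinked genome is still expected to be of donor origin at N6, compared with about 0.1% at N10.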
Symbols for Various Types of Inbred Mouse Strains and Stocks
See the tutorial for more information (http://jaxmice.jax.org/support/nomenclature/tutorial.html).
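The separator convention described in this section (a semicolon for a segregating or incipient congenic background, a period for a completed congenic) lends itself to a mechanical check. The following Python sketch is illustrative only: the function name `classify_background` and the returned labels are hypothetical, and the input is assumed to be the background abbreviation alone (for example B6;129), without any allele symbol appended.

```python
def classify_background(abbreviation):
    """Classify a background abbreviation by its separator.

    Returns a tuple (kind, first_strain, second_strain). The input is assumed
    to contain only strain abbreviations, with no allele symbol appended.
    """
    if ";" in abbreviation:
        # Semicolon: hybrid or incipient congenic (segregating background).
        first, second = abbreviation.split(";", 1)
        return ("mixed", first, second)
    if "." in abbreviation:
        # Period: congenic; the recipient strain is listed first.
        first, second = abbreviation.split(".", 1)
        return ("congenic", first, second)
    # No separator: a single strain abbreviation.
    return ("single", abbreviation, None)


if __name__ == "__main__":
    for name in ("B6;129", "B6.129", "B6"):
        print(name, classify_background(name))
```

A check of this kind could flag manuscripts or database records in which a mixed background is reported as though it were a completed congenic.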
Mutant Gene Symbols
Historically, gene symbols for mouse mutant loci were written with the first letter capitalized and subsequent letters in lowercase for dominant or semidominant mutations, whereas symbols for recessive mutations were written in all lowercase. This convention persists for the spontaneous allelic mutations that are now listed as superscripts after the known gene symbol. Mouse gene symbols have the first letter capitalized followed by lowercase letters. Human gene symbols have all letters in uppercase, which makes them easy to distinguish from mouse gene symbols. Gene symbols are written in italics. For example, the mouse hairless and rhino mutations are on 2 different inbred strains and are written as HRS/J-
Specific nomenclature can be found on the Mouse Genome Informatics Web site, or curators can be contacted directly. Nomenclature standards change over time, as do many of the gene symbols. The nomenclature committees and respective Web sites maintain the historical synonyms, so it is possible to figure out which allelic mutation was studied historically relative to current studies if the original work was annotated correctly. Strict adherence to these nomenclature standards will allow work to be fairly compared and, more important, reviewed accurately.
Mouse genetic nomenclature can get very complicated for some of the highly specialized stocks of genetically engineered mice. Help should be solicited for correctly designating these lines. However, basic knowledge of the difference between strains and mutations needs to be understood.
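The capitalization conventions for gene symbols described in this section can also be checked mechanically. The Python sketch below is a rough illustration, not a validator: the function name `symbol_convention` and its labels are hypothetical, it looks only at letter case (it cannot see italics), and it does not confirm that a symbol is actually approved. The examples Trp53 (mouse) and TP53 (human) are the symbols for the transformation-related protein 53 gene discussed above.

```python
def symbol_convention(symbol):
    """Report which capitalization convention a gene symbol follows.

    Only letter case is examined; digits and punctuation are ignored.
    """
    letters = [ch for ch in symbol if ch.isalpha()]
    if not letters:
        return "unknown"
    if all(ch.isupper() for ch in letters):
        # All capitals: human gene symbol style. A one-letter uppercase
        # symbol also lands here, a limitation of this simple check.
        return "human-style"
    if all(ch.islower() for ch in letters):
        # All lowercase: historical style for recessive mutant symbols.
        return "lowercase-style"
    if letters[0].isupper() and all(ch.islower() for ch in letters[1:]):
        # Initial capital followed by lowercase: mouse gene symbol style.
        return "mouse-style"
    return "unknown"


if __name__ == "__main__":
    for symbol in ("Trp53", "TP53", "Apoa1", "hr"):
        print(symbol, symbol_convention(symbol))
```

A screen like this can catch a human-style symbol used for a mouse gene before submission, but the authoritative symbol still has to be confirmed on the Mouse Genome Informatics or Human Genome Organisation Web sites.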
Nomenclature for Specific Genetic Tools
Historically, inbred strains have been brother/sister mated in a carefully controlled manner for 20 generations. The new Collaborative Cross mice are more complicated, and 25 or more generations may be needed to inbreed them adequately. 4
Congenic mice, created by moving a single mutated gene from 1 strain to another, also involve 20 crosses but are more complicated. In this case, the strain carrying the mutation is crossed with another strain, and their F1 offspring are intercrossed to produce F2 mice, 25% of which are homozygous for the mutant gene (if recessive). These mutants are then crossed back to the recipient strain, and the cycle is repeated for a total of 10 times. At the end of this process, the mutation should be on a background very close to that of the recipient inbred strain. The nomenclature for congenic strains lists first the inbred strain onto which the mutation was moved, followed by a period and either the strain from which it was moved or simply Cg for congenic, followed by the allelic mutation that was moved (Table 2). For example, CByJ.Cg-
Conclusions
At a time when biological data are increasingly archived, searched, recovered, and analyzed computationally, adherence to standard forms of nomenclature is vital. Computers are dumb; unlike journal editors, they are unable to deal with the nuances and ambiguities of natural language, and failure to make compliance with standards second nature generates more work, limits the utility of publications and database submissions, and, as shown above, can produce bad science.
Footnotes
The authors declared that they had no conflicts of interests with respect to their authorship or the publication of this article.
This work was supported by the European Commission (contract number LSHG-CT-2006-037811; CASIMIR, to PNS) and the National Institutes of Health (CA089713, RR17436 to JPS).
