Abstract
Neurobehavioral and pathological evaluations of the nervous system are complementary components of basic research and toxicity testing of pharmaceutical and environmental chemicals. While neuropathological assessments provide insight as to cellular changes in neurons, behavioral and physiological methods evaluate the functional consequences of disruption of neuronal communications. The underlying causes of certain behavioral alterations may be understood, but many do not have known direct associations with specific brain pathologies. In some cases, however, rapidly expanding mouse models (transgenic, knock-out) are providing considerable information on behavioral phenotypes of altered pathology. Behavior represents the integrated sum of activities mediated by the nervous system, and functional tests used for neurotoxicity testing tap different behavioral repertoires. These tests have an advantage over pathologic measures in that they permit repeated evaluation of a single animal over time to determine the onset, progression, duration, and reversibility of a neurotoxic injury. Functional assays range from a screening-level battery of tests to refined procedures to tap specific forms of learning and/or memory. This article reviews common procedures for behavioral toxicity testing and provides examples of chemical-specific neurobehavioral-pathological correlations in order to inform interpretation and integration of neuropathological and behavioral outcomes.
Introduction
The field of neurotoxicology has emerged from the integration of toxicology, pharmacology, psychopharmacology, and experimental psychology and involves the study of changes in the function and/or structure of the nervous system as a result of chemical exposures or other environmental influences, and an interpretation of the consequences and adversity of those changes. The nervous system is one of the most complex organs in the body, encompassing different cell types (e.g., neuron, glia), anatomy (central, peripheral), structural characteristics (e.g., size, spatial configurations), synaptic functions (inhibitory, excitatory), and neurotransmitters (e.g., acetylcholine, GABA) and possessing high energy requirements (ion channels, action potentials). The specialized metabolic and physiological features of the nervous system convey unique vulnerabilities to toxic compounds that may act on multiple sites in different ways (Moser et al. 2008).
Early on, histopathological changes were considered the “gold standard” that defined the field of neurotoxicology. The realization that toxicants can also alter nervous system function (for example, behavior) in profound and varied ways emphasized the need for other types of evaluations. Behavior in the broadest sense is multifaceted and the result of integration at multiple levels. Behavioral effects may be a reflection of changes in nerve cell communication and integration as well as the morphological alterations that may be measured by the pathologist. Given that behavior represents the integration and integrity of the nervous system, it is generally considered a sensitive indicator, and perhaps the ultimate assay, of neuronal function (Kulig et al. 1996; Norton 1978; Tilson 1990; Tilson and Mitchell 1984; Whishaw, Haun, and Kolb 1999). While the term CNS pharmacology or toxicology often refers to “higher” central nervous system (CNS) functions, for example, cognitive processes or electrical brain activity, it is important to also consider the peripheral nervous system (PNS) and “lower” CNS function, for example, reflexes, innate responses, and behaviors that are critical for survival.
History of Behavioral Testing in Pharmacology and Toxicology
For the pharmaceutical and chemical (especially pesticide) industries, the testing of chemicals for nervous system effects evolved on somewhat parallel tracks. The systematic evaluation of mice to determine CNS side effects of drugs typically employed a series of observations and manipulations that became known as the Irwin screen (Irwin 1962, 1968). Its use became ingrained in the pharmaceutical industry. In early chronic toxicity studies, cage-side observations of behavior became a method of detecting early signs of toxicity, although their use for the study of behavior itself was not widespread (Arnold et al. 1977; Barnes and Denz 1954; Boyd 1959; Fox 1977; Ruffin 1963). Following the recommendation of several expert panels (e.g., National Academy of Sciences 1975) and government and academic scientists (e.g., Brimblecombe 1979; Mitchell and Tilson 1982), the U.S. Environmental Protection Agency (U.S. EPA) developed and published guidelines for several behavioral tests, including a series of tests loosely based on the Irwin screen, called the functional observational battery (FOB) (Sette 1989). These were eventually codified in the U.S. EPA Human Health 870 Series Test Guidelines (U.S. EPA 1998c) and were harmonized with several test guidelines of the Organization for Economic Cooperation and Development (OECD; 1995, 1997). Similar testing approaches were recommended for food chemicals in the Food and Drug Administration “Red Book” (Sobotka et al. 1996), and behavioral testing of potential drugs is now required by the International Conference on Harmonisation (ICH) in the current S7A guidelines (ICH 2000).
On yet a third track, other batteries of behavioral tests have been developed in academia for use in describing nervous system alterations in genetically manipulated mice. In such studies, behavior serves as a marker for specific genetic alterations, and in addition, transgenic approaches can be used to study the genetic bases of behavior (normal and abnormal). The need for standardized approaches to describe these differential profiles has been espoused, and several such batteries are currently in use (Van der Staay and Steckler 2001, 2002). While these developments have taken place in parallel with the increasing use of behavioral assays in pharmacology and toxicology, there has been little overlap. Mouse models can, however, provide valuable information on the neurobiological basis for specific behaviors and, thus, inform interpretation of behavioral alterations (Anagnostopoulos et al. 2001; Sousa, Almeida, and Wotjak 2006).
Testing Authorities
Preclinical testing of drug candidates is currently guided by the harmonized test guidelines, known as ICH S7A (ICH 2000). The first tier of preclinical testing is considered core battery studies and includes specifically motor activity, behavioral changes, coordination, sensory/motor reflex responses, and body temperature. Both FOB and Irwin protocols are applicable at this level of testing. Follow-up studies are indicated based on the outcomes of the core studies and can include studies of learning and memory, pharmacological challenges, neurochemistry, electrophysiological tests, and others. With chemicals for which much is already known, these guidelines allow the tests to be tailored for efficiency.
Testing of pesticides and some other environmental chemicals is guided by the EPA section 870 guidelines (U.S. EPA 1998c) or OECD guidelines (1995, 1997). For both, FOB (or modifications thereof) and motor activity are considered basic tests to include in acute and repeated dosing as well as developmental neurotoxicity studies (OECD 2007; U.S. EPA 1998b). Compared to safety pharmacology testing described above, there is less flexibility in modifications or selections of tests. Instrumental tests of learning and memory as well as the startle response are included only in the developmental neurotoxicity test guidelines (U.S. EPA 1998b).
Behavioral Tests
Screening Batteries
Screening, or first-tier testing, typically consists of simple or quick tests of behavior that may be used to identify whether a chemical acts on the nervous system, and at what dose levels, whereas second-tier testing involves more complex tests that provide a more complete description of the effects and dose-response relationships (Tilson 1993). Innate, or reflex, behaviors (e.g., locomotor activity, sensory function) provide a broad assessment of neurological function and may be evaluated at both levels of assessment. However, these behaviors vary in the degree of specificity and therefore may be difficult to interpret. Furthermore, since some behaviors are not under control of the experimenter, the data may be more variable. On the other hand, learned, or conditioned, behaviors require training of the test subject, are focused on specific aspects of behavior (e.g., short- vs. long-term memory), and are usually only utilized in second-tier studies since they are more time- and resource-intensive. Since the behavior is under procedural control, and since experimental variables may be altered to increase the specificity of the test, the data may be less variable and easier to interpret. Examples of tests used at each level are presented in Table 1; however, these tests and endpoints are only a sampling of the variety of potential behavioral tests.
Table 1. Types of behavioral tests often used at the first and second tiers of neurotoxicity evaluations. The types of tests and endpoints are examples only and are not inclusive of all potential tests that could be used.
One goal of screening for neurotoxicity using an FOB protocol is to cast a wide net to detect any potential nervous system effects, especially when dealing with a chemical for which little or no such information exists. For regulatory testing, the FOB provides information on effects at low doses: such data are needed to determine the most sensitive endpoint across a range of toxicity tests including systemic, developmental, reproductive, respiratory, immune, and carcinogenicity endpoints. The U.S. EPA FOB includes numerous evaluations of motor, sensory, autonomic, and integrative neurological functions. It is important to note that there is no one single FOB protocol, but rather the guidelines describe more general aspects and experimental tests that should be included. Over the years several protocols for behavioral assessments have been published (e.g., Kulig 1996; Mattsson, Spencer, and Albee 1996; McDaniel and Moser 1993; Moscardo et al. 2007; Moser et al. 1988; O’Donoghue 1989, 1996), and each testing laboratory generally uses its own version. In brief, observations take place in the home cage and an open field arena, during which time the subject’s movements, physical appearance, and reactions to various stimuli are evaluated. Often also included are manipulations including grip strength (Meyer et al. 1979) and landing foot splay (Edwards and Parker 1977). An advantage to this type of screening is that a single animal may be repeatedly assessed to determine the onset, progression, duration, and reversibility of a neurotoxic injury. Over the years the FOB has been used across many laboratories to characterize effects of a variety of chemicals, including pesticides, organometals, solvents, industrial chemicals, water contaminants, and pharmaceuticals (Moser 2010). The results show that the tests are sensitive to chemicals acting by very different modes of action and that produce vastly different toxic syndromes.
The behaviors measured in FOB procedures may be sorted by domains of neurological functions, although there is often overlap since many tests tap more than one domain and these domains do not necessarily map to specific regions of the nervous system (McDaniel and Moser 1993; Moser 1991). Some autonomic endpoints may be assessed by observation (e.g., salivation, lacrimation, pupil size), while others (e.g., respiration, heart rate) may require specific instrumentation or telemetry. Neuromuscular ability and coordination may be measured using any number of tests ranging from evaluations of gait and posture, to palpations of muscle tone or extensor strength, to instrumental tests of hindlimb and forelimb grip strength, to monitoring of righting or proprioceptive responses. Most sensory testing available for use in first-tier screening involves either testing simple reflexes (e.g., grasping, pinna reflex) or evaluation of the motor response to a variety of sensory stimuli (e.g., auditory, nociceptive, somatosensory). Activity and/or reactivity measures may be automated (see below), or observed as part of the open-field evaluations (e.g., rearing, arousal). Other behavioral evaluations including clinical signs of tremor, convulsions, or other motor abnormalities are an important aspect of neurological testing. Comparisons of chemical effects on multiple FOB endpoints and across functional domains aid in data interpretation.
Many measures are common to the Irwin battery and FOB protocols, which reflects the original derivation of the FOB. Irwin (1968) also described effects in terms of functional domains, with overall categories of behavioral, neurologic, and autonomic and subdomains within each. Behavioral and excitation domains are evaluated in terms of activity levels and abnormal or unusual behaviors, including tremors and convulsions. Additional endpoints of activity, reactivity, and responses are considered part of a motor-affective domain, whereas other responses (e.g., visual placing, corneal response) fall into a sensorimotor domain. Neurologic endpoints include posture, muscle tone and strength, and other neuromotor functions; gait and equilibrium evaluations are considered separately. Autonomic measures include pupil response, salivation, lacrimation, and others. In general, this approach allows detection of effects at high doses to give a complete picture of nervous system side effects, which subsequently affects labeling and the potential patient population.
While a distinction is still often drawn between pharmaceutical companies using the Irwin test and chemical testing laboratories using an FOB or later iterations thereof (often referred to as expanded clinical observations; Ross 2000; Ross, Mattsson, and Fix 1998), in reality these batteries overlap and to some extent are interchangeable. A recent survey of more than one hundred testing laboratories conducting safety pharmacology studies revealed that about 64% of respondents utilized some form of an FOB, and 75% used an Irwin protocol (Lindgren et al. 2008). Unlike the Irwin screen, which is used almost exclusively with mice, modifications of the FOB approach have been developed for rats and mice (reviewed in Moser 2000; Tilson and Moser 1992), as well as nonrodent species of laboratory animals, including the rabbit (Hurley et al. 1995; Takahashi, Kakinuma, and Futagawa 1994), dog (Gad and Gad 2003), guinea pig (Hulet, McDonough, and Shih 2002), and nonhuman primate (Gauvin and Baird 2008; O’Keeffe and Lifshitz 1989).
For transgenic mouse studies, Crawley proposed a standardized behavioral test battery based on paradigms already described in the literature and included neurological and psychological measures (Crawley 1999, 2003; Crawley and Paylor 1997). An initial evaluation included neurological reflexes and motor and sensory functions. These were followed with additional, often more rigorous, behavioral tests that were geared to address specific hypotheses and to model human conditions (e.g., behaviors related to anxiety, depression, schizophrenia). Another battery proposed for use as a systematic, objective protocol for phenotype analysis of mouse behavior is termed SHIRPA (Rogers et al. 1997, 2001). The test consisted of primary (observational battery), secondary (objective measures, e.g., motor activity, analgesia), and tertiary (characterization studies of anxiety, cognition, sensory, electrophysiology) endpoints. As described for the FOB and Irwin tests, the outcomes can be separated to provide profiles of neurological function.
For all these screening batteries, clearly defined protocols are critical to good experimentation. Subjective evaluations that are specified as distinct rating scales introduce a semiquantitative aspect to the data and provide more specific information regarding their distribution across treatment groups. This is often a more sensitive approach than simply listing behaviors as “normal” or “abnormal,” in which case it is critical to have working definitions of what is “normal.” Likewise, detailed descriptions of behaviors are more informative than colloquial or nonspecific terms. Personnel conducting these tests must be “blind,” or unaware of the subject’s treatment, so as not to introduce bias, however unintentional, into the data. Training is also extremely important, including careful study of the protocols and considerable practice. Observers must be taught the basics of experimental design and good laboratory techniques, understand normal behavioral repertoires and factors that can alter behaviors, be at ease handling laboratory animals, and be certified with tests using positive controls (Slikker et al. 2005).
Motor Activity and Function
Spontaneous locomotor activity is an apical test of neuronal function, representing the peak of neural integration, and has been used for decades to evaluate effects of chemical and physical treatments (Tilson and Mitchell 1984). Measurements that are made in automated systems provide objective and quantitative data, and are required by the U.S. EPA test guidelines (U.S. EPA 1998c). There are many automated chambers commercially available, and detection systems include photocell-based, field sensing, mechanical, or electronic/video tracking (Reiter and MacPhail 1982). Furthermore, the shape and size of these chambers range from polycarbonate cages to open fields to circular alleys to figure-eight forms. While it could be assumed that the type of chamber used could influence experimental data, a comparison of chemical effects and historical control data across laboratories using several different systems showed excellent comparability (Crofton, Howard, et al. 1991). Motor activity may be considered as a stand-alone test, but for purposes of screening it is often conducted in the same animals that undergo observational and functional procedures. In addition to overall levels of activity during a test session, specific features such as habituation (decreasing activity levels during a session), spatial distribution (location within the chamber), or ontogeny of activity have been used to evaluate more subtle chemical effects. Despite its advantages as a sensitive measure of nervous system effects, changes in motor activity cannot be attributed to a specific neuronal substrate.
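As an illustration of how within-session habituation might be summarized from automated activity counts, the following minimal Python sketch compares activity in the final interval of a session with that in the first. The function name, the interval structure, the example counts, and the ratio itself are assumptions made for illustration; they are not prescribed by the test guidelines or taken from the studies cited above.

```python
from typing import Sequence


def habituation_ratio(interval_counts: Sequence[float]) -> float:
    """Return last-interval activity divided by first-interval activity.

    Values below 1 indicate that activity declined over the session
    (habituation); values near or above 1 indicate little or no decline.
    This index is a hypothetical summary statistic, not a guideline endpoint.
    """
    if len(interval_counts) < 2:
        raise ValueError("need at least two intervals to assess habituation")
    first, last = interval_counts[0], interval_counts[-1]
    if first <= 0:
        raise ValueError("first-interval count must be positive")
    return last / first


# Hypothetical photocell counts for six consecutive intervals of one session.
counts = [200, 150, 110, 80, 60, 50]
total_activity = sum(counts)       # overall activity level for the session
ratio = habituation_ratio(counts)  # 50 / 200 = 0.25
```

Reporting the session total alongside an index of this kind keeps overall activity level and its within-session decline as separate descriptors, so a treatment that changes one but not the other can be distinguished.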
Neuromotor function may be measured in terms of motor activity, but other aspects should also be considered, including coordination, equilibrium, and strength. Some of these tests have been validated by their ability to differentiate chemicals acting through different mechanisms or by their correlations with pathological or biochemical alterations (e.g., Harry et al. 1998; Jolicoeur et al. 1979; Šedý et al. 2008; Youssef and Santi 1997). For example, grip strength (Meyer et al. 1979) is sensitive to CNS depression (Nevins, Nash, and Beardsley 1993), spinal or peripheral pathology (Bertelli and Mira 1995; Moser et al. 1992, 2004; Nichols et al. 2005), neuromuscular junction dysfunction (Crofton, Dean, et al. 1991), and nonspecific factors (Maurissen et al. 2003). Jolicoeur et al. (1979) differentiated chemical treatments using a variety of neurobehavioral tests: motor activity, catalepsy, rigidity, landing foot splay, gait analysis, and reflexive responses. The profiles of effect were specific for the different chemical treatments, even though all produced ataxia, albeit through different mechanisms. Quantitative assessment of gait, which requires smooth integration of both central and peripheral neurons, is sensitive to a range of chemically and physically induced alterations. A description and measurements of usable gait parameters for rats have been presented (Hruska, Kennedy, and Silbergeld 1979).
Many of these motor function tests, for example, landing foot splay and grip strength, are incorporated at a screening level in an FOB and/or the Irwin battery. Another widely used motor test is the rotarod, which is an automated test that evaluates the subject’s ability to maintain balance on a rod or barrel that is rotating at either a fixed or gradually increasing speed (Dunham and Miya 1957; Jones and Roberts 1968). It has been widely used in both rats and mice to assess effects of drugs and neurotoxicants (Gerald and Gupta 1977; Gilbert and Maurissen 1982) and brain injury (Hamm et al. 1994), or as part of behavioral profiling of transgenic mice (Crawley 1999). This test is deceptively simple, however, and may be confounded by subjects that refuse or are unable to perform the task even under control conditions. In addition, there is a strong learning component with repeated testing, which can confound the data.
Additional Behavioral Tests
Tests of learning and memory, which measure a change in behavior as a result of experience, may be considered as follow-up to positive findings in initial tests (ICH 2000), or as a required element in the U.S. EPA test guidelines for developmental neurotoxicity (U.S. EPA 1998b). Cognitive evaluations have been an integral part of psychopharmacology and psychological research, and an almost limitless number of procedures have been developed. Such paradigms include spatial or positional navigation, simple or complex conditioned responses, and operant training of positively or negatively reinforced behaviors. The procedure used determines the type of cognition that is studied; for example, learning may be evaluated with repeated training trials to measure acquisition, and memory may be repeatedly assessed across time to evaluate retention. It is important to realize that cognition is only inferred in animal models. Changes in motor or sensory function, or motivation, for example, may impact performance and must be taken into account in data interpretation (Cahill, McGaugh, and Weinberger 2001). While cognitive tests can be exquisitely sensitive and detect subtle treatment effects, comparisons between some tests and neurobehavioral screening outcomes have not supported the opinion that they are always more sensitive (Moser and MacPhail 1990; Moser et al. 2000).
Despite the wide options, there are relatively few cognitive procedures currently used in regulatory testing (Lochry, Johnson, and Weir 1994; Peele and Vincent 1989). The training required for assessing cognition ranges from several trials within a single day to daily testing for months. Since food or water restriction takes additional time, many of the current procedures are either water-based or shock-motivated. A popular test is simple conditioned avoidance of shock, for example, passive avoidance, which may require one or only a few trials to develop the association. Water mazes are tests of spatial or positional navigation, and include mazes, for example, Y, Biel, or Cincinnati, that require the subject to learn which way to turn at each decision point (e.g., Biel 1940; Vorhees 1987, 1997). A different type of water maze is that originally described by Morris (1981), in which the subject is required to use extramaze cues to learn the location of a hidden platform. A survey of laboratories conducting safety pharmacology tests indicated that most laboratories included cognitive testing and that simple water maze and passive avoidance tasks were the most commonly used tasks (Lochry, Johnson, and Weir 1994). Since much of the current cognitive testing is conducted as part of developmental neurotoxicity assessments, it is important to realize potential differences in testing young (weanling) rodents in many of these tasks as compared to adults (Ehman and Moser 2006).
Much of what is known about the underlying neural substrates for cognitive behavior has come from pharmacological and lesion studies. Hippocampal function, for example, is considered a major influence in spatial performance, as well as alternation learning (D’Hooge and DeDeyn 2001; Lalonde 2002; O’Keefe and Dostrovsky 1971); however, much of this is influenced by complex interactions amongst limbic system components and should not be simplistically interpreted (Calton and Taube 2009). Development of cognitive function tracks that of the cholinergic system, although there is an integrated influence of several neurotransmitter systems (Ehman and Moser 2006). A recent meta-analysis of reports of four common tasks (water maze, radial-arm maze, passive avoidance, spontaneous alternation) demonstrated a lack of direct correlations between changes in task performance and interference of specific neurochemical systems, and indeed the data indicated considerable interactions between them (Myhrer 2003).
The automated startle response is an extension of the sensorimotor responses measured at the screening level. Instead of crude stimuli, there is control over specific parameters of the stimulus. Most commonly used is the acoustic response, which typically measures the force of the motor response following a suprathreshold auditory stimulus. Air puff, visual, and somatosensory stimuli may also be used. Latency and magnitude are the dependent variables, and trials range from a few to many. The response typically habituates with repeated stimulus exposure. The neural circuitry of this relatively simple response has been demonstrated, making it one of the few behaviors for which the physiological basis is known (Davis et al. 1982; Koch 1999; Yeomans et al. 2002). The proposed pathway contains few synapses, consisting of the auditory nerve, ventral cochlear nucleus, nuclei of the lateral lemniscus, nucleus reticularis pontis caudalis, and spinal motor neurons (Davis et al. 1982). Variations of the test include prepulse inhibition or reflex modification, which evaluates the attenuation of the startle response produced by prior presentation of a subthreshold stimulus. In the sense that this attenuation represents gating and processing of the stimulus, this procedure has been considered an animal model of schizophrenia (Light and Braff 1999). By varying the intensity of the prestimulus tone, auditory threshold can also be established, which has been useful in detecting, for example, midfrequency hearing loss due to solvent exposure (Crofton 1990).
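Prepulse inhibition is commonly expressed as the percentage reduction in startle magnitude on prepulse trials relative to startle-alone trials. A minimal Python sketch of that calculation follows; the function name and the example amplitudes are illustrative assumptions rather than values from the cited studies.

```python
def percent_ppi(startle_alone: float, prepulse_plus_startle: float) -> float:
    """Percent prepulse inhibition from two mean startle magnitudes.

    startle_alone: mean response magnitude on trials with the startle
        stimulus only.
    prepulse_plus_startle: mean response magnitude on trials in which a
        subthreshold prepulse precedes the startle stimulus.
    A positive result means the prepulse attenuated the response.
    """
    if startle_alone <= 0:
        raise ValueError("startle-alone magnitude must be positive")
    return 100.0 * (startle_alone - prepulse_plus_startle) / startle_alone


# Hypothetical mean amplitudes in arbitrary force units.
ppi = percent_ppi(startle_alone=400.0, prepulse_plus_startle=100.0)  # 75.0
```

Normalizing to the startle-alone magnitude allows inhibition to be compared between animals or treatment groups whose baseline startle amplitudes differ.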
Scientists as well as the general public have realized the importance of protecting the developing fetus and child from adverse outcomes of chemical exposures. Over the past several decades, expert panels have repeatedly supported evaluation of effects on the developing nervous system to fully understand the toxicological profile of chemicals (e.g., National Academy of Sciences 1993). In response to this situation, the U.S. EPA and OECD have developed standardized test guidelines to evaluate effects of chemical exposure during gestation and lactation (OECD 2007; U.S. EPA 1998b). The behaviors that are measured at weaning and in adulthood include (but are not limited to) motor activity, reflexes and responses, learning and memory, and automated startle response; and in addition neuropathology and morphometrics are included. Similar tests are suggested by the U.S. Food and Drug Administration (U.S. FDA; 2006) for testing of drugs meant for pediatric populations. While many of the behaviors used are the same as those described above and often used in adult rats, additional features in developmental studies include evaluating the ontogeny of these behaviors or whether there are persistent behavioral changes lasting into adulthood.
Neuropathological Alterations
Cerebellar
The cerebellum plays a major role in motor integration and coordination, and thus cerebellar dysfunction often presents as ataxia and motor incoordination (Morton and Bastian 2004). Loss of cerebellar granule cells in cystatin B-deficient mice manifests as ataxia and myoclonic seizures (Pennacchio et al. 1998). Cerebellar lesions result in tremor; abnormal posture; and neuromotor deficits, for example, failed balance beam performance; interestingly, many of these functions recover in the days after surgically induced cerebellar lesions (Modianos and Pfaff 1976). In addition to motor aspects of behavior, intact cerebellar function is necessary for cognition, including performance in tests of spatial learning and classical conditioning (Lalonde and Strazielle 2003; Marien, Engelborghs, and DeDeyn 2001; Stanton 2000). A mutant mouse model (“Lurcher” mouse), which displays loss of Purkinje cells, olivary neurons, and granule cells, shows alterations in cognitive function (exploration and spatial learning) and emotion, as well as motor skills (Vogel et al. 2007).
3-Acetyl pyridine (3AP) is a toxicant that destroys the inferior olivary nuclei with a single dose (Balaban 1985; Desclin and Escubi 1974) and has been used as a model for cerebellar ataxia (Butterworth et al. 1978; Jolicoeur et al. 1979). In a comparative study, neuromuscular toxicity was detected using several behavioral tests: the FOB, motor activity, and schedule-controlled operant responding (Moser and MacPhail 1990). 3AP produced ataxia, altered equilibrium, decreased grip strength, lower motor activity, and slower operant responding. The FOB measures, particularly gait changes, were the most sensitive in terms of effective doses. Furthermore, recovery of motor activity and operant responding was evident over several weeks, but through observations it was clear that the rats continued to show neuromuscular changes. A similar profile of 3AP effects on quantitative aspects of gait, motor activity, foot splay, and several reflexes was reported by Jolicoeur et al. (1979), with recovery only on the motor activity test.
Hippocampal
The hippocampus is most implicated in development and maintenance of spatial learning and memory, as well as exploratory behavior (D’Hooge and DeDeyn 2001; Lalonde 2002; O’Keefe and Dostrovsky 1971). There are, however, complex interactions amongst the other components of the limbic system (Calton and Taube 2009). Cognitive tests often used to assess hippocampal function include spontaneous alternation, fear conditioning, Morris water maze, and radial-arm maze (Gerlai 2001). Transgenic mice overexpressing hippocampal interleukin-1beta show cognitive impairment in spatial, but not other, components of a Morris water maze task (Moore et al. 2009).
Trimethyl tin (TMT) produces damage in the hippocampus, pyriform cortex, amygdala, and neocortex (Aldridge et al. 1981; Brown et al. 1979; Koczyk 1996). Behavioral sequelae include limbic dysfunction, hyperactivity, and impaired learning (Koczyk 1996; McMillan and Wenger 1985; Perretta, Righi, and Gozzo 1993; Reiter and Ruppert 1984). Evaluations using the FOB and motor activity produced a similar profile of effects, including increased activity levels and reactivity that lasted throughout testing (up to 42 d), and neuromuscular endpoints (e.g., gait, righting) that showed functional recovery over a shorter period of time (Moser 1996). In another study, rats were evaluated with an FOB, motor activity, Cincinnati water maze, passive avoidance, and acoustic startle (reported in Moser et al. 2000). Effects on measures of the FOB were most sensitive and at the lowest dose included changes in muscle tone and equilibrium. The intermediate dose affected the acoustic startle and performance in the water maze. Changes in motor activity and passive avoidance occurred only at the highest dose, and this was also the only dose that produced observable neuropathological lesions and some lethality.
Nigrostriatal
The corpus striatum area of the basal ganglia is also involved in motor function, and the loss of striatal dopaminergic neuronal function produces Parkinsonism in humans. Animal models of this disease have been developed using chemicals such as 6-hydroxydopamine (6-OHDA) or manganese, both of which deplete dopamine. In one study, FOB endpoints detected impairment produced by manganese exposure but not unilateral intrastriatal injection of 6-OHDA, whereas a combination of the treatments exacerbated the manganese-induced functional effects (Witholt, Gwiazda, and Smith 2000). On the other hand, bilateral 6-OHDA administration caused severe disruption of the use of limbs, muscle tone, and righting ability (Marshall, Richardson, and Teitelbaum 1974). Mouse models of nigrostriatal neuronal loss (“weaver mice”) showed considerable motor dysfunction (Triarhou, Norton, and Hingtgen 1995), and various mechanisms resulting in injury to the nigrostriatal system produce models of Parkinsonism that can be described with various behavioral tasks (Meredith and Kang 2006).
Axonal
Axonopathies (sensory and motor) of both the CNS (e.g., spinal axons) and PNS produce a clinical condition wherein the longest and most distal processes are affected first (e.g., “stocking and glove” distribution), with damage then progressing toward more proximal sites. A variety of chemicals produce a pattern of peripheral neuropathy in laboratory animals and humans, although the relative sensitivity of motor and sensory axons impacts their specific profiles. Behavioral changes are usually observed as neuromotor (primarily) and sensory dysfunction (Spencer and Schaumburg 1984; Sterman 1984).
In a study of carbon disulfide inhalation in rats, comparisons of functional, biochemical, electrophysiological, and neuropathological endpoints illustrated their relative sensitivity, time course, and dose response (Harry et al. 1988; Moser et al. 1998). Gait changes were evident early on and became more severe and occurred at lower concentrations as exposures continued up to thirteen weeks. The hindlimbs were preferentially affected in motor function tests (grip strength, limb placement); and higher concentrations produced more generalized signs of ataxia, tremors, and changes in reactivity. Neuropathological changes correlated with decreased nerve conduction velocity (Herr et al. 1998), but both measures were less sensitive than behavior. Comparisons of behavioral effects and neuropathology in carbon disulfide-exposed mice presented the same conclusions (Sills et al. 2000). A similar profile is presented by 2,5-hexanedione, the toxic metabolite of n-hexane, and indeed the toxic actions of these chemicals are very similar (Sterman and Sheppard 1982).
Acrylamide is a well-studied neurotoxicant that produces peripheral motor as well as sensory alterations. Its effects on behavioral endpoints have often been used for positive control and demonstrative purposes and include altered gait and righting, decreased hindlimb grip strength, increased landing foot splay, and decreased rearing (e.g., Broxup et al. 1989; Jolicoeur et al. 1979; Moser et al. 1992; Schulze and Boysen 1991; Youssef and Santi 1997). Gait abnormalities appear especially sensitive to acrylamide and are evident at lower doses and/or earlier during continued exposure as compared to other neurological indices (LoPachin et al. 2002; Moser et al. 1992). The profile of acrylamide compared to carbon disulfide and/or 2,5-hexanedione differs in specific ways, with landing foot splay showing differential effects (LoPachin et al. 2002; Shell et al. 1992; Sterman 1984; Sterman and Sheppard 1983). The neuropathy produced by acrylamide was understood for several decades as a dying back of long and large-diameter peripheral nerve fibers (Miller and Spencer 1985), but this understanding has been modified by recent work showing central involvement and effects on nerve terminals through altered neurotransmitter release (Lehning et al. 2002; LoPachin 2005).
Interpretation and Correlations
Behavior is a gross measure of the integration of neural function, which may be evaluated at many different levels and with a variety of tests. There are multiple cellular targets at which chemicals may have effects, ranging from specific ion channels or neurotransmitters to generalized electrical stimulation. In some sense, the distinction between “pharmacological” and “toxicological” is artificial, being considered by some to be short-term versus lasting effects, or behavioral versus structural changes. For interpretation of behavioral effects, the U.S. EPA considers any change to be “adverse,” regardless of time course, mode of action, or clinical or pathological correlates (U.S. EPA 1998a). Partial support for this sentiment is the similarity of nervous system function across species and good concordance between human symptomatology and behavioral signs in rodents (Moser 1990).
Effects detected on a single endpoint or a few endpoints are informative but not definitive, and an understanding of the tests and their meanings is important for interpretation. While data from a large number of endpoints for each subject may be difficult to understand, having many measures provides a multidimensional approach that can be useful for the interpretation of effects (Irwin 1962; Weiss and Elsner 1996). When evaluating changes in specific endpoints, it is important to know the underlying variability, specificity, and influences on that measure. For many behavioral measures, nonmonotonic or inverted U-shaped curves are not uncommon, often due to feedback regulation of the nervous system. Effects detected at the lowest doses, or early during prolonged exposures, may indicate specific nervous system actions. At higher doses, by contrast, effects may be more generalized and not necessarily due to a direct action on the nervous system. Endpoints also differ in the degree to which they specifically reflect nervous system action. A retrospective analysis of fifty non-CNS active agents (Redfern et al. 2005) showed that body weight changes were the most common outcome in FOB studies, which is probably not surprising given the multiple systems that impact food and water intake, absorption, and metabolism. Body temperature was also frequently altered. In contrast, dose-related effects on endpoints that are more behaviorally based were observed much less frequently (0%-14% incidence), and neuromuscular (e.g., gait changes) or motor abnormalities (e.g., tremor, convulsions) were never observed. It is important to note that blood-brain permeability was the authors' criterion for “CNS active,” neglecting the influence of peripheral or non-neuronal alterations on the conclusions.
Correlative changes in behavioral and pathological outcomes should not always be expected. One possible outcome is that behavioral changes will be observed with no evidence of pathological abnormality. Potential explanations are numerous and may include effects on nerve cell communication, for example, changes in receptors and/or neurotransmitter release; biochemical changes are often not reflected in histopathological evaluations. These effects may be the result of acute or prolonged exposures. Timing of assessment is critical, since it has been demonstrated that behavior may be altered in advance of measurable structural changes in the system (for example, LoPachin et al. 2002; Moser et al. 1992).
There are also instances wherein structural lesions are observed in animals with no apparent behavioral effects. The nervous system shows functional plasticity, residual capacity, and compensatory mechanisms that may occur in the face of permanent pathological damage. As described above, rats dosed with 3AP showed recovery of operant responding and motor activity even while gait abnormalities were still observable. Since the cerebellar lesions induced by 3AP are irreversible, behavior-pathology correlations could be erroneous depending on which behavior was being assessed. Likewise, several behavioral tests have been used successfully to quantitatively assess peripheral nerve damage (nerve crush) and subsequent regeneration, seen as motor recovery (Bertelli and Mira 1995; Nichols et al. 2005; Varejão et al. 2001). Species differences are demonstrated with the delayed neuropathy observed with some organophosphorus chemicals, in that behavioral effects (ataxia) are easily observable in hens but not in rats; however, pathological changes in long axons can be seen with both species (Dyer et al. 1992; Padilla and Veronesi 1988). Finally, tolerance to some behavioral effects may develop with continued exposures, possibly due to kinetic factors or dynamic changes in the system.
Summary
Behavior represents the integrated sum of activities mediated by the nervous system, and functional tests used for neurotoxicity testing tap different behavioral repertoires. These tests have an advantage over pathologic measures in that they permit repeated evaluation of a single animal over time to determine the onset, progression, duration, and reversibility of a neurotoxic injury. Functional assays range from a screening-level battery of tests to refined procedures to tap specific forms of learning and/or memory. Examples of chemical-specific neurobehavioral-pathological correlations are available in the literature; however, correspondence between the temporal and dose-response characteristics of the two cannot and should not be considered obligatory. Further basic research on these relationships will improve understanding of the underlying neuronal substrates of behavior.
