Abstract
Over the past 30 years, behavioral, computational, and neuroscientific investigations have yielded fresh insights into how pigeons adapt to the diverse complexities of their visual world. A prime area of interest has been how pigeons categorize the innumerable individual stimuli they encounter. Most studies involve either photorealistic representations of actual objects, which have the virtue of being naturalistic, or highly artificial stimuli, which have the virtue of being experimentally manipulable. Together, those studies have revealed the pigeon to be a prodigious classifier of both naturalistic and artificial visual stimuli. In each case, new computational models suggest that elementary associative learning lies at the root of the pigeon’s category learning and generalization. In addition, ongoing computational and neuroscientific investigations suggest how naturalistic and artificial stimuli may be processed along the pigeon’s visual pathway. Given the pigeon’s availability and affordability, there are compelling reasons for this animal model to gain increasing prominence in contemporary neuroscientific research.
Introduction
Could the pigeon emerge as a leading model animal for the neuroscientific study of complex visual processing? When this question was posed over 30 years ago, 1 promising signs suggested that the pecking pigeon might indeed develop into an extremely useful model system. Since then, numerous behavioral, computational, and neuroscientific advances have yielded fresh insights into how categories are formed in the pigeon’s brain and how they help this highly mobile animal navigate the sundry intricacies of its visual world. We believe that researchers should now take special note of these advances and pay far greater attention to this increasingly important model of complex visual processing.
Categorizing Naturalistic Stimuli
The first step in developing the pigeon model was to devise effective behavioral paradigms for experimentally investigating visual categorization. Building on the pioneering research of Richard Herrnstein, 2 innovative techniques have now been successfully deployed to concurrently teach pigeons to discriminate as many as 16 different object categories, such as color photographs of cars, dogs, bottles, babies, keys, and trees. 3 Also revealed in this research have been critical functional relations: (a) with increasingly diverse stimuli in each category, acquisition is slower, but accuracy to novel transfer stimuli is superior, (b) category learning is robust even when stimuli are never repeated within or between sessions of training, although repetition facilitates acquisition, and (c) training with stimuli arbitrarily grouped into “pseudocategories” severely slows the learning process compared to training with the same stimuli grouped into perceptually coherent categories.
The second step in developing the pigeon model was to see whether and how these and many other empirical findings could be embraced by a computational model. Although the use of complex photographic stimuli has yielded strong category learning by pigeons, such naturalistic stimuli can be rather difficult to manipulate experimentally and to represent in formal models of behavior and cognition.
One way to solve the problem of stimulus representation is to adopt a common-elements approach. This approach—combined with an error-driven learning rule—is now known as the “common-elements” model, 4 and has been highly effective in explaining a broad range of empirical findings in pigeons’ categorization of naturalistic images.
To appreciate the “common-elements” model, first consider how it represents complex visual stimuli and the relation among them. As shown in Figure 1, each of 3 stimuli is represented by a set of elements. Different stimuli may share elements; those shared elements are termed “category-specific” (and are colored light blue). On the other hand, those elements that are active only in the presence of one individual stimulus, and no others, are termed “stimulus-specific” (and are colored dark blue, red, and yellow). Second, to cope with the dynamic interplay between behavioral control exerted by the category-specific and stimulus-specific elements during category learning and subsequent generalization testing, the common-elements model incorporates an error-driven learning algorithm.

Figure 1. Representation of stimulus-specific and category-specific elements for a trio of naturalistic photographic stimuli. See text for details.
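The interplay between category-specific and stimulus-specific elements can be made concrete with a minimal Python sketch. It is not the published implementation: the delta-rule update, the element counts, the learning rate, and the function names (`make_stimulus`, `train`, `transfer_strength`) are all assumptions chosen for illustration. Each training exemplar activates a few shared elements plus its own unique elements, and the weight of every active element changes in proportion to the prediction error.

```python
# Illustrative sketch only: a delta-rule learner over shared and unique elements.
# Element counts, learning rate, and epochs are assumed values, not the published model.

N_SHARED = 3   # category-specific elements, active for every exemplar (assumed)
N_UNIQUE = 3   # stimulus-specific elements per exemplar (assumed)


def make_stimulus(index, n_stimuli):
    """Return a binary element vector for training exemplar `index`."""
    vector = [1.0] * N_SHARED + [0.0] * (N_UNIQUE * n_stimuli)
    start = N_SHARED + index * N_UNIQUE
    for i in range(start, start + N_UNIQUE):
        vector[i] = 1.0
    return vector


def train(stimuli, target=1.0, rate=0.05, epochs=200):
    """Error-driven learning: each active element changes in proportion to the error."""
    weights = [0.0] * len(stimuli[0])
    for _ in range(epochs):
        for x in stimuli:
            prediction = sum(w * xi for w, xi in zip(weights, x))
            error = target - prediction
            for i, xi in enumerate(x):
                weights[i] += rate * error * xi
    return weights


def transfer_strength(n_stimuli):
    """Response to a novel exemplar that activates only the shared elements."""
    stimuli = [make_stimulus(k, n_stimuli) for k in range(n_stimuli)]
    weights = train(stimuli)
    return sum(weights[:N_SHARED])


if __name__ == "__main__":
    for n in (1, 3, 8):
        print(n, "training exemplars -> transfer strength", round(transfer_strength(n), 3))
```

Under these assumptions, the shared elements absorb a larger share of the associative strength as more distinct exemplars are trained, so a novel exemplar that activates only the shared elements evokes a stronger response. This is one simple way to see why training with more diverse stimuli could support better transfer.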
Beyond its established power to embrace the basic regularities of acquisition and generalization, the common-elements model is proving to be extremely valuable in elucidating the neural substrates of pigeon category learning. 5 The model nicely captures the hierarchical processing of visual stimuli along the pigeon’s tectofugal pathway. 6 In addition, dopamine-mediated feedback within cell assemblies of the nidopallium caudolaterale fittingly mediates the pigeon’s learning to perform the duly reinforced categorization response. 7 Here, it is noteworthy that categorization is usually not detectable in the firing patterns of individual neurons, but requires a neural network. Indeed, just 33 neurons in a pigeon visual associative forebrain area are sufficient to recognize the category “human,” whereas single neurons in this network are unable to do so. 8
Categorizing Artificial Stimuli
As mentioned above, categorization studies involving naturalistic stimuli, such as color photographs of actual objects, lend the results of these investigations at least the veneer of verisimilitude. Still, the decided advantage of experimental control inheres in categorization studies involving artificial stimuli, which have also been utilized in recent research.
In particular, Gabor images—circular stimuli comprising parallel black-and-white lines varying in width and tilt—are often used in human and animal categorization studies. Because the line width and tilt dimensions of these artificial stimuli can be independently manipulated, cleverly crafted tasks can be devised to assess organisms’ possible selective attention to and behavioral control by these dimensions. These specific categorization tasks have not only proven to be practical, but they have had considerable theoretical significance for several reasons. 9 First, people learn tasks that align the prevailing category boundaries with either of these 2 visual dimensions far faster than otherwise identical tasks that do not. Second, pigeons do not exhibit this marked disparity in learning speed. Third, these results have been interpreted as reflecting people’s preferential deployment of selective attention and rule-based learning mechanisms which pigeons may not have. Specifically, people may possess a highly elaborated prefrontal cortex that is capable of mediating these attentional and cognitive processes, whereas pigeons possess a less elaborated nidopallium caudolaterale that may be incapable of doing so.
Rather than concentrating on those neural and cognitive mechanisms that pigeons may lack, we pursued the possibility that pigeons’ otherwise robust categorization prowess might reflect their deploying a particularly powerful associative mechanism, as we earlier reported to be the case for their categorizing naturalistic stimuli. To experimentally explore this possibility, we developed a second computational model that simply credited pigeons with the ability to associate individual Gabor stimuli with 2 distinctive categorization responses. 10 As illustrated in Figure 2, a 3-layer model was created. Each correct categorization response strengthened pigeons’ tendency to repeat that response (the darkened red region signifying excitation of the Category A response) and weakened their tendency to make the alternative, incorrect response (the brightened white region signifying inhibition of the Category B response) to that same stimulus. To enable the model to generalize responding to novel stimuli, we allowed learning to spread along both line tilt and width dimensions to a limited degree.

Figure 2. Depiction of a Gabor stimulus from the inner training ring (Category A) and its stages of processing by our computational network model (A), and its associative representation in the 2 rival stimulus spaces for the concentric-rings task (B, C, and D). See text for details.
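The core idea of such an associative account, excitation of the reinforced response, inhibition of the rival response, and limited spread along both stimulus dimensions, can be sketched in a few lines of Python. This is a simplified illustration rather than the authors' network: the grid size, the Gaussian form of the spread, the `sigma` and `rate` values, and the class name `AssociativeMap` are all hypothetical.

```python
import math

# Illustrative sketch only: grid size, Gaussian spread, sigma, and rate are assumptions.


def spread(a, b, sigma):
    """Gaussian generalization between two (width, tilt) grid points."""
    d2 = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return math.exp(-d2 / (2 * sigma ** 2))


class AssociativeMap:
    """One associative-strength map per response over a width-by-tilt grid."""

    def __init__(self, size=11, sigma=1.5, rate=0.2):
        self.size, self.sigma, self.rate = size, sigma, rate
        self.strength = {
            label: [[0.0] * size for _ in range(size)] for label in ("A", "B")
        }

    def reinforce(self, stimulus, correct):
        """Strengthen the correct response and weaken the rival, with spread."""
        rival = "B" if correct == "A" else "A"
        for w in range(self.size):
            for t in range(self.size):
                g = spread(stimulus, (w, t), self.sigma)
                self.strength[correct][w][t] += self.rate * g
                self.strength[rival][w][t] -= self.rate * g

    def choose(self, stimulus):
        """Pick the response with the greater associative strength at this point."""
        w, t = stimulus
        return "A" if self.strength["A"][w][t] >= self.strength["B"][w][t] else "B"


if __name__ == "__main__":
    model = AssociativeMap()
    model.reinforce((3, 3), "A")   # one Category A training stimulus
    model.reinforce((8, 8), "B")   # one Category B training stimulus
    print(model.choose((4, 3)), model.choose((7, 8)))  # novel neighbors of each
```

Because each update spreads to nearby grid points, a stimulus that was never trained inherits the response tendencies of its neighbors, which is the sense in which generalization arises here without any rule or selective attention.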
Finally, we created 2 new categorization tasks that should not have advantaged selective attention or rule-learning to achieve task mastery. Each task involved training categories lying within 2 circular rings created by choosing Gabor stimuli from the complete line width-line tilt space depicted in Figure 3.

Figure 3. Depiction of the full Gabor stimulus distributions in 2 training categories plus our computational network model’s corresponding associative representations for the concentric-rings task (A–D) and for 2 different variants of the sectioned-rings task (E–H). See text for details.
The first category structure we studied was the “concentric-rings” task 11 illustrated in the top portion of Figure 3. For rule-based learners to acquire this task: (a) they would have to forsake selectively attending to either line width or line tilt; (b) they would have to attend to both dimensions; and (c) they would have to apply a bimodal decision rule along those dimensions, thereby making this a very challenging category structure to master. However, because pure associative learners would not need to engage in such complex neural computations, they should quite readily acquire the task.
Our model illustrates how a purely associative learning mechanism can readily learn the concentric rings task. Figures 3A and B show the associative representations acquired by the model at the beginning of training, where 2 stimuli from each category were given: Category A outlined in red and Category B outlined in blue (Figure 3A). In each of the 2 panels of Figure 3B, the boundaries of the 2 different category distributions are outlined in black; 2 associations should arise to the correct category (shown by the brighter red and brighter blue regions) and 2 associations should arise to the incorrect category (shown by the bright white regions). Also note that the spread of association can extend beyond the specific values of each training stimulus and into nearby regions of the category space.
Figure 3C represents the complete stimulus spaces of Categories A and B that would result from an infinitely long series of trial-unique and randomly presented Gabor stimuli, and Figure 3D represents the predicted associations of the model at the end of training. Figure 3D shows that the model appropriately learns to associate distinctive regions of the stimulus space with the correct category responses: strongly favoring responding “Category A” to stimuli in the inner ring and strongly favoring responding “Category B” to stimuli in the outer ring.
Pigeons behaved just as the model predicted, with accuracy levels surpassing a remarkable 85% to each category by the end of training. Furthermore, because of the generalization of incorrect responses from one training ring to the other, choice accuracy to stimuli in the inner training ring was actually lower than to novel testing stimuli lying within the center of the stimulus space, and choice accuracy to stimuli in the outer training ring was actually lower than to novel testing stimuli lying beyond the outer training ring—an otherwise surprising pattern of responding, but a clear hallmark of associative learning called “peak shift.”
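The peak-shift pattern can be reproduced with a deliberately stripped-down calculation. The sketch below assumes a linear summation of Gaussian generalization gradients, evenly spaced training stimuli, ring radii of 1 and 2 in arbitrary normalized units, and a spread of `sigma = 1.0`; none of these values comes from the reported experiments, and the function names are hypothetical. It computes, for probes at several distances from the center of the stimulus space, the net support for the Category A response.

```python
import math

# Illustrative sketch only: radii, sigma, and linear summation are assumptions.


def ring(radius, n=60):
    """n evenly spaced stimuli on a circle in a normalized width-tilt space."""
    return [
        (radius * math.cos(2 * math.pi * k / n), radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


def net_preference_for_a(probe, ring_a, ring_b, sigma=1.0):
    """Mean generalized support from Category A stimuli minus that from Category B."""

    def mean_spread(points):
        return sum(
            math.exp(-((probe[0] - x) ** 2 + (probe[1] - y) ** 2) / (2 * sigma ** 2))
            for x, y in points
        ) / len(points)

    return mean_spread(ring_a) - mean_spread(ring_b)


INNER, OUTER = 1.0, 2.0          # assumed radii of the two training rings
ring_a, ring_b = ring(INNER), ring(OUTER)

# Probe the center, the inner ring, the outer ring, and a point beyond the outer ring.
profile = {
    r: net_preference_for_a((r, 0.0), ring_a, ring_b) for r in (0.0, 1.0, 2.0, 3.0)
}

if __name__ == "__main__":
    for r, value in profile.items():
        print(f"distance {r:.1f}: net support for A = {value:+.3f}")
```

With these assumed parameters, support for Category A is stronger at the center than on the inner training ring itself, and support for Category B is stronger beyond the outer ring than on it, because each ring receives generalized support for the wrong response from the other ring.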
The second task we studied involved stimuli similar to those in the concentric-rings task, but with each training ring cut into 4 sections. 12 Figures 3E and G depict the complete stimulus distributions possible in the “+ cut” and the “× cut” variants of the “sectioned-rings” task, respectively, where Category A is shown in red and Category B is shown in blue. Like the concentric-rings task, this task thwarts the efforts of a learner to use a simple, unidimensional rule. Furthermore, the sectioned-rings task eliminates the possibility of a learner deploying a complex rule that would divide the stimulus space into an “interior” category and an “other” category. Rule formation in this sectioned-rings task should therefore be virtually impossible, yet a pure associative learner ought to acquire the task if provided sufficient training.
To see why, Figure 3F illustrates the model’s representation of the predicted associations between the stimulus space and Category A (left) and Category B (right) responses at the end of training for the + cut variant. Brighter colors mark regions with stronger associations, whereas whiter colors mark regions with weaker associations. In both panels, the sections that are correct for each category are outlined in black. Figure 3H illustrates the same representation at the end of training for the × cut variant. Despite the extreme complexity of these 2 category structures, a pure associative learner ought to acquire each of these task variants.
The pigeons did indeed learn both versions of the sectioned-rings task, finally achieving an average score of 68% correct. In fact, the pigeons’ accuracy averaged 73% correct in the middle of the ring sections, but just 63% correct near the boundaries of the adjacent rings. This disparity in accuracy is just what the associative model predicts due to stimulus generalization of incorrect category responses where 2 training sections adjoin one another.
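The boundary effect follows from the same generalization logic, as a one-dimensional Python sketch along a single ring suggests. The layout here is an assumption made purely for illustration: four 90-degree sections alternating between Category A and Category B, one training stimulus per degree, and a Gaussian spread of 20 degrees. The actual section assignments and parameters of the reported task may differ, and the values returned are net associative support, not percent correct.

```python
import math

# Illustrative sketch only: section layout, stimulus spacing, and sigma are assumptions.


def circular_gap(a, b):
    """Smallest angular distance, in degrees, between two positions on the ring."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def category_of(angle):
    """Assumed layout: four 90-degree sections alternating A, B, A, B."""
    return "A" if int(angle // 90) % 2 == 0 else "B"


TRAINING_ANGLES = [k + 0.5 for k in range(360)]  # one stimulus per degree (assumed)


def net_support(probe, sigma=20.0):
    """Generalized support for the probe's own category minus support for the rival."""
    own = category_of(probe)
    total = 0.0
    for angle in TRAINING_ANGLES:
        g = math.exp(-circular_gap(probe, angle) ** 2 / (2 * sigma ** 2))
        total += g if category_of(angle) == own else -g
    return total / len(TRAINING_ANGLES)


middle = net_support(45.0)   # center of a Category A section
edge = net_support(85.0)     # 5 degrees from the adjoining Category B section

if __name__ == "__main__":
    print(f"middle of section: {middle:+.3f}   near boundary: {edge:+.3f}")
```

Under these assumptions, the correct response still dominates near a boundary, but by a much smaller margin than in the middle of a section, since the neighboring section contributes generalized support for the rival response.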
Synopsis and Prospectus
The striking success of simple computational models in embracing the results of categorization studies involving both naturalistic and artificial stimuli underscores the power of associative learning—a process that is often underestimated in cognitive science. Perhaps it is no accident that this learning process has seeded the strong success that is so often celebrated in contemporary artificial intelligence.
Notwithstanding that important observation, the question arises: How might we understand the small modifications in computational modeling we made to accommodate the different visual stimuli used in our studies with naturalistic and artificial stimuli? One possibility is that the greater simplicity of Gabor stimuli relative to photographic stimuli allows the former images to be processed earlier in the pigeon’s tectofugal pathway than the latter. 4,6 Although such hierarchical processing might not prove to be correct, it certainly represents an important avenue to pursue with contemporary neuroscientific techniques. 13
These techniques now include fMRI, 14 thereby opening the door to visualizing the neural bases of perception and cognition in pigeons. Such investigation would be a major step forward in elucidating the evolution of neurocomputation. 15
Given the pigeon’s availability and affordability, there are several compelling reasons for this animal model to gain increasing prominence in contemporary neuroscientific research. We look forward to the fresh insights that the results of this research will reveal.
Footnotes
Author Contributions
All authors contributed to the planning, writing, and editing of this manuscript.
Declaration of conflicting interests:
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding:
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by grant P01HD080679 from the National Institutes of Health (EAW), a CAREER award from the National Science Foundation (BMT), and AVIAN MIND ERC-2020-ADG LS5 GA No. 101021354 (OG).
