Abstract
Many philosophers use findings about sensory substitution devices in the grand debate about how we should individuate the senses. The big question is this: Is “vision” assisted by (tactile) sensory substitution really vision? Or is it tactile perception? Or some sui generis novel form of perception? My claim is that sensory substitution assisted “vision” is neither vision nor tactile perception, because it is not perception at all. It is mental imagery: visual mental imagery triggered by tactile sensory stimulation. But it is a special form of mental imagery that is triggered by corresponding sensory stimulation in a different sense modality, which I call “multimodal mental imagery.”
Blind subjects can be taught to navigate their environment in some sense “visually” by having a camera installed on their body, the images of which are fed into some other sense modality of the subject. The camera records images continuously, and these images are transmitted to the subject in real time in, for example, the tactile sense modality (it can also be done auditorily, see Meijer, 1992). So the images are imprinted on the subject’s skin with slight pricks as soon as they are recorded (see Bach-y-Rita & Kercel, 2003; Bach-y-Rita et al., 1969). A lot of research has been done on this phenomenon in the last four decades (Amedi et al., 2007; Auvray, Hanneton, & O’Regan, 2007; Auvray, Hanneton, Lenay, & O’Regan, 2005; Auvray et al., 2005; Meijer, 1992; Sampaio, Maris, & Bach-y-Rita, 2001; Tyler et al., 2003; see Deroy & Auvray, 2012, 2014 and Ward & Meijer, 2010 for summaries and Chirimuuta & Paterson, 2015 for a historical overview of the sensory substitution research).
The surprising results were that the subjects eventually experienced the scene in front of them “visually”—they talked about visual occlusion, for example, and they were very competent at navigating relatively complex terrains. They “spontaneously report the external localization of stimuli in that sensory information seems to come from in front of the camera, rather than from the vibrotactors on their back” (Bach-y-Rita et al., 1969, p. 964).
Philosophers were quick to jump on these findings for philosophical ammunition in the grand debate about how we should individuate the senses (see, e.g., Farina, 2013; Gray, 2011; Heil, 1983, 2011; Hurley & Noë, 2003; Morgan, 1977; Peacocke, 1983, but see also Block, 2003). The big question was: Is “vision” assisted by sensory substitution really vision? Or is it tactile perception? Some of the classic ways of individuating the senses (Grice, 1962; Keeley, 2002; Nudds, 2004, 2011) come apart in this odd case: If we individuate the senses according to the sensory stimulation, then sensory substitution assisted “vision” would count as tactile perception. If we individuate the senses according to phenomenology, then it seems to be vision.
My claim is that sensory substitution assisted “vision” is neither vision nor tactile perception, because it is not perception at all. It is mental imagery—multimodal mental imagery. It is visual mental imagery triggered by tactile sensory stimulation.
What Is Mental Imagery?
Mental imagery is one of those mental phenomena that philosophers do not seem to feel obliged to define because they rely on everyone knowing what it is. This is extremely problematic, partly because the most salient and straightforward way of utilizing mental imagery, which pops up invariably as the stereotypical example of mental imagery, is not, as we shall see, particularly representative.
Here is the example that is widely used to introduce what mental imagery is supposed to be. Close your eyes and visualize an apple. This is one way of exercising mental imagery and one that many philosophers and non-philosophers consider the standard and stereotypical way of having mental imagery (Richardson, 1969; Currie, 1995; Kind, 2001). And it is indeed mental imagery. But I think it is atypical in at least four respects.
First, it is visual mental imagery. And vision is not the only sense modality. So if we can perceive auditorily, olfactorily, and so on, we can also have auditory, olfactory, tactile, mental imagery. I call all these “mental imagery”—it should be clear that the word “imagery” does not here denote anything that has to do with images (which would usually be something visual): Mental imagery exists in all sense modalities.
Second, visualizing the apple is something you do voluntarily and intentionally. But mental imagery does not have to be voluntary or intentional. One can have flashbacks of some unpleasant scene: this is also mental imagery, but it is not a voluntary or intentional exercise of mental imagery. And some of our mental imagery is of this involuntary and unintentional kind. This is especially clear in the auditory sense modality, as demonstrated by the phenomenon of earworms: tunes that pop into our heads and that we keep on having auditory imagery of, even though we do not want to. Further, if mental imagery is a necessary feature of episodic memory (Byrne, Becker, & Burgess, 2007; see also Berryhill et al., 2007 for an overview), then it is also involuntary inasmuch as episodic memory can also be involuntary.
Third, when you visualize the apple, you tend to do so in an abstract visualized space: You close your eyes and visualize an apple in this abstract space that has nothing to do with the space you occupy. But this is not necessarily so. One can also visualize the apple in one’s egocentric space, for example, in one’s hand or next to one’s laptop. Mental imagery can localize the imagined object in one’s egocentric space or in some abstract space. In fact, having mental imagery of something in our egocentric space is not something unusual—we use mental imagery this way very often. When you are looking at your empty living room, thinking about what kind of furniture to buy, you are likely to try to form mental imagery of, say, a sofa not in an abstract space “in the mind’s eye,” but in your living room. And when you are trying to figure out whether this sofa would fit through the main entrance, again, you are having mental imagery of the sofa in the very concrete space of the main entrance of your house.
Fourth, visualizing an apple is not normally accompanied by any feeling of presence. You are not fooled by this mental imagery into thinking that there is actually an apple in front of you so that you could reach out and grab it. But, again, this is not a necessary feature of mental imagery. There is no prima facie reason why mental imagery could not be accompanied by the feeling of presence. In fact, lucid dreaming, which is widely considered to be a form of mental imagery (see Hobbes, 1654; Ichikawa, 2009; Walton, 1990 for a summary), is very much accompanied by the feeling of presence. And hallucination, which is, arguably, also a form of mental imagery (see Allen, 2015; Nanay, 2016a; 2016b) is also clearly accompanied by the feeling of presence.
These four distinctions are orthogonal to one another, so we get a lot of internal distinctions within the category of mental imagery. Mental imagery can be voluntary, non-egocentric, and not accompanied by the feeling of presence. Visualizing an apple is of this kind. But it can also be involuntary, egocentric, and accompanied by the feeling of presence (which would be the polar opposite of the kind of mental imagery that we have when we close our eyes and visualize the apple). This latter kind of imagery is what will play a crucial role in understanding sensory substitution.
So far, I broadened the concept of mental imagery, but I have not said what I take to be mental imagery. I take mental imagery to be perceptual processing that is not triggered by corresponding sensory stimulation in a given sense modality. Two crucial concepts in this definition, of perceptual processing and correspondence, need to be clarified.
By perceptual processing, I simply mean processing in the perceptual system (e.g., in early cortical areas, see Bullier, 2004; Grill-Spector & Malach, 2004; Katzner & Weigelt, 2013; Van Essen, 2004). Some of this processing is triggered by corresponding sensory stimulation: this amounts to perception. And some is not triggered by corresponding sensory stimulation: this amounts to mental imagery. So perceptual processing happens when we perceive, but it also happens when we have mental imagery.
The concept of “corresponding,” in contrast, is more difficult to spell out. Mental imagery can happen even when there is sensory stimulation in the given sense modality, but the correspondence is missing. But what is this correspondence relation supposed to be? The sensory stimulation is a fairly straightforward event: light hitting my retina in a certain pattern. But what is this pattern supposed to correspond to (or fail to correspond to)? My answer is: the patterns in early cortical perceptual processing. So, in the visual sense modality, this would be the retinotopic primary visual cortex. The primary visual cortex (and also many other parts of the visual cortex, see Grill-Spector & Malach, 2004 for a summary) is organized in a way that is very similar to the retina: it is retinotopic. So we can assess in a simple and straightforward manner whether the retinotopic perceptual processing in the primary visual cortex corresponds to the activations of the retinal cells. In the case of mental imagery, we get no such correspondence: The mental imagery is a retinotopic representation, but this retinotopic representation fails to correspond to what is on the retina. While this retinotopy of the early visual cortices (and their equivalent in the other sense modalities, see, e.g., Talavage et al., 2004) is an extremely convenient way of gaining evidence about the correspondence or lack thereof of sensory stimulation and perceptual processing, this is just one way in which the two can correspond. There are others. In other words, the correspondence between sensory stimulation and perceptual processing does not have to be retinotopic. Another kind of correspondence that can play an important role here is temporal correspondence (again, something easy enough to measure): whether the activation of the early cortices follows the sensory stimulation quickly enough.
A couple of attractive features of this definition need to be pointed out. First of all, according to this definition, mental imagery does not have anything to do with the kind of tiny images in our mind that Gilbert Ryle was making fun of (Ryle, 1949). Mental imagery is not something we see: It is a certain kind of perceptual processing. So it is in no way more mysterious than other kinds of perceptual processing (like perception proper). Nor do we need to postulate any ontologically extravagant entities (like tiny pictures in our head) to talk about mental imagery any more than we need to postulate these entities in order to talk about perception.
Further, this definition is neutral about the format of mental imagery. In the “Imagery Debate” of the 1980s (see Tye, 1991 for a good overview), the main issue was whether mental imagery is depictive or symbolic/propositional (see Kosslyn, 1980; Kosslyn et al., 2006; Pylyshyn, 1981, 2002, respectively). It is somewhat unfortunate that this question about format monopolized the psychological and philosophical discussion of mental imagery (see Pearson & Kosslyn, 2015 for an overview), but what is crucial at this point is to point out that my definition of mental imagery is consistent with both the imagistic and the symbolic/propositional way of thinking about mental imagery.
The definition of mental imagery as perceptual processing that is not triggered by corresponding sensory stimulation in the relevant sense modality is widely accepted in neuroscience and psychology. Here is the definition used in a very recent review article on mental imagery: “We use the term ‘mental imagery’ to refer to representations […] of sensory information without a direct external stimulus” (Pearson, Naselaris, Holmes, & Kosslyn, 2015; see also Kosslyn, Behrmann, & Jeannerod, 1995; Kosslyn et al., 1995; Pearson & Westbrook, 2015; Pearson et al., 2008).
But this definition could be thought of as somewhat revisionary within philosophy. I want to emphasize that the concept of mental imagery we ended up with (that of perceptual processing not triggered by corresponding sensory stimulation) is an extension of the introspective concept of mental imagery that examples like closing our eyes and visualizing an apple lead to. But the definition of mental imagery as perceptual processing not triggered by corresponding sensory stimulation leaves it open what this perceptual processing is triggered by. I want to focus on cases where it is triggered by sensory stimulation in another sense modality.
What Is Multimodal Mental Imagery?
There is a lot of recent evidence that multimodal perception is the norm and not the exception—our sense modalities interact in a variety of ways (see Bertelson & de Gelder, 2004; Spence & Driver, 2004; Vroomen, Bertelson, & de Gelder, 2001 for summaries and O’Callaghan, 2008a, 2011 as well as Macpherson, 2011 for philosophical overviews). Information in one sense modality can influence and even initiate information processing in another sense modality at a very early stage of perceptual processing (even in the primary visual cortex in the case of vision, e.g., see Watkins, Shams, Tanaka, Haynes, & Rees, 2006).
A simple example is ventriloquism, which is commonly described as an illusory auditory experience influenced by something visible (Bertelson, 1999; O’Callaghan, 2008b). It is one of the paradigmatic cases of cross-modal illusion: We experience the voices as coming from the dummy, while they in fact come from the ventriloquist. The auditory sense modality identifies the ventriloquist as the source of the voices, while the visual sense modality identifies the dummy. And the visual sense modality wins out: Our (auditory) experience is of the voices as coming from the dummy. This is a demonstration of how information in two different sense modalities interacts. But what I am interested in here is what happens if the information in one sense modality is missing.
When I am looking at my coffee machine that makes funny noises, this is an instance of multisensory perception—I perceive this event by means of both vision and audition. But very often we only receive sensory stimulation from a multisensory event by means of one sense modality. If I hear the noisy coffee machine in the next room, that is, without seeing it, then the question arises: How do I represent the visual aspects of this multisensory event?
We have a wealth of empirical findings confirming that our visual system in these circumstances does get activated (and even the very early visual cortical areas can, see Calvert et al., 1997; Ghazanfar & Schroeder, 2006; Hertrich, Dietrich, & Ackermann, 2011; Iurilli et al., 2012; James et al., 2002; Kilintari et al., 2011; Martuzzi et al., 2007; Muckli & Petro, 2013; Pekkola et al., 2005; Vetter, Smith, & Muckli, 2014; Zangaladze et al., 1999). There is early cortical activation in the visual sense modality without corresponding sensory stimulation in this sense modality. In other words, we have mental imagery. I call this form of mental imagery multimodal mental imagery.
Multimodal mental imagery is mental imagery that is triggered by sensory stimulation in another sense modality (see Lacey & Lawson, 2013 and Spence & Deroy, 2013 for summaries). Remember the definition of mental imagery in general: perceptual processing that is not triggered by corresponding sensory stimulation in the relevant sense modality. The last phrase now becomes really important. Mental imagery can be triggered by corresponding sensory stimulation as long as it is not in the relevant sense modality.
In other words, if perceptual processing is triggered by corresponding sensory stimulation in the relevant sense modality, we get perception, by which I mean here, and in the rest of the paper, sensory stimulation-driven perception. If it is triggered by corresponding sensory stimulation in another sense modality, we get multimodal mental imagery. If it is triggered by something else, we get some other kind of (non-multimodal) mental imagery. In short, multimodal mental imagery is mental imagery in one sense modality induced by sensory stimulation in another sense modality. And, as we have seen, we have strong empirical evidence that mental imagery in any sense modality can be induced by sensory stimulation in any other sense modality.
Given that most of the entities we encounter are multisensory entities and given that our perceptual access to these multisensory entities is rarely absolute (i.e., encompassing all relevant sense modalities), this happens very often. Multimodal mental imagery is the norm, not the exception.
Most of the time, when we form mental imagery of those parts of a multisensory entity that we are not acquainted with, this mental imagery will be unattended. But if we are really interested in them, we can attend to them. And while most of the time the properties we attribute to those aspects of the multisensory entity that we are not acquainted with are very determinable, we can make them more determinate (if we are really interested in them for some reason).
Suppose that I am working in my room and I hear footsteps from downstairs (without seeing who is coming upstairs). I represent the complex multisensory event of someone coming upstairs: I perceive the auditory parts of this event, and I represent the other (visual, maybe olfactory) parts of this event by means of mental imagery. But my visual and olfactory multimodal mental imagery may not be particularly salient if I am not too concerned with who is coming upstairs. My olfactory mental imagery of the olfactory aspects of the multisensory event whose auditory aspects I am acquainted with is likely to be unattended and very determinable. But if the only two people who can come upstairs are my stinky friend X or my other friend, Y, who uses very nice perfume, and if I really want to know which one it is, I will be likely to fill in the olfactory aspects of the multisensory event in a more determinate way, which can prime me to recognize them by smell more quickly (see Berger & Ehrsson, 2013, 2014 for more on the way mental imagery and multimodal integration interact).
A brief terminological remark: The reference to multimodality in the label “multimodal mental imagery” does not refer to the multimodality of our phenomenology when we have multimodal mental imagery. What “multimodal” refers to in the name of multimodal mental imagery is the etiology of mental imagery: Mental imagery is the product of the interaction between (at least) two different sense modalities. The phenomenal feel of multimodal mental imagery, if there is one, may itself be unimodal, say, purely visual. But it is the outcome of the interaction between vision and another sense modality—it is multimodal in this sense.
A widely used and researched example of multimodal mental imagery is seeing someone talking on television with the sound muted. The visual perception of the talking head in the visual sense modality leads to an auditory mental imagery in the auditory sense modality (e.g., Calvert et al., 1997; Hertrich et al., 2011; Pekkola et al., 2005).
The multimodal mental imagery that is involved in this example is conscious, involuntary, accompanied by the feeling of presence and localizes in egocentric space. And this is very similar to the kind of multimodal mental imagery that blind subjects with sensory substitution devices have.
Back to Sensory Substitution
In the light of the characterization of mental imagery in general and multimodal mental imagery in particular, the claim that sensory substitution assisted “vision” is neither vision nor tactile perception should sound less surprising. If there is activation in the early visual cortices of the sensory substitution subjects, then they have multimodal mental imagery: Early cortical activation in one sense modality (vision) triggered by corresponding sensory stimulation in another sense modality (tactile or auditory perception).
And, as it turns out, there is indeed activity in the primary visual cortex of these subjects that was clearly not triggered by sensory stimulation in the relevant sense modality, as the subjects were blind. Rather, this activity was triggered by corresponding sensory stimulation in the tactile sense modality (Murphy et al., 2016; Renier et al., 2005a). So perception by sensory substitution would count as multimodal mental imagery: It is visual mental imagery triggered by tactile sensory stimulation.
This is not an entirely novel angle in the sensory substitution debate. Renier et al. (2005b) argue that subjects with sensory substitution devices “visualize.” This way of thinking about sensory substitution points in the same direction as the one I outlined here, but talking about visualization is misleading for a number of reasons (see also Martin & Le Corre, 2015 for a detailed criticism of Renier et al., 2005b).
First, visualizing is a voluntary and intended act, and it seems that sensory substitution assisted “seeing” is neither. Second, visualizing is something that happens in a top-down manner, whereas sensory substitution assisted “seeing” is only top-down inasmuch as normal vision is. Finally, the ultimate conclusion of Renier et al. (2005b) is that sensory substitution assisted “seeing” is in fact seeing. And they use this claim about visualization as a premise for establishing this conclusion (see, again, the criticism in Martin & Le Corre, 2015). Talking about multimodal mental imagery, rather than “visualization,” in these cases fends off all of these worries.
Further, the empirical evidence that Renier et al. (2005b) use as support of their “visualization” account also supports my multimodal mental imagery account: Subjects with sensory substitution devices are subject to the Ponzo illusion (see Figure 1), which is widely held to be a visual illusion. If sensory substitution subjects exercise on-line multimodal mental imagery, this is exactly what we should expect. Their visual perceptual processing is triggered by tactile sensory stimulation. But the perceptual processing happens in the visual sense modality. Thus, we should expect the usual oddities of this visual perceptual processing to be present, and they are.
Figure 1. Ponzo illusion.
A final potential objection needs to be addressed. One may worry that this way of thinking about sensory substitution ignores the importance of neural plasticity. Multimodal mental imagery is defined in terms of early cortical processing in one sense modality triggered by corresponding sensory stimulation in another sense modality. But one may worry that this way of defining multimodal mental imagery presupposes neuro-chauvinism about sense modalities: The worry is that perceptual processing in the visual sense modality is identified physiologically, that is, with processing in the primary visual cortex.
And this way of thinking about the perceptual system is in tension with well-documented cases where the visual areas are recruited for other (e.g., tactile) tasks (Kupers & Ptito, 2014; Pascual-Leone & Hamilton, 2001). More generally, my proposal seems to go against the recently popular meta-modal brain hypothesis (Pascual-Leone & Hamilton, 2001), according to which sense modalities should not be identified physiologically, but rather functionally (see Kiverstein, Farina, & Clark, 2015 for the consequences of this view for the sensory substitution debate). So vision is not identified in terms of brain areas, but rather in terms of its function (something like: helping small-scale spatial discrimination). This is a fair worry, but it needs to be noted that nothing I say in this paper commits me to identifying visual processing in physiological terms. Everything I say here is consistent with a meta-modal angle, according to which we should identify vision in terms of its function. All that is needed to make sense of the phenomenon of multimodal mental imagery in general and of the claim that sensory substituted vision is a form of multimodal mental imagery in particular is that we have some way of distinguishing vision and touch. Just how we do so is something I do not want to commit to.
Perception or Multimodal Mental Imagery?
Why is it universally assumed in the literature that subjects who are assisted by sensory substitution devices do in fact perceive? One reason may be that the subjects’ perceptual processes are involuntary. But we have seen that mental imagery may be voluntary or involuntary. Another reason may be that the subjects localize the “visual” scene they navigate in their egocentric space. But, again, as we have seen, mental imagery may or may not localize in one’s egocentric space. Also, the reports of these subjects do not make it entirely clear whether they have any feeling of presence of the visual scene in front of them. But even if they do, this would be consistent with the claim that they have multimodal mental imagery, as mental imagery may or may not be accompanied by the feeling of presence.
An additional reason why one may be tempted to think that whatever sensory substitution can give us must be perception is that it helps us navigate in the world and it is clearly causally influenced, in an “on-line” manner, by the visual features of the world around us. How can it possibly be mental imagery then (i.e., something very much “off-line”)?
Some ways of exercising mental imagery do track the features of our surroundings in an online and causally responsive manner. Multimodal mental imagery in general tracks the features of our surroundings in an online and causally responsive manner: If the noises my loud coffee machine in the next room makes are changing, my multimodal mental imagery of its visual parts also changes. There is nothing about the definition of mental imagery that would exclude the possibility of tracking the changing features of our environment in real time.
A final reason for puzzlement about my claim that these blind subjects have multimodal mental imagery would come from the seemingly obvious assumption that given that blind subjects cannot see, they could not have visual imagery either. But this is just factually incorrect. It has been known for a long time that congenitally blind people also exercise visual imagery, and sometimes even very salient visual imagery (see Arditi, Holtzman, & Kosslyn, 1988 for a comparison of the imagery of sighted and congenitally blind subjects). In sensory substitution, this visual mental imagery is triggered by tactile input.
But then there is nothing mysterious about sensory substitution, and there are no substantive philosophical questions about the individuation of the senses involved either. Sensory substitution involves perceptual processing, and very clearly visual perceptual processing, as the activity in the primary visual cortex shows (and this coincides with the phenomenology of the subjects). And this visual perceptual processing is induced by tactile sensory stimulation: slight pricks on the subject’s skin. This is as clear-cut a case of multimodal mental imagery as it gets. If philosophers want some empirical findings that would help them in the debate about the individuation of the senses, they need to look elsewhere.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by ERC Consolidator grant 726251, FWO Odysseus grant G.0020.12N and FWO Research Grant G0C7416N.
