Abstract
This paper examines the issue of transparency recognition in architecture from a dynamically changing point of view. A simple and intuitive pictorial model of the transparent surface appearance is presented, followed by a discussion of the transparency processing (mid- or high-level). Two zones of perception that determine the proper perception of light-permeable surfaces are distinguished, each with its own phenomena responsible for the formation of transparency perception cues (transmittance/absorbance or reflectance). The main visual processing mechanisms are discussed, and some parallels to machine vision are found. Finally, a short checklist is presented to provide practical advice for architects dealing with large surfaces of glass in their projects.
Contemporary architectural transparency seems to go beyond all previously formulated definitions. It involves new materials and technologies, as well as new ways to assess the visual perception of architecture. With the introduction of the study of the dynamic perception of architecture, previously neglected qualities of transparent materials (e.g., reflectance and refraction) have become important factors in the perception of space, and these factors are processed on different levels in the recognition of transparent materials. The approach presented here stems from architecture-based experience, but is approached from the perspective of optics and vision science. The general principles are illustrated via diagrams and individual case studies.
It must be noted that the issue known as motion transparency, where “two overlapping surfaces move transparently over each other” (Snowden & Verstraten, 1999, p. 369), defined in other sources as the “perception of more than one velocity field” (Qian, Andersen, & Adelson, 1994, p. 7357), is beyond the scope of this paper (Stoner, Albright, & Ramachandran, 1990).
Optics of Transparency
Transparent materials are perceived differently than opaque materials. In the case of opaque objects, the phenomenon of occlusion occurs: “visible contours of the more distant object terminate at the outer boundary of the nearer one” (Wilson & Keil, 1999, p. 844). Panes—large flat thin surfaces—of transparent materials are usually so smooth that the luminous fluxes that strike them are partially reflected, partially transmitted, and partially absorbed (absorption/transmission are linked). Each transparent object absorbs part of the luminous flux energy but simultaneously also generates specular reflections upon its surface. The optical model that approaches transparency from the position of reflected vs transmitted luminous flux balance (meaning: real and virtual images con-fused) seems to be sufficient and appropriate in consideration of optical cues distinctive for the visual system in the kinetic perception of transparency. A real/virtual image model is adopted for the sake of simplicity, as—due to the scale of the pane in architecture—other phenomena like “total internal reflection, and wavelength related effects (chromatic aberrations) caused by the refractive index of the object” (Ben-Ezra & Nayar, 2003, p. 1025) are invisible for human observers. A more complex and more accurate perforated model has been recently developed by the author, but seems to be redundant in the scope of current research (Brzezicki, 2013).
The condition of motion and the modification of the observer's point of view affect the resulting con-fused image, as the pane's parameters change dynamically according to the rules described by Fresnel's equations (Westin, et al., 1992). In the practice of observation, a smooth light-transmitting pane is generally believed to be seen “through reflection from its surface” (Patterson, 2011, p. 31) and glass is considered to be “primarily a reflective material” (Elsener, 2005, p. 152), but this varies according to the observation angle. Since reflectance is more or less constant within 45° from normal (a line perpendicular to the pane), the transmittance/absorbance phenomena produce the conclusive cues in the perception of transparency within this range. When the threshold of approx. 45° is exceeded, the influence of virtual images rises. The reason for this is the reflectance “increasing dramatically above 70°, to almost 100% reflection at glancing angle” (Wigginton, 1996, p. 71). Thus, cues seem to be different: depending on the prevailing optical phenomena, two perception zones are created (see Fig. 1). Specular reflection and the resulting virtual image become conclusive for the visual system to recognize transparency unequivocally for viewing angles exceeding 45°. The condition of motion influences the mutual change of both parameters affecting the retinal image produced.

Perception zones: transmittance zone within the 45° angle from viewing axis, and reflectance zone beyond the 45° threshold. In general, transmittance remains conclusive for frontal observation (observer O1), while reflectance prevails for viewing angles exceeding 45° (observer O2). Schematic drawing by the author.
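The angular dependence of reflectance that separates the two perception zones can be illustrated numerically. The following is a minimal sketch, not part of the original study: it evaluates Fresnel's equations for unpolarized light at a single air-to-glass interface. The refractive index of 1.5 and the function name are assumptions made for illustration, and a real pane, with two surfaces and some absorption, would reflect somewhat more.

```python
import math

def fresnel_reflectance(angle_deg, n_glass=1.5, n_air=1.0):
    """Unpolarized reflectance (0..1) of a single air-to-glass interface.

    n_glass = 1.5 is an assumed, typical value, not one taken from the paper.
    """
    theta_i = math.radians(angle_deg)
    # Snell's law gives the refraction angle inside the glass.
    sin_t = n_air * math.sin(theta_i) / n_glass
    cos_i = math.cos(theta_i)
    cos_t = math.sqrt(1.0 - sin_t ** 2)
    # Fresnel amplitude coefficients for s- and p-polarized light.
    r_s = (n_air * cos_i - n_glass * cos_t) / (n_air * cos_i + n_glass * cos_t)
    r_p = (n_glass * cos_i - n_air * cos_t) / (n_glass * cos_i + n_air * cos_t)
    # Unpolarized light: mean of the two intensity reflectances.
    return 0.5 * (r_s ** 2 + r_p ** 2)

# Tabulate reflectance across the two perception zones.
for angle in (0, 30, 45, 60, 70, 80, 85, 89):
    print(f"{angle:2d} deg -> R = {fresnel_reflectance(angle):.3f}")
```

Under these assumptions the reflectance stays close to 4-5% up to 45° and then climbs steeply towards grazing incidence, which is consistent with the division into a transmittance zone and a reflectance zone.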
All surfaces making up the observer's environment (distal stimuli) are transformed by the optical and neural systems of the observer's eye into proximal stimuli. Processing of these stimuli is divided into “three basic stages” (Wilson & Keil, 1999, p. lvii), which could be referred to as levels. The analysis of sensory input “occurs in steps, each of them processing the information received from the previous level” (Lindsay & Norman, 1977, p. 136). The initial stage of perception is sensation, which takes place in the data reception phase called low-level processing. A “coherent representation of surfaces and materials” (Anderson, 2011, p. R978) is established at the second stage of processing, also known as mid-level processing. The third stage in the process of stimulus processing—high-level—is perception, i.e., identifying actual objects in the observer's environment. It involves the most complex cognitive processes and many high-level mechanisms, due to the constant flow of information between the mid- and high-level stages. The processes appear to be organized as “a set of recurrent loops, not a simple linear chain” (Anderson, 2011, p. R978). Assigning transparency to the surface is “often considered to be a higher-level visual process that (…) utilizes stereo and motion information to separate the transparent from the opaque parts” (Kersten, Bülthoff, Schwartz, & Kurtz, 1992, p. 573).
The perception of transparency is based “upon context to determine the most likely interpretation.” (Wilson & Keil, 1999, p. 845). In most standard, uncomplicated situations, perceiving transparency in architecture occurs at the mid-level stage and is based on transmittance/absorbance, as every light-transmitting object “perturbs the visibility of the underlying layer” (Anderson, Singh, & O'Vari, 2008, p. 1146). The visual system relies upon the changes in luminance of the background caused by transmittance/absorbance of the pane and the particular configurations of adjacent regions of the retinal image. Since retinal images are not unambiguous and the same image can be interpreted in numerous ways (e.g., see Anderson, 2003, p. 793), the visual input unrecognized at the earlier stages is processed further. Should any dubious circumstances occur, it is necessary for the visual system to upgrade to high-level processing, when “identification and classification on the basis of previous experience” is carried out (Wilson & Keil, 1999, p. 64). In the context of transparency perception, the basic difference between mid- and high-level cognitive processes is believed to lie in the processing of optical phenomena occurring at the interface of materials with different refractive indices: specular reflection and light ray refraction (Ben-Ezra & Nayar, 2003). In the absence of basic transmittance-based cues, other optical phenomena exist that could be utilized by the visual system. These include reflectance and resulting specular highlights.
Transmittance-based Kinetic Perception and Occlusion
The visual system recognizes transparency as a “special case of superposition” (Arnheim, 1971, p. 257). As a result of a ray of light being transmitted through the transparent pane, both the luminance and the contrast of the image behind it change. Since a homogeneous transparent pane absorbs a constant proportion of light, the luminance of the obscured part of the image changes at the same rate within its entire area: various regions of the background are attenuated to the same extent, creating junctions at the edges, comprehensively classified by Kanizsa (e.g., X-junction: Kanizsa, 1979). Based on specific cues (the unique relation between intensities of neighboring regions of the retinal image), a scission process is performed by the visual system, causing the transparent layer and the background to become spatially discriminated (Metelli, 1974). As a result, depth can be recognized: two planes, “one transparent and one, i.e., located in the background” (Anderson, 2003, p. 794). In transparency studies, since the beginning of the 20th century, this mechanism has been called perceptual decomposition (Metelli, 1974; Kepes, 1995). Research “suggests that the brain, in some sense, ‘understands’ the physics of transparency” (Anderson, 2011, p. R980). A vital discussion is still taking place on the issue of what are the “critical image variables in order to assign surface attributes to transparent layers” (Singh & Anderson, 2002, p. 533), or “what it is that the brain understands” (Anderson, 2011, p. R980). The first mathematical model of this process was presented by F. Metelli (Metelli, 1974), but was later questioned by numerous researchers. This yet unresolved problem continues to present a challenge (Masin, Tommasi, & Da Pos, 2007; Masin, 2011), but lies beyond the scope of this paper.
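For orientation only, the scission described above can be sketched with the episcotister formulation commonly attributed to Metelli, in which a region seen through the layer has reflectance p = αa + (1 − α)t, where a is the background reflectance, α the transmitted fraction and t the reflectance of the layer itself. The function name and the validity checks below are illustrative assumptions; as noted, the model itself has been questioned, so the sketch shows the arithmetic rather than how the visual system actually works.

```python
def metelli_scission(a, b, p, q):
    """Solve the episcotister model for (alpha, t), or return None.

    a, b: reflectances of two background regions seen directly.
    p, q: reflectances of the same two regions seen through the layer.
    alpha: fraction of background light transmitted by the layer.
    t: reflectance of the layer itself.
    """
    if a == b:
        return None  # uniform background: the two equations are degenerate
    alpha = (p - q) / (a - b)
    if not 0.0 < alpha < 1.0:
        return None  # contrast reversed or amplified: no physical layer fits
    t = (p - alpha * a) / (1.0 - alpha)
    if not 0.0 <= t <= 1.0:
        return None  # layer reflectance outside the physical range
    return alpha, t

# Background regions 0.8 and 0.2 seen as 0.6 and 0.3 through the layer.
print(metelli_scission(0.8, 0.2, 0.6, 0.3))
# Contrast polarity reversed under the layer: no transparent interpretation.
print(metelli_scission(0.8, 0.2, 0.2, 0.6))
```

The first call corresponds to a layer that transmits half of the background light and has a reflectance of 0.4. The second call returns None, because reversed contrast polarity is incompatible with the model.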
Kinetic occlusion occurs “when nearby objects hide (or reveal) others beyond them as a result of observer or object motion” (Wilson & Keil, 1999, p. 255). In the case of superposition of transparent objects perceived in motion, elements of the see-through background are systematically attenuated by the occluding transparent object in a sequence that results from the direction of motion of the observer or the object. This motion affects the point of view and, according to the laws of perspective projection, creates an apparent mutual displacement of observed objects within the field of view (Norman, Todd, & Phillips, 1995). This process is called ‘structure from motion’ (SfM; Loomis & Eby, 1989) and describes the human ability “to perceive the 3-D shapes of moving objects on the basis of a succession of 2-D projections” (Loomis & Eby, 1988, p. 383) on the retina. If the elements of the background are not attenuated in the right sequence (as in parallax), the loss of the notion of transparency is inevitable, as the visual system tends to recognize the scene as flat, rather than 3-dimensional (no scission). Experiments with various grey-level patches were previously reported as producing the illusion of “perceptual transparency” (Singh & Anderson, 2002, p. 532); see Fig. 2.

Dynamic perception destroys the illusion of transparency produced by various correctly arranged grey-level patches, as the background attenuation does not correspond to the expected motion parallax. Schematic drawing by the author.
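The sequence in which background elements become attenuated follows from simple perspective geometry. The sketch below is an illustration under assumed conditions: a plan view, a pinhole observer moving sideways, and a flat pane parallel to the direction of motion. All names and distances are hypothetical and do not come from the study.

```python
def parallax_shift(depth, observer_step, focal=1.0):
    """Image displacement of a static point when the observer steps sideways.

    depth is measured from the observer's line of motion; nearer points
    shift more, which is the motion parallax referred to in the text.
    """
    return -focal * observer_step / depth

def seen_through_pane(observer_x, bg_x, bg_depth, pane_left, pane_right, pane_depth):
    """True if the sight line to a background point crosses the pane."""
    # Lateral position where the sight line meets the plane of the pane.
    crossing_x = observer_x + (bg_x - observer_x) * pane_depth / bg_depth
    return pane_left <= crossing_x <= pane_right

# Hypothetical scene: pane 2 m away, spanning x = -1.0 .. 0.2 m,
# background element 10 m away at x = 1.2 m.
for observer_x in (0.5, 0.0, -0.5):
    attenuated = seen_through_pane(observer_x, 1.2, 10.0, -1.0, 0.2, 2.0)
    print(f"observer at x = {observer_x:+.1f} m -> attenuated: {attenuated}")
```

In this hypothetical scene the background element is seen directly from the first two positions and becomes attenuated only after the observer has stepped far enough to the left. A flat composition of grey patches cannot reproduce this dependence on the observer's position, which is why motion exposes it as an illusion.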
The change in luminance of the background behind the transparent pane is the main indicator of light transmission; however, as R. Arnheim observed, “if the shape of physically transparent surface coincides with the shape of the ground, no transparency is seen” (Arnheim, 1971, p. 258), or “in order to recognize any figure, one must first perceive the background, from which this figure emerges” (Dominiczak, 2003, p. 57). This condition seems rare, but is in fact common in the practice of architecture observation. Large smooth panes often fill the entire field of view, especially in the proximity of buildings. The transparent panes in shopping windows, doors, and freestanding screens might serve as a good example of this. The evaluation of transparency is very difficult, especially within a 45° angle from normal. The lack of an unoccluded reference surface and the absence of a virtual image causes the entire pane to ‘vanish.’ The glass simply ceases to be noticed by the observer (see Fig. 3). Only the observer's motion, followed by the change in point of view, could provide information about the reference area or the virtual image, and contribute to the correct recognition of its transparency. This is the main reason why, in certain countries, there are building regulations that oblige a building's owners to put clear visual manifestations “at two heights, 850–1000 mm and 1400–1600 mm above the finished floor level” (Glazing – safety in relation…, 2000, p. 6) on glazed doors and other transparent surfaces. This is crucial, especially in the case of transparent surfaces arranged in large screens, with edges located outside of the observer's field of view.

High transmittance (low absorbance) and frontal perception exclude all possible cues for the perception of transparency. The entire glazed wall seemingly vanishes (A). The perpendicular pane (the same type of glass) is thus perceived based on an almost exclusively virtual image (B). Case study: Conservatories of Cité des sciences et de l'industrie in Paris, France (A. Fainsilber & P. Rice, 1986). Photo by the author.
In high-level processing, the visual system uses a number of comparative mechanisms to identify transparent surfaces. The most important mechanisms are those that perform content analysis of the retinal image; e.g., detect alterations of the image transmitted through the pane. In the case of distortion, image identification is performed in the final stages of perceptual classification of the observed object. It is at this stage that “sensory data is confronted with mental data” (Necka, Orzechowski, & Szymura, 2006, p. 279); i.e., with the mental representation of the object retained in the observer's memory. Two existing theories, the template theory (e.g., see Bülthoff & Edelman, 1992) and the feature-based theory (e.g., Lindsay & Norman, 1977), lie beyond the scope of this paper.
Light ray refraction is usually caused by an unevenness in the transparent pane's relative thickness (pane defect) or a deliberately figured pane structure (e.g., in ornamental glazing). As a “transparent object does not have features of its own”, it simply maps “features that exist in its environment onto the image plane” (Ben-Ezra & Nayar, 2003, p. 1026). Thus, refraction-based perception of transparency is the result of interpreting a sequence of distorted images of the background, observed through light-refracting media. This sequence results from the observer's or object's motion. A single (static) distorted image is insufficient, as the apparent distortion could have been an intrinsic feature of the background. Along with the processing of the distorted background, a hypothesis is made that there exists a transparent medium generating this distortion, and, most notably, that its physical form is being reconstructed simultaneously (e.g., negative or positive lens). The mechanism of this activity is complicated, but was recently modeled as part of the research concerning machine vision (Ben-Ezra & Nayar, 2003). It is assumed that the visual system relies on the observation that all rays associated with a single feature of the background “are parallel to each other before they are scattered [refracted] by the transparent object” (Ben-Ezra & Nayar, 2003, p. 1027). This allows the visual system to reconstruct the shape of the refracting object by measuring the parallelism of perceived rays. Simply put, the manner in which the background changes confirms the existence of transparent media.
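How a local unevenness in thickness displaces the background can be estimated with Snell's law alone. The sketch below is not the machine-vision model of Ben-Ezra and Nayar. It is a deliberately simplified assumption: the defect is treated as a small glass wedge, the ray enters perpendicular to the first face, and the refractive index is taken as 1.5. Function names are hypothetical.

```python
import math

def wedge_deviation_deg(wedge_angle_deg, n_glass=1.5):
    """Angular deviation (degrees) of a ray crossing a glass wedge.

    The ray is assumed to enter perpendicular to the first face, so it is
    refracted only at the second face, tilted by wedge_angle_deg.
    """
    a = math.radians(wedge_angle_deg)
    sin_out = n_glass * math.sin(a)  # Snell's law at the exit face
    if abs(sin_out) > 1.0:
        raise ValueError("total internal reflection: no transmitted ray")
    return math.degrees(math.asin(sin_out)) - wedge_angle_deg

def background_displacement(wedge_angle_deg, background_dist, n_glass=1.5):
    """Apparent lateral shift of a background feature at background_dist."""
    deviation = math.radians(wedge_deviation_deg(wedge_angle_deg, n_glass))
    return background_dist * math.tan(deviation)

# A 1 degree local wedge and a background feature 10 m behind the pane.
print(wedge_deviation_deg(1.0))
print(background_displacement(1.0, 10.0))
```

Under these assumptions a local wedge of 1° deviates the ray by about half a degree, so a feature 10 m behind the pane appears displaced by several centimetres. As the observer moves, sight lines cross regions with different local wedge angles, which yields the sequence of changing distortions discussed above.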
Image distortions occur in both the real image transmitted through the pane (background distortions – see above) and the virtual image (specular image distortions – see below). In both cases, the transparency perception mechanisms based on deformation are similar and lead to the awareness of the presence of a medium that refracts rays of light.
Reflectance-based Kinetic Perception
Materials with rough surfaces bounce the light off multidirectionally: reflected light is scattered due to the unevenness of the surface. Smooth materials interact with light in a different manner: they reflect rays of light unidirectionally, in a process known as specular reflection (from the Latin speculum, meaning mirror). A single incident ray of light is “reflected into a single outgoing direction” (Shah, 2007, p. 43) resulting in the occurrence of a laterally inverted virtual image. The visual system relies on virtual images in the process of identifying smooth (glossy) objects.
In the case of transparent objects, the real image transmitted through the pane is overlaid with the virtual image reflected off the pane. Flat panes generate specular (mirror-like) reflections, while curved glossy objects generate highlights. Thus, select areas of the retinal image are dominated by the real or the virtual image, while other areas will form “an inextricable blend of the two” (Downing, Liu, & Kanwisher, 2001, p. 1334). These distinctive features of the retinal image are not strong enough to ensure the recognition of transparency in stasis, as what seems to be the virtual/real image superposition could have been a pattern applied onto the object (Nayar, Ikeuchi, & Kanade, 1989). In the condition of motion, the virtual reflected image and the real transmitted image would shift in relation to each other and according to the observer's displacement (see Fig. 4). The individual ratio of this shift would depend only on the object-pane distance, regardless of whether the image is real or virtual (although the latter is laterally inverted).

Both virtual and real images would overlap and shift relative to each other and according to the observer's motion. Observer O3 sees the tree on the left side whereas the observer O2 sees the tree on the right side of the house.
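The claim that the shift depends only on the object-pane distance follows from the geometry of a flat mirror, which places the virtual image as far behind the pane as the object is in front of it. The following plan-view sketch illustrates this under simplifying assumptions: a perfectly flat pane in the plane z = 0 and a pinhole observer. Coordinates and names are hypothetical.

```python
def mirror_across_pane(point):
    """Virtual image of a point reflected in a flat pane lying at z = 0."""
    x, z = point
    return (x, -z)

def image_coordinate(point, observer):
    """Lateral image coordinate (tangent of the viewing direction)."""
    px, pz = point
    ox, oz = observer
    return (px - ox) / (pz - oz)

def shift(point, observer_before, observer_after):
    """Change of the image coordinate caused by the observer's step."""
    return image_coordinate(point, observer_after) - image_coordinate(point, observer_before)

# Observer 4 m in front of the pane (negative z), stepping 1 m sideways.
before, after = (0.0, -4.0), (1.0, -4.0)
tree = mirror_across_pane((3.0, -8.0))  # tree 8 m in front of the pane
house_near = (-1.0, 8.0)                # house 8 m behind the pane
house_far = (-1.0, 20.0)                # house 20 m behind the pane

print(shift(tree, before, after))
print(shift(house_near, before, after))
print(shift(house_far, before, after))
```

In this sketch the reflected tree and the house at the same distance from the pane shift by the same amount, while the more distant house shifts less. It is this difference in shift that lets the two superposed images separate when the observer moves.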
Kinetic changes of the pane's parameters can also lead to an unequivocal interpretation of incoming superposed images. According to the previously mentioned Fresnel's equations, the proportion of light reflected off the surface increases with the rise of the observation angle. In the condition of motion, the ratio of superposed images visibly changes, as “reflection and depth are the two poles of the perception spectrum of glass” (Elsener, 2005, p. 151). A gradual transition of the pane from reflection to transmission serves as a good example of this. When the observation of a pane begins at an oblique angle, the pane appears to be reflective; from this point of view, the observer cannot tell the difference between a mirror and a transparent surface, as both look alike (left panel of Fig. 5). As the angle of viewing decreases, reflection is reduced and the transparent quality of the pane progressively appears (right panel of Fig. 5). In this case, transparency cannot be properly recognized through static perception; a change of the viewpoint is necessary to establish whether the pane is a flat mirror or a transparent surface. Without motion on the part of the observer, proper identification would not be possible.

From an oblique point of view, the observer cannot tell the difference between a mirror and a transparent surface, as both look alike (left). With the change in viewpoint, the transparent quality of the pane progressively appears (right). Case study: Tetragon Office Building in Frankfurt, Germany (Bundschuh/Baumhauer Gesellschaft von Architekten mbH, 2003). Photo by the author.
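The gradual transition from a mirror-like to a see-through appearance can be approximated by weighting the luminance in front of and behind the pane by its reflectance and transmittance. The sketch below is an illustration, not a measurement. It uses Schlick's approximation in place of the full Fresnel equations, ignores absorption and the second glass surface, and assumes a refractive index of 1.5. The function names and luminance values are hypothetical.

```python
import math

def schlick_reflectance(angle_deg, n_glass=1.5):
    """Schlick's approximation of unpolarized reflectance for air to glass."""
    r0 = ((n_glass - 1.0) / (n_glass + 1.0)) ** 2  # value at normal incidence
    cos_i = math.cos(math.radians(angle_deg))
    return r0 + (1.0 - r0) * (1.0 - cos_i) ** 5

def virtual_image_share(angle_deg, lum_front, lum_behind, n_glass=1.5):
    """Fraction of the superposed image contributed by the reflection.

    lum_front: luminance of the scene on the observer's side of the pane.
    lum_behind: luminance of the scene behind the pane.
    Absorption is ignored, so transmittance is taken as 1 - reflectance.
    At least one of the two luminances must be non-zero.
    """
    r = schlick_reflectance(angle_deg, n_glass)
    reflected = r * lum_front
    transmitted = (1.0 - r) * lum_behind
    return reflected / (reflected + transmitted)

# Equal luminance on both sides of the pane, observer walking past it.
for angle in (85, 75, 60, 45, 20, 0):
    share = virtual_image_share(angle, 100.0, 100.0)
    print(f"{angle:2d} deg -> share of virtual image = {share:.2f}")
```

With equal luminance on both sides, the reflection dominates only at very oblique angles and falls to a few percent in frontal view. Raising lum_front relative to lum_behind, as with a sunlit street in front of a dim interior, extends the range of angles over which the pane reads as a mirror.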
In the case of dynamic transparency perception, a segregation of superposed images is vital to perform the scission and achieve spatial discrimination. In 1999, an interesting experiment was performed by Downing, et al., in which two images were superposed, so that they were “transparently overlapping at the same location” (Downing, et al., 2001, p. 1333). One of the images was dynamic, while the other remained static, thus achieving an effect similar to a real life kinetic scenario. The observer was to direct his attention to one of the stimuli, thus activating the appropriate areas in the temporal cortex: FFA (Fusiform Face Area) when observing faces, or PPA (Parahippocampal Place Area) when observing buildings. The experiment shows that the visual system is able to perform a mental switch from receiving one type of information to another, thus making the spatial discrimination of surfaces possible, in order to perceive transparency correctly. According to the authors, the process of image segregation proceeds as follows: the object (real or virtual), being in the focus of attention, is recognized via high-level processing and the information underlining the features corresponding to the recognized object goes back to low-level processing (where accommodation occurs). The top-down processes guide attention in such a way as to “allow selection and grouping of the unambiguous features” (Downing, et al., 2001, p. 1334) of the objects fitting the already-recognized object, based on a pre-existing mental representation in the visual system. This exchange of information is performed repeatedly, until proper recognition is achieved.
Virtual images of the environment that appear on smooth surfaces are used by the visual system as a cue for recognizing the type of material: e.g., the transparency. As specular reflections (virtual images) are “simply images of the environment (…) distorted by the geometry of the reflecting surface” (Anderson, 2011, p. R980), the virtual image can be perceived and processed just like a real image would be. For smooth surfaces, “the light reflected toward the point of observation (…) produces the appearance of specular highlights” (Norman, et al., 1995, p. 629). Within the retinal image, these highlights (areas of higher luminance) are distinctly different from the observed object. When curved smooth objects are considered, the image contained within these particular areas is difficult to process due to the significant distortion of the image. In turn, their location alone can constitute an important cue for the visual system, as the highlights make it easier to build a three-dimensional mental model of the object (see Veeraraghavan, Tuzel, & Agrawal, 2010). The highlights also “affect observers' judgment of surface gloss” (Blake & Bülthoff, 1990, p. 165; Blake & Bülthoff, 1991, p. 245) and “provide a general constraint for the estimation of material properties” (Motoyoshi, 2010, p. 8). It seems reasonable to theorize that transparency recognition is the result of the perception of highlights according to a mechanism in which shine and gloss equal transparency. As already discussed, the particular “relationship between specular highlights and non-specular shading patterns is a robust cue for the perceived translucency and transparency of three-dimensional objects” (Motoyoshi, 2010, p. 1).
In the condition of motion, these specular highlights “tend to move very rapidly over relatively flat regions of a surface and to cling more stably in regions of high curvature” (Norman, et al., 1995, p. 629). This velocity of the highlight's displacement might be used by the visual system to reconstruct the shape and the spatial orientation of an object (Blake & Bülthoff, 1991), based on highlights only “in the absence of any identifiable features.” (Norman, et al., 1995, p. 635). Highlights, especially during the nighttime, serve as a relatively reliable cue for the assessment of the type of material in architecture (see Fig. 6).

Visible connection of gloss and transparency (translucency). A visible relation of specular highlights and non-specular shading patterns. A glossy, shining surface that communicates the transparent type of material during the nighttime. Case study: Kunsthaus in Graz, Austria (P. Cook & C. Fournier, 2003). Photo by the author.
Conclusions
Dynamic perception is essential for the proper perception of transparency. It enables the visual system to collect more data before the recognition is concluded. Due to the ambiguity of the retinal image, kinetic perception is required to distinguish whether the object perceived is a flat pattern (e.g., patch composition) or a bona fide transparent element. Motion is not crucial for the scission process to begin, but facilitates the detection of any potential illusions.
At the architectural scale, two vital zones of perception are created, each with its distinct phenomena responsible for the formation of transparency perception cues: absorbance/transmittance and ray refraction in frontal observation, reflectance and its local changes (highlights) at oblique viewing angles. As the observer of architecture actively moves during perception, the phenomena described should be carefully studied if the architect's intent is to design a truly transparent and safe building. The understanding of the mechanism analyzed above helps to determine the circumstances in which a building would be perceived and recognized according to the designer's intentions.
The following checklist should serve as practical advice for designers dealing with large surfaces of transparent materials:
Try to assess the luminous fluxes that will affect the transparent panes in the building, both during daytime and nighttime. Consider the sun path, backlighting, and reflections from surrounding buildings. Create a light map, depicting the possible shaded and well illuminated areas.
Study different points of view, as many as possible, equivalent to real observers' viewpoints. Consider any possible prevailing optical phenomena that could potentially influence perception.
Identify the areas from which the building's transparent surfaces are perceived at oblique angles, and where frontal observation occurs. Anticipate any possible perceptual mistakes that could generate a potential hazard, and advocate for the application of proper manifestations on transparent panes.
Try to visit the site. L. Mies van der Rohe analyzed glazing appearance using small scale models of buildings exhibited on the building site itself (Colomina, 2009, p. 82). Only in this way can real luminous fluxes and the resulting reflections of the surroundings be observed.
