Abstract
Full-horizon cylindrical projections of the optic array are in common use. One wonders whether the public actually profits from such pictorial information, since the space behind one’s back does not exist in visual awareness. In an experiment, a test image included six persons located at the corners of an irregular hexagon centred at the camera. Two persons faced the camera, two turned their back to the camera and two others faced a direction at right angles to the camera. The distances to the camera were unequal and varied from 1 to 2 m. Participants were asked to draw a ground plan of the perceived configuration, including actors and camera, on the basis of viewing the picture. As with any picture there exist many possible interpretations, the ambiguity grows even more when the angular scope of the picture is unknown. Almost all naïve viewers parse this planispheric (Mercator) representation so as to have the whole scene in front of them, with the actors standing in a circle, facing each other. They take the viewpoint to be outside the circle. Only a few placed the viewpoint inside the circle, which is indeed another reasonable interpretation (in this case the actual one).
Introduction
Full optic array (Burton, 1945; Gibson, 1950; Koenderink, Albertazzi, et al., 2010; Reid, 1819) panoramic cameras are becoming increasingly available and popular.
Naïve observers have considerable problems to deal with very wide-angle (but less than 180°) images (Attneave & Farrar, 1977; Koenderink, van Doorn, de Ridder, & Oomes, 2010; Phillips & Voshell, 2009).
In the common full-horizon, cylindrical projections (e.g., the popular equirectangular or equidistant rectangular, French: plate carrée, German: quadratische Plattkarte; Snyder, 1993) observers routinely ignore the periodic structure (Koenderink & van Doorn, 2017). This happens even when interactive panning is available. In popular ‘rolling ball’ dynamic, interactive renderings one routinely confuses the interior and exterior orientations (Koenderink & van Doorn, in press). The problems are due to a lack of intuitive grasp of the topology, the fact that the horizon is a closed curve and that the viewing sphere is viewed from the inside, whereas rolling ball graphics show the outside.
One might say that the full horizon is a ‘magic circle’ that – in depictions – cannot be entered by the observer (Buckland, 2002). Something similar applies to the full viewing sphere. In this contribution we explore that notion in more – also quantitative – detail.
In this article, the emphasis is on static images of the familiar ‘postcard’ variety, that is, roughly A5-size viewed informally at normal reading distance. The most popular static representations are cylindrical projections with the horizon represented as a (straight) horizontal line. Various cylindrical projections, in which the verticals are rendered as vertical straight lines, are in common use. In this experiment the Mercator representation, a planispheric, conformal map is used (Mercator, 1569). Its conformal property is nice whereas global deformations are limited if the elevations are not too close to either zenith or nadir. 1
Such a cylindrical full-horizon rendering yields a rectangular picture. Its aspect ratio depends on the range of elevations. So we skip two ‘polar-caps’ centred at zenith and nadir. In practice, we decided on a desired aspect ratio (postcard format) and let that constrain the extreme elevations. At first blush such a picture appears as a regular postcard.
In previous experiments (Koenderink & van Doorn, 2017, in press) we showed that human visual awareness is simply unable to deal with such renderings. In these experiments we tried hard to make the visual task as simple as possible for the observers. In contradistinction, in the present experiment we intentionally set up a spatial configuration in such a way as to ‘fool’ the observer: By having the pictorial content conform to a familiar scene we enforce the ‘generic postcard’ situational awareness and thus silently suggest an inappropriate scope. It is indeed not hard to design spatial configurations that will almost certainly be perceived in some specific manner by the overwhelming majority of naïve observers. This is just intuition; there is no science of the matter. This ability might well find applications in – for instance – the movie business.
Well-documented misinterpretations (Koenderink, van Doorn et al., 2010; Koenderink & van Doorn, 2017) are due to the fact that a postcard type of image is routinely interpreted as the rendering of a ‘normal’ field of view, roughly spanning
In the case of extremely panoramic (horizontal field of view exceeding
The effects can often be predicted at least semiquantitatively from simple models of the structure of visual space, as we have shown for extreme wide-angle views (Koenderink, van Doorn et al., 2010). To extend this to cases that include the space behind the back is such an extreme extrapolation as to be a shot in the dark. As we show here, it actually works quite well. Since most readers will be unfamiliar with the log-polar visual space model (Koenderink & van Doorn, 2008), we succinctly summarise the basics in the following.
Methods
The aim was to design a spatial configuration in such a way that naïve observers would be grossly deceived in their ‘reading’ of the picture. We aim at categorical mistakes, rather than mere (even if large) quantitative errors. The aim was to produce a planispherical map that could readily be confused with a regular photograph showing a ‘normal’ scope of 40° to 80° instead of the full 360°.
Design of the Configuration
One would like to apply the Ames room technique (Ittleson, 1952) to create optically equivalent configurations. However, in the case of full-horizon renderings this is evidently not possible, because the Ames room technique conserves the identity of ‘visual rays’ and merely shifts positions on these rays. For the present application one also needs to change the visual directions. This can be done using a general model of visual space (Koenderink & van Doorn, 2008).
We have shown the power of this model for the case of extreme wide-angle (e.g., fish-eye lens) views (Koenderink, van Doorn et al., 2010; Koenderink & van Doorn, 2017). Effects are huge, observers committing misjudgements of the spatial attitude of pictorial objects exceeding
Here we go categorically beyond that, including the space ‘behind the camera’. There exist no observations on such cases, so we are in no position to predict how this will work out. The difference is indeed categorical because human observers are physiologically limited to the optical space in front of them (Helmholtz, 1896). Going beyond that even introduces changes of a topological nature, so all bets are off. We aim at a design that might be interpreted in two, mutually very different, ways: In one interpretation, the camera is at the centre of a group of people and in the other interpretation, the camera is outside the group. The mutual spatial attitudes in the two interpretations are also very different. Here is the design:
Consider a group of six persons facing each other, positioned on the vertices of a regular hexagon. In the case of a ‘normal’ photograph, the camera will be outside the hexagon. Usually, the distance of the camera to the group will be large with respect to the diameter of the hexagon, on an axis of bilateral symmetry, with three persons on each side of the main camera direction. Then, two persons will roughly face the camera, two will turn their back to the camera and two will be seen in profile, facing the camera axis. The persons facing the camera will be at the largest, those turning their backs to the camera at the shortest and those seen in profile at some intermediate distance from the camera.
For a full 360° panoramic view, one obtains a configuration in the ground plan shown in Figure 1 left. (Figure 1 right shows two areas defined by the points of the hexagonal configuration that will be discussed later in the article.) We picked a minimum distance of 1 m in order to obtain an angular height of about Left: Ground plan for the photograph used in the experiment. The camera is at the central orange disk. Notice that the actors are at distances of either 1 m, 141 cm (
In Figure 2 we show a simulation on a regular tiled floor with square tilings in Mercator projection, thus fully exhibiting the actual layout. Yet, what we see (as colleagues that were shown the image) is a group of actors arranged in a circle, facing each other. This suggests that observers are generally unable to make good use of such images, even if they know what they are looking at. We already know that from experience with full-horizon images in various settings (urban scenes, natural landscapes and indoor scenes). Given the growing use of full-horizon images, it is important to gain some notion as to the degree of veridicality of their immediate impressions.
An artificial scene. The ‘actors’ (mutually identical Daruma dolls) are placed on an infinite floor with regular square tiling. By noticing the vanishing points of the grout lines at the horizon, you easily figure out the extent of the field of view. By following the curved grout lines (which are actually straight lines), you may figure out the actual orientation of the figures with respect to each other. One should be able to work out the actual layout, all the necessary cues are explicitly present. Few observers can do this a prima vista though.
In this report we test an actual image on a large number of observers. This allows us to arrive at expectations of the kind of configuration observers are likely to report. Such expectations depend on two generic principles, namely: firstly, the spatial attitude of visual objects is judged with respect to the local line of sight, and secondly, depicted scenes are taken to be located in a half-space in front of the viewpoint. The scene designed here can be interpreted in an infinity of ways, although two might be termed ‘cardinal’. In a later section these cardinal views are compared in some detail. In the experiment we attempt to quantify the probability of obtaining one or the other cardinal view. As argued earlier, we expressly set up the scene to suggest one of these. The depiction of the full-horizon scene is typical for the images that are already widely used by the general public.
The configuration was set out in the Leuven city park. The locations of the actors were previously constructed with the help of a measuring tape and indicated by hammering tent pegs in the ground.
In Figure 3, it is shown what was in front and what at the back of the camera. Here the ‘frontal direction’ has only a formal meaning (indicated in Figure 1 left), since the camera itself has no preferred ‘viewing direction’ but is fully isotropic. The stereographic projections each show a full half-space. Notice that their (circular) outlines coincide.
A stereographic map of what is ‘in front’ (left) and what is ‘at the back’ (right) of the camera. The scene layout is shown in Figure 1 left. The zenith and nadir are represented at top and bottom points in either picture. The circular outlines of the pictures coincide. Like the Mercator map, the stereographic map is conformal, but not area true.
The photograph destined to be used as the stimulus was taken with a Ricoh Theta S panoramic camera from roughly average navel or breast-height. Thus the horizon will roughly bisect the vertical extent of the figures, which is desirable in order to balance the magnification increase with visual height of the Mercator projection.
The camera was remote controlled by means of an iPhone, so the photographer does not appear in the photograph. The Mercator map (see Figure 4) was computed by way of a simple program written in Processing 3. It can hardly be distinguished from the raw camera image (an equirectangular map).
The photograph used as stimulus in the experiment. The ground plan of the physical configuration is shown in Figure 1 left. The projection is Mercator, with the horizon at the half-height of the picture. Notice that the actors are equally spaced on the horizon. The pictorial vertical extent of the actors varies by about a factor of two (the persons were not of exactly the same size). The actors are seen in either anterior, posterior or lateral (profile) view. The location of the feet is another cue to distance: since the terrain was roughly horizontal, higher in the picture plane indicates greater distance from the camera. The picture has a periodic topology in the sense that the left and right edges show the same direction of view (purely posterior). Zenith and nadir are at 
Definition of coordinates and angles
The geometry of the experiment is sufficiently complex that it may be confusing for most, thus we clarify some definitions. The reader is invited to use Figure 5 as a reference.
Top left. The azimuth is defined with respect to the forward direction defined in Figure 1 left, angles being counted in the clockwise direction (in this figure, all ranges have been set to the same value). Top right. This is a sketch of how the configuration appears to most observers, except that the ranges and azimuths have been set to their veridical values. Bottom. Gaze directions are reckoned with respect to the local direction of view, thus they are zero for actors in the dorsal, 
The figure at top left shows the definition of the azimuth, the figure at top right shows an idealised response in which the azimuths and distance ratios are veridical and the figure at bottom shows the definition of the gaze angles. In the figure at top right we also indicate indices for the locations, which often come in handy in discussion.
Just for exercise, in the figure at top left, the gaze angles for Indexes 1 to 6 are
Design of the Experiment
In the final experiment, observers were handed a piece of A4 paper (see Figure 6) in portrait orientation with the Mercator map as shown in Figure 4 printed in the upper half. They could fill in their personal data (name, age, gender and date) on a line at top. The lower half had an outlined, square, empty area with the instructions in the right margin. No further instruction was provided.
A sheet as used in the experiment. At top the stimulus and at bottom right the instructions. At bottom left the drawing area (which blank at the start of a trial). Here we show a typical drawing, this is the ‘response’. (All responses made available as a movie on the publisher's website). Notice that the method is self-documenting.
All analysis was done on the basis of the drawings, a typical one being shown in the drawing area of Figure 6. These drawings were digitised by hand using a program specially written for the occasion. The centres of gravity of the person marks and the camera marks were judged by eye.
Observers
Observers were students and staff at the institutes of experimental psychology of the universities of Leuven and Giessen. Median age was 32 and interquartile range was 27 to 37. The gender ratio was 39% female. A total of 61 persons participated in the experiment, 24 from Leuven and 37 from Giessen.
Observations
Participants had no conceptual problems with the task. As expected, no one asked what the field of view was. (If anyone had done so we would have offered that information.) Problems that occasionally occurred were due to the familiar fact that people often underestimate the final size of their drawing, thus finding themselves short of free space at some phase in the process. When the camera was drawn outside the frame, this was considered acceptable.
In cases where they inquired in retrospect whether they were ‘right’, they often had trouble to make sense of the physical ground plan and to relate it to the picture. Even after ‘knowing the solution’, no one could intuitively ‘see’ it. Quite a few even failed to understand the ‘solution’ at all. This lack of ability to ‘see’ the actual configuration also holds true for the authors: Although we obviously know exactly what the actual scene was like, we can only relate that to the picture in reflective thought, using geometrical and logical reasoning, we cannot spontaneously see it. In that respect it is similar to the classical geometrical illusions, where knowing the actual geometry does in no way help to get rid of the illusion.
The drawings were digitised into ordered lists of Cartesian coordinate pairs. All subsequent analysis was done on these data. 2
Analysis
Various types of analysis can be done on this data. However, we are careful not to overanalyse the data in view of the fact that participants delivered quick sloppy drawings and did not use any drawing or measuring instruments.
Initial Categorical Checks
At an initial stage we removed four items from the list because obviously they were incoherent. In one case this was known to be due to language problems. Formal reasons were such factors as, for instance, a number of actors different from six, cases of all actors facing the camera, evidently in conflict with the pictorial content, or extreme long time to reflect on the response. This left 57 cases.
In 55 out of 57 cases (96.5%), the camera was located outside the convex hull of the actor locations. With a Bayes factor of 9.5 we have ‘substantial’ evidence (using Jeffreys prior and scale) for the fact that the observers located the camera at some distance in front of the group.
Another coarse check involves the nose directions. In 50 out of 57 cases (88%), all noses pointed into the interior of the convex hull. With a Bayes factor of 8.9 we have again ‘substantial’ evidence for the fact that all observers faced into the interior.
Parameter Estimates
A very simple starting point is an overall average. First we computed the mean and covariance matrix for the six actor locations. The drawing was then translated, rotated and scaled to place the mean at the origin, the orientation of the largest variance horizontal, with the largest variance equal to 1. This brings all results into a common format. This is a necessary stage because observers used different placements, orientations and sizes in their drawings.
We then averaged over the full group, using circular statistics for the nose directions. In Figure 7 we show the result. The ellipses for the locations and sectors for the directions have been drawn at one standard deviation.
The normalized (see text) and averaged configuration. The camera position is indicated in orange, the actors in pink and the nose directions in blue. The thick black line suggests the spatial configuration of the group, the thin black lines the directions of the actors as seen from the camera location.
The result is encouraging, for it shows that the normalised data is quite homogeneous. This result is perhaps even better than expected.
The apparent scope
One simple measure is the apparent angular scope of the configuration as subtended by the group seen from the camera location. The median is 104°, with an interquartile interval of 89° to 119°. The histogram is shown in Figure 8.
Histogram of the apparent scope.
With a median wider than a right angle, the scopes are quite wide. However, it is evident that the scope is certainly less than 180°. Thus, virtually all participants saw the picture as pictorial content ‘in front of them’ or perhaps ‘behind the picture plane’. This is a major qualitative result.
Aspect ratio
For this initial pass through the data we defined the ‘aspect ratio’ as the square root of the ratio of the eigenvalues of the covariance matrix of the positions in the normalised data. The eigenvectors are almost all (exceptions noted later) in the anterior–posterior and left–right orientations, so this is a reasonable estimate in most cases.
The majority of the configurations have an aspect ratio quite different from 1. In all cases, the elongated hexagon is oriented with its major axis at right angles to the principal direction of view (see Figure 7). The configurations are extended in the lateral orientation or – equivalently – flattened in the (frontal) depth direction.
The median aspect ratio is 0.46 and the interquartile range is 0.34 to 0.59. In Figure 9 we show a histogram of the distribution. (In the discussion we have occasion to address the aspect ratios from another perspective.)
Histogram of the aspect ratios of the hexagonal group.
Depth ratios
The range ratios (camera to actors) are either 1, Distributions of depth ratios. Notice that the cases of 
The unit range ratios are
For the range ratios 2 we use
The range ratios square root of 2 show a bimodal distribution. Here we need to distinguish between the near and the far range. In the near range we consider the ratios
Thus the results for the
Apparent azimuths
The azimuths at which the camera sees the actors are spaced at
Figure 11 shows a plot of the observed azimuths against the veridical ones. Apparently, the observed azimuths of the outermost actors are highly compressed. This is already evident from the average plot (Figure 7).
A plot of the observed azimuths against the veridical ones (Remember that 
Nose directions
The deviations (median observed nose direction minus the veridical direction) have been plotted against azimuth in Figure 12. (The figure also shows the interquartile ranges.) Clearly, the observations differ appreciably from veridical.
The deviations from veridical of the observed nose directions, plotted against the azimuth. The dots are medians, the interquartile range is also indicated, though hardly apparent.
Discussion
First some special cases are discussed and a discussion on generic results follows. Although the special cases get some attention here, one should not forget that the bulk of the data is best represented by the averages shown in Figure 7. The special cases form only a small minority. That is not to say they are not of major interest though. Of the two interpretations suggested in Figure 1 right (a fuller discussion follows later) one has the viewpoint interior to the circle of actors and the other exterior to that circle. (Read ‘in the round’, or something like that for ‘circle’ here, we obviously do not intend a perfect geometrical circle.) Both are quite reasonable interpretations. Whereas it is indeed very interesting to note that the large majority voted ‘exterior’, it is highly relevant that some voted ‘interior’: It shows that participants have a choice. 3 A formal analysis also reveals two categorically different but optically equivalent (Ittleson, 1952) interpretations. Thus, it is a notable fact that generic observers have a very pronounced preference for one of these.
Inside the Magic Circle
There were two observers who entered the magic circle (see Figure 13). These are mutually very different cases.
The two observers who entered the magic circle. Locations and directions of the nose are indicated by the white dots with thick black line elements. The cross denotes the origin of the (normalised) coordinates. The black dot indicates the camera location.
In the case of Observer 44 we see a standard configuration except for the location of the camera. Here the view is from outside the magic circle, whereas the camera has been placed inside. Of course, this latter placement is fully inconsistent with the configuration. The fact that two actors evidently had their back turned towards the camera is completely ignored. This drawing is possibly a mixture of what the observer actually saw and some aspect of what the observer knew about
The case of Observer 51 is more interesting. This was in fact the only observer that entered the magic circle in a qualitatively consistent manner, the inconsistencies (which are not at all minor) being of a quantitative nature. This observer marked the actors both in the stimulus and in the drawing, so we can be certain that the two noses pointing outwards belong to the actors seen from the dorsal side and that the noses pointing at right angles to the visual direction belong to the actors seen in profile. The responses are indeed quite consistent, except for the fact that the distance ratios have been fully ignored. Apparently the urge to see a ‘circle’ was much stronger than the additional visual evidence.
There was actually another observer among the ones left out from the data set who started with drawing a circle, putting the camera at the centre, the actors in a regular hexagon – fully ignoring angular size and horizon dip cues. The observer then rounded this off by putting the nose directions as would fit a view from far out of the magic circle. In this case the drawing was evidently made on the basis of guesses in reflective thought as was also evident from the time taken to reflect (the reason for the decision to ignore this response).
Convexity
There are three nonconvex responses, two of these trivial, because of minor sloppiness in drawing, the remaining one is interesting and merits mention (Figure 14).
The response of Observer 17. Locations and directions of the nose are indicated by the white dots with thick black line elements. The cross denotes the origin of the (normalised) coordinates. The black dot indicates the camera location.
The configuration is evidently nonconvex and the scope is unusually large, about
We do not discuss the convexity issue further at this point, since it will be more fully treated in terms of the log-polar model introduced later. (A configuration that is nonconvex in physical space may well be convex in visual space.)
The Overall Qualitative Outcome
The major result of the experiment is neatly summarised in the average results depicted in Figure 7. Apparently the actors are perceived as arranged in a slightly flattened hexagon, its long axis perpendicular to the principal direction of view (defined in Figure 7). The camera is located outside the hexagon. All nose directions point into the interior of the hexagon. Thus the experience is of the magic circle as seen from the outside.
Such a result is not unexpected (Koenderink & van Doorn, 2017, in press). The impression cannot even be said to be ‘non-veridical’ in the sense that the picture itself is not a touchstone for veridicality. The result is due to these major factors:
^ observers implicitly assume some finite scope, in many cases related to the extent of their apparent field of view (Koenderink, van Doorn, & Todd 2009). The latter is almost invariably smaller (usually much smaller) than a half-space; ^ observers relate the apparent spatial attitude of pictorial objects to their local (apparent) visual directions (Koenderink, van Doorn et al., 2010); ^ observers tend to apply templates in favor of ‘inverse optics’ algorithms (Koenderink, 2011).
The same factors give rise to huge errors in regular wide-angle photographs, as we have documented in the past (Koenderink, van Doorn et al., 2010).
Next we discuss these issues in more detail in terms of the log-polar model of visual space (Koenderink & van Doorn, 2008).
Discussion of Selected Quantitative Details
In this subsection we discuss a number of remarkable regularities in the data.
Azimuth pattern
One pictorial structure that might be thought to be obvious to all observers because immediately visually present is the pattern of azimuths. Even without fully parsing the stimulus the regular spacing of the major pictorial objects (the actors) is evident. One might expect that basic fact to be reflected in the responses. Perhaps remarkably, it is not, as is evident from Figure 11.
This might suggest that observers generally put the camera too close to the hexagonal configuration. This is indeed in accord with the generally wider than expected scopes.
One may speculate that it perhaps has to do with the limited drawing area. Of course, there is no way to check for that in the data, it would imply another (major) effort to address the issue empirically.
Depth ratios
The distributions of depth ratios as plotted in Figure 10 gives rise to some concern.
The depth ratios 1 are responded to veridically, with little spread. This is no more than expected. The depth ratios 2 are underestimated, although only by about 10%, also, not much reason for concern. Their spread is large.
The depth ratios
Such effects are surprising in view of the clear pictorial cues. As seen in Figure 15 the pictorial evidence leaves little doubt as to the depth ratios. Observers are expected to use such cues fully automatically. There is something going on here that is hard to put the finger on, at least on the basis of the present data. Again, it would imply another (major) effort to address the issue empirically.
These are cutouts from an equirectangular map, thus the vertical dimension is simply the elevation ɛ. For practical purposes, the height ratios in the Mercator projection are quit close. It is evident that we have ratios 1, 1.4 and 2.
Aspect ratios and scopes
The cases of the aspect ratios and the scopes are discussed together since they appear mutually related.
The apparent scopes appear rather larger than expected. The median corresponds to a focal length of about 14 mm on a 35-mm camera (the classical 24 × 36 mm Leica format), generally considered a fairly wide wide-angle (the shortest lens for the Leica is the 12 mm Heliar produced by Voigtlander). Of course, it is still fully located in the frontal half-space of the (imagined) camera.
From an earlier investigation (Koenderink et al., 2009) we know that the median ‘apparent field of view’ (perhaps the ‘diameter of the visual field’) is about a right angle (as Helmholtz reports for his own subjective feeling) with a very wide distribution from about
The fact that people are not comfortable with a limited drawing surface may have played a role, in that they perhaps place the camera closer to the configuration than they would have if given more drawing space. (Of course, being confronted with an essentially unlimited surface might intimidate them even more, there is no obvious way to handle such inhibitions.) One can only speculate.
Here we reconsider the issue of aspect ratios. In Figure 9 we measured aspect ratios in the drawing. It is probably more useful to consider these in the log-polar representation (see next section). So we compute ‘aspect ratio’ as measured in log-polar visual space. These aspect ratios also have a large spread, in this case with an interquartile range of almost 0.3 to 0.5. On theoretical grounds one expects a relation between the scope and the aspect ratio (see next section). This can be tested via a scatterplot involving all cases (Figure 16). This graph should be interpreted in terms of the log-polar model introduced later.
Scatterplot of the aspect ratios against the scopes (red points). The curve is the theoretical prediction, the points are aspect ratios as measured in the log-polar visual space model. Although the spread is large, as was to be expected, the prediction is evidently in the right ballpark. (Two cases of interior views left out.)
Apparent spatial attitudes
The nose directions with respect to the local camera-viewing-direction are immediately given as pictorial evidence. Actors are clearly seen either in anterior or posterior frontal attitude, or in left or right profile.
However, observers apparently distinguish between various directional frames. In this case they show systematic deviations from what appears to be pictorially given, as documented in Figure 12.
We believe such systematic effects to be due to the same factors we have studied in some detail earlier (Koenderink, van Doorn et al., 2010), so it may be said that they meet our expectations.
The Log-Polar Model of Visual Space
For simplicity we only consider the horizontal plane at eye-height, thus, if we say ‘visual space’ (usually a hemisphere augmented with depth), we limit the discussion to the horizon augmented with depth. What is ‘optically specified’ are just visual directions, the depths are added in the psychogenesis of visual awareness.
In a model of visual space, one has to address the issue of depth values. The best known example is ‘inverted optics’: one simply computes the depths from the optical data (Marr, 1982; Poggio, Torre, & Koch, 1985), if this does not work one guesstimates them (Knill & Richards, 1996). This appears reasonable enough, but what if none of this works to satisfaction? Is it possible to say something about the structure of visual space anyway? Here is a kind of poor man’s inference, based on Euclid’s Optics (Burton, 1945):
^ in the absence of specific knowledge, all visual directions are equivalent, none is preferred. Hence the structure of visual space should be invariant with respect to angular translations (rotations about the nadir-zenith axis). ^ in the absence of specific knowledge, there is no preferred ‘unit of distance’. Hence the structure of visual space should be invariant with respect to distance scalings (dilations or contractions about the view point).
These very basic invariances imply a minimal structure of visual space. It can be formalised in various ways. A very simple formalisation is the log-polar model (see Figure 17).
The ‘log-polar’ model of visual space. At left, a polar coordinate system in the Euclidean plane. The system of ‘rays’ is obviously invariant with respect to rotations about the origin. The radii of the system of circles concentric with the origin have been distributed such that the system is invariant with respect to uniform scalings about the origin. Thus, the whole system is invariant with respect to rotations and scalings. This models the structure of optical information as discussed by Euclid in his book on optics. At right, this configuration has been transformed by the log-polar map. One obtains a Cartesian grid, small regions of the grid are geometrically similar to the corresponding region in physical space. The map is conformal. Arbitrary translations of figures in log-polar space derive from optically equivalent figures in physical space. This model has been shown to account quite well for observer responses with very wide fields of view.
Use azimuth ϕ (angle from the anterior direction, ranging from
This model has many nice properties, for instance, it is conformal, thus has no deformations of local details.
As a first application we plot the map of the two regions plotted in Figure 1 right in visual space (see Figure 18). Now it is immediately obvious why the strange blue banana shaped region in Figure 1 is of interest: It is the unique ellipse that passes through the points of the stimulus hexagon in visual space. It is perhaps more difficult to see that the orange region is bounded by a closed curve too, but notice that the left and right vertical boundaries are actually the same visual direction. Remember that ‘the eye’ is at The configuration shown in Figure 1 right plotted in visual space according to the log-polar model. Now the strange blue banana shape is a perfect ellipse, whereas the orange ellipse is the orange area which extents all the way to 
Starting from this model picture, we may attempt a bold extrapolation: What if we scale the azimuth?
Scalings of the depth dimensions are reasonably well understood. We have worked on that for many years. Scalings of the azimuth dimension have – to the best of our knowledge – not been considered before, except in the (very special and limited) case of linear perspective (Pirenne, 1971). Scalings of the azimuth dimension are of immediate importance in the perception of photographs for which the scope is essentially unknown.
Simply scaling scope we obtain the interpretations shown in Figure 19. Of course, the crucial application is to scale from the space behind your back (or, equivalently, in front of the picture place) to the space in front of you (or, equivalently, behind the picture place). Here it is already quite clear from all kinds of data that visually, there is no space behind your back and that it is in bad taste, and almost impossible to pull off, to show objects in front of the picture plane.
5
Here are examples of equivalent configurations in physical space (left) and visual space (right). Left: The configuration at top left corresponds to the ‘true sized’ configuration in physical space (Figure 1 right). The other three cases involve scalings of the scope, by factors of 2, 4 and 8. Notice that for the factor 4, the shape is ‘almost’ convex, the change to convex happens at a factor of 
Thus it is extremely interesting to notice that the (trivially simple) log-polar model does quite a good job to post-dict the present data. The exterior view model fits the data quite well. In the aspect ratio plot (Figure 16), the aspect ratios (square root of ratio of eigenvalues of the covariance matrix of the locations in visual space) are evidently well fitted. Thus one may actually extrapolate from azimuths behind the camera to azimuths in front of the viewer (or behind the picture plane), as demonstrated in Figure 19. (The corresponding figures for the categorically different case of Observer 17 are Figure 1-right (physical space) and Figure 18 (visual space), the bluish-tinted cases.)
In visual space (Figure 19 right) the scaling transforms an ellipse into another ellipse, thus the plot of Figure 16, where the aspect ratios have been measured in visual space is very ‘natural’. The definition of ‘aspect ratio’ in physical space (although formally fine) seems more contrived.
In the past we have already shown that pictorial space allows for huge (anisotropic) scalings in the
We know of no other data addressing this issue. It is evidently of great interest to attempt to collect more though.
Conclusions
Participants were confronted with a postcard-like photograph, obviously of a natural scene, without any information concerning the scope of the field of view captured by the camera. This was intentional as it is closest to generic applications. Almost no newspaper, magazine or textbook prints the width of the field of view in the legends (often there is no legend to begin with).
Although an obvious fact, almost irrelevant to relate, this goes squarely against the grain of conventional understanding of correct depiction. The conventional theory of correct depiction is solidly based on linear perspective or, more generally, on the conservation of angular relations between visual directions (Pirenne, 1971). On the one hand, it is generally (but silently!) understood that pictures will ‘work’, no matter what, on the other hand, it is a universally agreed fact that visual directions should be conserved. There is obviously some uneasiness here, in that many ‘wrong’ depictions often still give rise to ‘normal’ impressions. To look ‘wrong’, one typically needs to violate topological relations, as pioneered by artists like Picasso in the early 20th century (Berger, 1989).
The familiar Ames room (Ittleson, 1952) still manages to induce wonder, yet it works because it conserves visual directions. Ames demonstrated this loud and clear, although the significance of his message tends to be underestimated. He demonstrated that pictures are infinitely ambiguous, something that few scientists are ready to hear even today. For instance, it implies that ‘inverse optics’ per se is impossible. But in viewing and (hopefully) understanding pictures, the public routinely goes way beyond Ames. The Ames demonstration is trivial, but, because of that, important.
The really interesting cases involve deformations of the bundle of visual rays (Burton, 1945).
In this study we move away from the Ames case in that we do not conserve visual directions. It is perhaps of some interest though, because it involves the space behind the back, or in front of the picture plane. Our empirical results at least suggest that a fairly simple model suffices to get at least the basics right. Why such a simple model would work is something we are not ready to comment on. We are not aware of any argument from contemporary brain science that might conceivably illuminate this matter.
An interesting aside is that people who are (or have been made) intellectually aware of the actual physical layout of the scene are generally (we met no exceptions) still unable to ‘see’ this in terms of their visual awareness. A different approach would have been to tell the participants up front what they will be confronted with. Perhaps, it would not be too hard to instruct participants so as to arrive at veridical judgements through a mixture of geometrical and logical reasoning. It does not work anyway. Even when people know what is there, they stubbornly see what they see at first blush. Awareness is not a cognitive judgement. Visual artists trust their eyes, ignoring their reflective thoughts.
Why the difficulties? We speculate that this has to do with the fact that there is no visual space behind one’s back (Phillips & Voshell, 2009). As a consequence, pictures are intuitively understood to only show part of what could be in front of one. Closely related to this is that the two actors seen in profile in the stimulus are adjacent to each other in the physical scene (see Figure 20, compare Figure 1), yet look to be at maximum separation in the picture. The shortest connection between them would have to cross the posterior meridian of the optic array, an area behind one’s back. In the picture, this connection would have to go by way of the left and right edges of the picture, which actually should be identified, again something that is not possible if everything is experienced as in front of the observer (Koenderink & van Doorn, 2017).
A picture of the scene after a 
An actual picture of everything in front of the observer only shows part of the configuration (Figure 3). The depicted part is seen as essentially veridical.
A more or less veridical view of the whole scene can perhaps only (we are not aware of alternatives) be pictured in a bird eye’s view of the scene (Figure 21). This is a Riemann normal coordinates map (also known as Postel’s projection (Flocon & Barre, 1968)) centred at the nadir. In this map, the full circular picture frame represents the zenith, that is a single point, thus may have some counterintuitive properties. Such renderings have recently become popular as ‘small planet’ images. Indeed, the horizon is a circle, and the full viewing sphere is mapped inside the disk.
A Postel map based on Riemann normal coordinates centred at the nadir. The whole circular circumference represents the zenith, a single point. This map is neither conformal nor area true. It is quite nice near the origin (the nadir) though, essentially up to the horizon. Distances from the nadir and angles at the nadir are perfectly represented. Notice that the tripod on which the camera was mounted is visible at the centre. The camera cannot see itself. It takes two 190° back-to-back fish-eye photographs and stitches these together in the camera. The output is an equirectangular representation of the full optic array.
The results from the analysis are clear enough. Virtually all participants are aware of a scene that fits the space in front of them, with the view point outside of the configuration. The relevant pictorial cues available to the observer are the horizontal separations, the heights in the picture plane, the relative heights of the actors and the apparent spatial attitudes with respect to the local visual direction (frontal, posterior or profile view, in the latter case facing left or right). The angular extent of the configuration is not indicated, thus it has to be assumed by the observer. It will necessarily be idiosyncratic. From previous experience with picture perception one may expect values to lie about the right angle, albeit with a huge spread. This is indeed what we find.
In any case, the results are clear evidence for the fact that the magic circle is almost impenetrable, only 1 of our 61 participants spontaneously succeeding. As we have shown in a related study, the same holds for the magic sphere – the viewing sphere (Koenderink & van Doorn, in press). For the latter case historical evidence for an intellectual awareness of this topo-agnosia existed for centuries (Goldstein & Hon, 2007; Stevenson, 1921). We are not aware of such evidence relating to the magical circle though.
By way of conclusion, it is easily possible to design configurations that will be interpreted by virtually all observers in some intended nonveridical manner. Since photographs (other than drawings or paintings) tend to be taken for optical truth, this may well be of interest to movie directors and illustrators. It may also serve as a warning against ill-considered use of wide panoramic images for applications in which a roughly veridical impression is desirable. Think of the real estate business, travel agencies and so forth. It is perhaps especially important in the court room (Carter, 2010; Sholik, 2015). Whether straight-out-of-the-camera panoramic images might be admitted as legal evidence should be a serious issue.
Footnotes
Acknowledgements
The authors thank Steven Vanmarcke, Charlotte Boeykens, Lore Goetschalckx and Axel Constant Pruvost for their gracious acting in the shooting of the configuration in the Leuven city park.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The work was supported by the program by the Flemish Government (METH/14/02), awarded to Johan Wagemans. Jan Koenderink was supported by the Alexander von Humboldt Foundation.
Notes
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
