Abstract
Objective
The aim was to evaluate the effectiveness of a Transport Canada Level 2 Instrument Proficiency Check Flight Training Device to elicit the disorientation caused by the black hole illusion. To evaluate the role of gender, we measured the relative susceptibility of men and women.
Background
Spatial disorientation is a well-known causative factor in aviation mishaps. However, there is no simulation-based training protocol for visual illusions that cause spatial disorientation.
Method
We simulated an approach-and-land scenario using an ALSIM simulator. Trainee pilots were instructed to maintain a 3° approach and land the aircraft under conditions with (nighttime) and without (daytime) the black hole illusion. We computed altitude errors by differencing the daytime and nighttime flight paths. Glideslope errors were calculated as deviations from the 3° approach. To assess the contribution of spatial abilities, participants completed a mental rotation test.
Results
Most pilots showed a shallow final approach during night flight relative to day flights. The pilots who experienced the illusion had lower mental rotation scores than those who did not. Men had higher mental rotation scores, on average, than women, and showed less negative altitude and glideslope errors in the night relative to day conditions. These errors were not mitigated by flight experience.
Conclusion
We reproduced the effects of the black hole illusion in a relatively low-cost aviation simulator. Gender and mental rotation skills were factors in black hole disorientation.
Application
It is feasible to implement simulated visual illusion scenarios in aviation training. It is important to consider gender in designing and assessing flight scenarios.
Introduction
Spatial disorientation is a known contributing and causative factor in military, commercial, and general aviation mishaps (Benson & Burchard, 1973; Gillingham & Previc, 1993). In this context, spatial disorientation refers to the pilot’s erroneous judgments regarding flight path, altitude, airspeed, vertical velocity, attitude, or general motion and position of the aircraft relative to Earth (Benson & Burchard, 1973; Gillingham, 1992; Gillingham & Previc, 1993). The severity and prevalence of spatial disorientation-related mishaps in aviation vary across countries, aircraft type (e.g., fixed or rotary-wing), and aircrew population (i.e., general aviation, air force, naval, or army). For example, in the United States, Class A mishaps (i.e., involving a fatality, permanent total disability, aircraft destruction, or damage exceeding a specified cost threshold) had prevalence rates related to spatial disorientation ranging from 5 to 15.3% (Bellenkes et al., 1992; Gibb & Olson, 2008; Kirkham et al., 1978; Lyons et al., 2006; Poisson & Miller, 2014). All fatal mishaps within this category reported rates that ranged from 2.5 to 26% (Collins & Dollar, 1996; Kirkham et al., 1978; Moser, 1969). In Canada, spatial disorientation was a factor in 22.5% of Category A accidents (i.e., accidents in which an aircraft is designated missing or destroyed) within the Canadian Forces from 1982 to 1992 (Cheung et al., 1995). These prevalence rates over multiple decades indicate that spatial disorientation is a significant, ongoing factor in aviation mishaps.
It is estimated that half to nearly all pilots experience at least one episode of spatial disorientation in their careers (Chimonas et al., 2002; Sipes & Lessard, 2000; Takada et al., 2009; Tu et al., 2021). The percentage of international aircrews that report an episode of severe spatial disorientation posing a risk to flight safety ranges from 4 to 44% (Davidson et al., 1991; Durnford, 1992; Holmes et al., 2003; Matthews et al., 2002; Pennings et al., 2020; Tu et al., 2021). When disoriented, pilots may rely on their vision to control aircraft trajectory rather than instrumentation, either by choice or by necessity (e.g., instruments are unavailable). Further, during visually guided flight, pilots are more susceptible to spatial distortions from visual illusions (Gillingham, 1992; Previc, 2004). Thus, to maintain a safe flight, pilots must recognize and avoid such disorientation. Unfortunately, opportunities to experience and learn to counter these illusions during training are limited. The aim of this project was to evaluate the effectiveness of a Transport Canada Level 2 Instrument Proficiency Check Flight Training Device in reproducing common aviation-relevant visual illusions. To do so, we simulated one of the most frequently reported visual illusions: the black hole illusion. Surveys suggest that 60–80% of aircrew have experienced this illusion in their careers (Holmes et al., 2003; Matthews et al., 2002; Pennings et al., 2020; Sipes & Lessard, 2000).
The black hole illusion occurs at night during the approach-and-land phase of flight. This illusion typically occurs when an approach is made over a featureless terrain and an unlit sky lacking a discernible horizon; in the worst-case scenario, only the runway lights are visible (Figure 1, left). The black hole illusion has been identified as a contributing factor in several well-documented accidents, including Pan Am Flight 806 (NTSB, 1989), Air Sunshine Cessna (NTSB, 1998), Korean Air Flight 801 (NTSB, 1997), FedEx Flight 1478 (NTSB, 2002), and Dassault Falcon 20 (FSF, 2004). Visual illusions (including black hole) play a role in a notable portion of controlled flight into terrain (CFIT) events (Kelly & Efthymiou, 2019), and the International Air Transport Association lists the black hole illusion as a continuing “environmental threat” in its annual Safety Report (IATA, 2021). Human vision is not well adapted for the spatial judgments required in night flights, relying heavily on contextual cues and horizon references (Roscoe, 1979). In the absence of ambient out-the-window visual cues, the only information available to land is the runway size and shape, which can lead pilots to misjudge their glide path and altitude. In this context, altitude refers to the aircraft’s height above the ground, and glideslope refers to the angle of descent toward the runway. Pilots tend to overestimate the aircraft’s altitude and initiate an aggressive early descent (Figure 1, right), which results in a shallow final approach, descending below the proper glide path at a lower-than-normal altitude (Kim et al., 2010; Previc, 2004). Typically, the low approach occurs less than 5 NM from the runway (Gibb, 2007). Black hole illusion assessments often use full-flight simulators with realistic motion (Teifer et al., 2023) and recruit experienced pilots (Kraft, 1978; Lewis & Mertens, 1979; Mertens & Lewis, 1982; Robinson et al., 2020) or nonpilots (Gibb et al., 2008) to study spatial disorientation.
Our study was designed to evaluate if a simple, low-cost fixed-base simulator could reliably reproduce the disorientation caused by this illusion in trainee pilots.
Figure 1. Left: illustration of a black hole environment where only runway edge lights are visible. Right: illustration of a shallow approach due to the black hole illusion; when initially on a normal glide path, the pilot misperceives the aircraft’s altitude as too high. To compensate, the pilot begins an aggressive descent, flying a low and potentially dangerous final approach (adapted from Figure 1 in Gibb, 2007).
Most research on visual illusions in aviation involves anecdotal, incident, or accident reports. Further, while extant experimental studies are more rigorous, many only include men or fail to specify gender (Bulkley et al., 2009; Gibb et al., 2008; Kim et al., 2010; Kraft, 1978; Lewis & Mertens, 1979; Mertens & Lewis, 1982; Robinson et al., 2020). This omission is notable given evidence that key aviation-related cognitive skills, such as spatial ability (Barron & Rose, 2013; Dror et al., 1993; Egan, 1978; Verde et al., 2018) and particularly mental rotation ability (Verde et al., 2013), often differ by gender (Boone & Hegarty, 2017; Kheloui et al., 2021; Voyer et al., 1995). Meta-analyses consistently demonstrate a male advantage in three-dimensional mental rotation tasks, moderated by task characteristics such as rotation angle and time limits (Voyer et al., 1995). These differences stem partly from variations in strategy use, confidence, and problem-solving approach (Boone & Hegarty, 2017). More recently, however, it has been reported that spatial ability differences may reflect an interplay of biological, sociocultural, and psychological factors rather than biological sex alone (Kheloui et al., 2021). Despite these well-documented gender differences in spatial perception, little research has examined whether mental rotation ability influences susceptibility to the black hole illusion directly. Research has focused primarily on misperceptions of slant and visual spatial orientation during landing approaches (Jakicic et al., 2022; Perrone, 1984), without explicitly examining the role of mental rotation ability. With growing initiatives to promote gender balance among the aviator population, the lack of data from underrepresented genders may have important implications for training and risk assessment.
Accordingly, the aims of this study were to (1) establish whether an aviation training simulator effectively generates the disorientation expected from the black hole illusion in trainee pilots, (2) evaluate susceptibility to this illusion in men and women, and (3) assess whether individual differences in mental rotation ability are associated with susceptibility to the illusion. To test our first hypothesis (H1) that trainee pilots can effectively demonstrate the effect of the black hole illusion in a simulator, we evaluated their approach and landing performance during a nighttime simulated black hole scenario. Given that the shallow approach elicited by the black hole phenomenon occurs during the final stages of the approach, we evaluated pilots at two starting distances from the runway (3.5 and 5 NM). To quantify the illusion, we assessed the altitude and glideslope of the simulated aircraft relative to daytime approaches without the illusion.
Based on prior research linking spatial ability and flight performance, we hypothesized (H2) that limitations in mental rotation ability may contribute to the perceptual misjudgments underlying the black hole illusion. Specifically, we propose that the illusion may arise, in part, from challenges in mentally transforming spatial information during approach and landing. Furthermore, given established gender differences in mental rotation ability, we hypothesized (H3) that female pilots may demonstrate a higher prevalence of the black hole illusion, which could be explained by differences in mental rotation ability. Together, these hypotheses aim to clarify the mechanisms contributing to black hole illusion susceptibility and inform training approaches.
Methods
Participants
A total of 30 trainee pilots (15 men and 15 women) aged 19–31 years (M = 20.7, SD = 2.5) completed the flight scenarios. They were recruited from the student population in the University of Waterloo Science and Aviation program, the Waterloo Wellington Flight Centre, the University of Waterloo Aviation Society, and the University of Waterloo Aviation Alumni Group. All pilots held a Private Pilot License and a Transport Canada Medical Category 1 certificate. Their flight hours ranged from 52 to 335 (M = 168.7, SD = 92). Visual acuity was assessed before the experiment with a requirement of 20/20 with optical correction. If necessary, participants wore their optical correction during testing. Both biological sex and self-reported gender were collected from all participants, and in all cases, their biological sex was consistent with their declared gender; all analyses are reported by gender only.
Apparatus
All testing was conducted using the ALSIM AL250 FSTD fixed-base Flight Training Device at the University of Waterloo’s Institute for Sustainable Aeronautics (WISA) facility (Figure 2). The device has a variety of weather, auditory and tactile settings, which create flexible environments for advanced aviation training and research. The simulator has a panoramic 250° by 49° high-definition display and a minimum frame rate of 60 frames per second. The display screen was approximately 85 cm from the eyepoint in the cockpit.
Figure 2. The cockpit view of the ALSIM, showing controls and panoramic visual displays.
Procedure
Prior to testing, participants completed a consent form, demographic questionnaire, visual acuity test, and the Vandenberg and Kuse Mental Rotation Test-A (MRT-A) (Peters et al., 1995). We selected the MRT-A as our measure of spatial ability because the black hole illusion likely relies on object-centered 3D rotational transformations. The MRT-A specifically measures the speed and accuracy of such rotations and has been widely used in individual-differences research (Peters et al., 1995; Vandenberg & Kuse, 1978). The demographic questionnaire included questions regarding total flight hours, hours flying at night, and simulator hours. Following this, participants completed a 5-min practice landing under daylight conditions with full instruments available to become familiar with the simulated aircraft. For practice trials, we used a simulated approach to Pembroke airport (ICAO: CYTA, N 45° 51.87′ W 77° 15.09′, magnetic heading 352°).
To simulate the black hole illusion, we selected a location and weather conditions to generate a Night condition that included an unlit sky, featureless terrain, and an invisible horizon and moon, with only runway lights visible. To prevent pilots from learning the characteristics of the runway and landscape, all flight scenarios were completed in the Night condition before the Day condition. This also reduced the possibility of practice effects between the Night and Day conditions, since the visual information available in the Night condition was sparse. For all test conditions, we used a rural northern Canadian airport in Fort Severn, Ontario (ICAO: CYER, N 56° 01.14′ W 87° 40.57′, magnetic heading 150°). We required pilots to fly using vision alone, simulating an instrument failure that increases susceptibility to the illusion. Thus, all instruments, except the airspeed indicator and tachometer, were disabled and covered by a strip of black card. Before each trial, the trim was neutralized, and flaps were up. Participants were free to adjust the flaps while flying. The runway had no markings, and both the precision approach path indicator (PAPI) and approach lighting system (ALS) were turned off. After each flight, a MATLAB script recorded data from the simulator. Pilots were instructed to maintain a 3° approach and land the aircraft. We started pilots on the ideal 3° glide path. Thus, the altitude for each Starting Distance (5 and 3.5 NM) was consistent with a 3° glideslope (1630 ft and 1150 ft above ground, respectively). Starting distances were counterbalanced, and each participant completed 4 trials in total (2 Starting Distances × 2 Time-of-Day conditions). Between each trial, we asked trainee pilots about their confidence and how difficult they found each approach. We included a debrief questionnaire after the flight scenarios, where we stated the purpose of the study and asked pilots for their feedback on task difficulty and flight strategies. The study took 1 hour to complete.
Analysis
The independent variables were Time-of-Day (Night or Day), Starting Distance (3.5 or 5 NM), Gender (men or women), MRT-A score, and Observer Group (“black hole” or “no black hole” group). The dependent variables were the altitude and glideslope deviation time series. Glideslope errors were calculated as the difference in approach angle relative to the 3° approach for the Night and Day conditions. The instantaneous approach angle was calculated as the arctangent of the ratio of recorded altitude to distance from the runway threshold. Given the lack of flight instruments and the trainee status of the participants, we expected considerable variability in their flight trajectories, particularly in the black hole environment. Altitude (i.e., height above ground) errors were defined as the signed difference in altitude between the Night and Day conditions (Night minus Day; altitudeD-N). Positive altitude errors indicate that nighttime altitudes exceeded daytime altitudes, whereas negative errors indicate that nighttime altitudes were lower than daytime altitudes. For each analysis, we used individual linear mixed-effects models to determine the effect of selected predictor variable(s) (i.e., Time of Day, Starting Distance, MRT-A score, Gender, and Observer Group) on a single dependent variable (i.e., altitude at Night relative to Day (altitudeD-N) or glideslope error). Each model included a random intercept for each participant to account for individual differences and repeated within-participant measurements, and random slopes were included to allow the effect of predictors to vary across observers when doing so improved model fit. Repeated measures of altitudeD-N and glideslope error, collected across multiple Time-of-Day and Starting Distance trials, were nested within observers. Each linear mixed-effects model had a Kenward-Roger correction on degrees of freedom.
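The two error measures defined above can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function names, the feet-per-nautical-mile constant, the sample altitudes, and the assumption that Day and Night altitudes are compared at matched distances from the threshold are all illustrative choices that the text does not specify.

```python
import math

FT_PER_NM = 6076.12        # feet per nautical mile (1 NM = 1852 m)
TARGET_ANGLE_DEG = 3.0     # instructed approach angle

def approach_angle_deg(altitude_ft, distance_nm):
    """Instantaneous approach angle: arctangent of altitude over distance to threshold."""
    return math.degrees(math.atan2(altitude_ft, distance_nm * FT_PER_NM))

def glideslope_error_deg(altitude_ft, distance_nm, target_deg=TARGET_ANGLE_DEG):
    """Signed deviation from the target path; negative values mean a shallower (lower) path."""
    return approach_angle_deg(altitude_ft, distance_nm) - target_deg

def mean_altitude_night_minus_day(night_ft, day_ft):
    """Mean signed altitude difference (Night minus Day) over matched samples."""
    if len(night_ft) != len(day_ft):
        raise ValueError("Night and Day samples must be matched by distance")
    return sum(n - d for n, d in zip(night_ft, day_ft)) / len(night_ft)

# Hypothetical altitudes (ft) sampled at the same distances in the last 1 NM.
distances_nm = [1.0, 0.75, 0.5, 0.25]
day_ft = [330.0, 245.0, 165.0, 85.0]
night_ft = [290.0, 210.0, 140.0, 70.0]

altitude_error = mean_altitude_night_minus_day(night_ft, day_ft)
night_errors = [glideslope_error_deg(a, d) for a, d in zip(night_ft, distances_nm)]
print(altitude_error)   # negative: the Night path is lower than the Day path
print(night_errors)
```

In this sketch a negative altitude difference and negative glideslope errors both correspond to the shallow approach described in the text.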
Statistical significance was evaluated at p < .05, and Holm’s correction for family-wise error was applied throughout the analyses (Holm, 1979). The models were fit by a restricted maximum likelihood (REML) procedure. We used the “lmer” function in the “lmerTest” package in R (Kuznetsova et al., 2017) to compute linear mixed-effects models. Partial eta-squared effect sizes for the linear mixed-effects models were computed using the “t_to_eta2” function from the “effectsize” package (Ben-Shachar et al., 2020). Bayesian and other independent samples tests and corresponding effect sizes were calculated using JASP statistical software (JASP, 2024).
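For readers unfamiliar with Holm’s step-down procedure, the following Python sketch shows how adjusted p-values are obtained from a family of raw p-values. It is an illustration of the general method (Holm, 1979), not the authors' R code; the function name and the example p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1) and capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

raw = [0.010, 0.040, 0.030]   # hypothetical family of three tests
adj = holm_adjust(raw)
print(adj)                    # approximately [0.03, 0.06, 0.06]
```

The cap at 1 is why adjusted p-values for clearly nonsignificant comparisons can be reported as exactly 1.00.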
Results
As discussed above, the black hole disorientation results in a shallow final landing approach. We found that almost two-thirds of approaches in the Night condition followed a shallow path that was below the altitude of the ideal 3° glide path (Figure 3). Thus, our night approach-and-land scenario reliably recreated the black hole disorientation. A low approach was particularly evident less than 1 NM from the runway threshold (right plots in Figure 3). Although the LOESS (i.e., locally estimated scatterplot smoothing) fit in Figure 3 appears slightly above the ideal 3° glide path, this represents the average across all pilots. Given that some pilots flew below the 3° path while others exceeded it, the smoothed fit is on average slightly above the ideal path in the final approach. For purposes of analysis, pilots who demonstrated a negative mean altitudeD-N (i.e., a lower night altitude than day altitude) in the last 1 NM of their final approach were categorized as having experienced black hole disorientation.
Figure 3. Individual flight trajectories for the 5 (top) and 3.5 NM (bottom) Starting Distances during the Day (control) and Night (black hole) conditions. The left plots show the flight trajectories for the entire flight path. The right plots show the flight trajectories for the last 1 NM of the flight. The start of the runway is equivalent to zero NM. The solid black line represents a LOESS fit. The single-dashed black line represents a 3° glideslope.
To measure black hole disorientation, we calculated mean altitudeD-N and glideslope error for the last 1 NM of the final approach (Figure 4). A linear mixed-effects random intercept model of altitudeD-N regressed onto Starting Distance revealed that there was no significant difference between 3.5 and 5 NM, b = −11.49, t(29.00) = −0.74, $p_{\mathrm{adj}}$ = .46, $\eta_p^2$ = .02, $\mathrm{CI}_{.95}$ = [−42.59, 19.61]. Similarly, to determine if the glideslope error differed between the Day and Night flights, we first compared the glideslope error between the two Starting Distances in each Time-of-Day condition. In another linear mixed-effects model with random slopes allowed for Time-of-Day, with glideslope error regressed onto Starting Distance and Time-of-Day, we found that glideslope error was similar between 3.5 and 5 NM from the runway in the Day, b = 0.04, t(58.00) = 0.21, $p_{\mathrm{adj}}$ = 1.00, $\eta_p^2$ = .00, $\mathrm{CI}_{.95}$ = [−0.34, 0.42], and the Night condition, b = −0.14, t(58.00) = −0.71, $p_{\mathrm{adj}}$ = 1.00, $\eta_p^2$ = .00, $\mathrm{CI}_{.95}$ = [−0.51, 0.24].
Figure 4. The left plot shows the average altitudeD-N (feet) for the 3.5 (yellow circles) and 5 NM (blue triangles) Starting Distances. In this plot, the y-axis represents the difference in altitude in the Night relative to the Day condition. The right plot shows the glideslope error for the Day and Night conditions at both Starting Distances. The glideslope error was calculated as the difference between the observed glideslope and the predicted 3° glide path. A positive glideslope is steeper than the predicted 3° glide path, and a negative glideslope is shallower than the ideal glide path. AltitudeD-N and glideslope error are averaged over the last 1 NM of flight. The boxplot represents the interquartile range, and the solid horizontal line represents the median. The horizontal dashed lines represent zero error.
Given that Starting Distance did not impact altitudeD-N or glideslope error, it was not included as a predictor in subsequent analyses. When excluding Starting Distance as a predictor, glideslope error was significantly more negative in the Night compared to Day performance, b = −0.98, t(29.00) = −2.81, $p_{\mathrm{adj}}$ = .009, $\eta_p^2$ = .21, $\mathrm{CI}_{.95}$ = [−1.68, −0.29], as can be seen in Figure 4 (right). This finding is consistent with the effects of the black hole illusion. We did the same analysis with altitude to confirm a shallow approach in the night conditions. As expected, we found that the differences between observed altitude and altitude corresponding to a 3-degree approach were more negative in the Night condition compared to the Day, b = −46.51, t(29.00) = −2.74, $p_{\mathrm{adj}}$ = .01, $\eta_p^2$ = .21, $\mathrm{CI}_{.95}$ = [−80.30, −12.72]. Although glideslope was significantly lower in the night condition compared to the day condition, the glideslope in the night condition was closer to the ideal 3-degree glideslope, with greater variability across observers than in the day condition.
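The partial eta-squared values reported alongside each t statistic can be checked by hand. The sketch below assumes the standard conversion, partial eta-squared = t² / (t² + df), which is the formula documented for the “t_to_eta2” function in the “effectsize” package; the Python function name is ours, and the inputs are the t values and degrees of freedom reported above.

```python
def t_to_partial_eta2(t, df_error):
    """Convert a t statistic and its error degrees of freedom to partial eta-squared."""
    return t ** 2 / (t ** 2 + df_error)

# Night vs. Day glideslope error: t(29.00) = -2.81
print(round(t_to_partial_eta2(-2.81, 29.0), 2))   # 0.21
# Night vs. Day altitude relative to the 3-degree path: t(29.00) = -2.74
print(round(t_to_partial_eta2(-2.74, 29.0), 2))   # 0.21
```

Both results agree with the reported effect sizes of .21.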
To evaluate the factors related to experiencing the black hole illusion, we divided pilots based on whether they experienced the illusion. To do so, we placed pilots into two groups based on their altitudeD-N, averaged across the two start distances (Figure 5). Participants with negative altitudeD-N were placed in the “black hole” (BH) group (19 observers or 63.3%) and the remaining observers were placed in the “no black hole” (NBH) group as indicated by vertical brackets in Figure 5. To determine if the mental rotation ability influenced pilots’ susceptibility to the black hole illusion, we compared MRT-A scores between the BH and NBH groups (Figure 6, left). An independent samples t-test (with a Welch correction for unequal sample size) confirmed that pilots in the BH group had lower mental rotation scores than pilots in the NBH group on average, t(18.74) = −3.10, p = .006, $\mathrm{CI}_{.95}$ = [−9.26, −1.79]; Hedges’ g = −1.16, $\mathrm{CI}_{.95}$ = [−1.98, −0.32]. Further, a Bayesian independent samples t-test confirmed strong evidence supporting higher mental rotation scores for pilots that did not show the effect of the illusion, $\mathrm{BF}_{10}$ = 11.90; median: −1.01, $\mathrm{CI}_{.95}$ = [−1.85, −0.23].
Figure 5. The altitudeD-N averaged across the two Starting Distances. The vertical brackets represent BH (below zero ft) and the NBH (above zero ft) groups. The horizontal solid line indicates the median. The horizontal dashed lines show zero error.
Figure 6. The left plot shows the MRT-A scores for individuals who showed the black hole effect (BH) and those who did not (NBH). The MRT-A score represents the number of correctly identified 3D rotations. The right plot shows the MRT-A scores as a function of gender. The horizontal lines represent the median.
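The grouping rule and the Welch comparison described above can be sketched in a few lines of Python. This is a hedged illustration only: the data are invented, the function names are ours, and the p-value step is omitted because it needs a t-distribution routine outside the standard library. The study itself ran these tests in JASP.

```python
import math
from statistics import mean, variance

def split_by_illusion(altitude_diff_ft, mrt_scores):
    """BH group: negative Night-minus-Day altitude; NBH group: all remaining pilots."""
    pairs = list(zip(altitude_diff_ft, mrt_scores))
    bh = [score for diff, score in pairs if diff < 0]
    nbh = [score for diff, score in pairs if diff >= 0]
    return bh, nbh

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)   # squared standard error of group a
    vb = variance(b) / len(b)   # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical per-pilot mean altitude differences (ft) and MRT-A scores.
altitude_diff_ft = [-60.0, -20.0, 15.0, -35.0, 40.0, -5.0]
mrt_scores = [8, 10, 16, 9, 18, 11]

bh_scores, nbh_scores = split_by_illusion(altitude_diff_ft, mrt_scores)
t_stat, df = welch_t(bh_scores, nbh_scores)
print(bh_scores, nbh_scores)
print(t_stat, df)   # a negative t means lower scores in the BH group
```

Because the Welch degrees of freedom depend on the group variances, they are generally not an integer, which is consistent with the fractional value reported above.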

To determine if the difference in mental rotation scores between the BH and NBH was partly due to gender differences, we compared MRT-A scores between men and women (Figure 6, right). An independent samples t-test showed that, on average, men had significantly higher MRT-A scores than women, t(28.00) = 2.81, p = .009, $\mathrm{CI}_{.95}$ = [1.30, 8.30]; Cohen’s d = 1.02, $\mathrm{CI}_{.95}$ = [0.25, 1.78]. A Bayesian independent samples t-test confirmed strong evidence that MRT-A scores were higher in men than women, $\mathrm{BF}_{10}$ = 5.49, median = 0.84, $\mathrm{CI}_{.95}$ = [0.12, 1.62]. This is consistent with the literature that shows men tend to have higher mental rotation skills than women (Voyer et al., 1995).
Gender
Although men tended to have higher mental rotation scores, our analyses revealed that neither altitudeD-N nor glideslope error was influenced by the interaction between MRT-A score and gender (see Appendix A for full analysis). Focusing on gender alone, Figure 7 (left) showed that women had more negative altitudeD-N than men, and this was confirmed with a linear mixed-effects random intercept model, b = −119.84, t(28.00) = −3.40, $p_{\mathrm{adj}}$ = .004, $\eta_p^2$ = .29, $\mathrm{CI}_{.95}$ = [−188.85, −50.83]. Second, we evaluated the relationship between gender, Time-of-Day (i.e., Day and Night flights), and glideslope error with another linear mixed-effects model (Figure 7, right). The interaction between gender and Time-of-Day was significant, F(1, 28) = 8.94, $p_{\mathrm{adj}}$ = .01. However, glideslope error did not significantly differ between men and women in either the Day, b = 0.93, t(28.00) = 2.12, $p_{\mathrm{adj}}$ = .22, $\eta_p^2$ = .14, $\mathrm{CI}_{.95}$ = [0.07, 1.78], or the Night condition, b = −0.93, t(28.00) = −1.41, $p_{\mathrm{adj}}$ = .51, $\eta_p^2$ = .07, $\mathrm{CI}_{.95}$ = [−2.21, 0.36]. While men made similar errors under Day and Night conditions, b = −0.05, t(28.00) = −0.13, $p_{\mathrm{adj}}$ = 1.00, $\eta_p^2$ = .00, $\mathrm{CI}_{.95}$ = [−0.91, 0.80], women showed significantly more negative glideslope error (−1.91°) in the Night compared to the Day condition, b = −1.91, t(28.00) = −4.35, $p_{\mathrm{adj}}$ = .001, $\eta_p^2$ = .40, $\mathrm{CI}_{.95}$ = [−2.76, −1.05]. Thus, women’s performance was more affected by the black hole scenario than men’s.
Figure 7. Mean altitudeD-N (left) and glideslope error (right) for men (dark green) and women (orange) at Starting Distances of 3.5 (circle) and 5 NM (triangle). The horizontal solid lines indicate the median. The horizontal dashed lines show zero error.
Given that women demonstrated more negative altitudeD-N than men, we compared the proportion of men and women in the BH and NBH groups. Overall, there were 7 men and 12 women in the BH group (36.8 and 63.2%, respectively), and 8 men and 3 women in the NBH group (72.7 and 27.3%, respectively). Thus, a larger proportion of women (12 of 15) than men (7 of 15) were susceptible to the effects of the black hole illusion. Lastly, we also confirmed that these differences were not due to differences in flight experience (Appendix B).
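The percentages above follow directly from the reported group counts. The short Python sketch below reproduces them and also expresses the same counts as the share of each gender that fell in the BH group; the variable and function names are ours, and only the counts come from the text.

```python
# Reported counts of men and women in each Observer Group.
counts = {
    "BH": {"men": 7, "women": 12},
    "NBH": {"men": 8, "women": 3},
}

def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Composition of each group by gender (the percentages given in the text).
composition = {
    group: {gender: pct(n, sum(by_gender.values())) for gender, n in by_gender.items()}
    for group, by_gender in counts.items()
}

# Share of each gender classified as BH (15 men and 15 women in total).
bh_share = {
    gender: pct(counts["BH"][gender], counts["BH"][gender] + counts["NBH"][gender])
    for gender in ("men", "women")
}

print(composition)   # {'BH': {'men': 36.8, 'women': 63.2}, 'NBH': {'men': 72.7, 'women': 27.3}}
print(bh_share)      # {'men': 46.7, 'women': 80.0}
```

Expressed this way, 80.0% of women and 46.7% of men were classified as having experienced the illusion.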
Mental Rotation
Given that NBH pilots tended to have higher mental rotation scores, we evaluated whether altitudeD-N and glideslope error depended on mental rotation skills and Observer Group (BH or NBH, Figures 8 and 9). A linear mixed-effects model revealed that the slope of altitudeD-N as a function of MRT-A score did not significantly differ between the BH and NBH pilots, b = −8.52, t(26.00) = −1.60, $p_{\mathrm{adj}}$ = .26, $\eta_p^2$ = .09, $\mathrm{CI}_{.95}$ = [−18.56, 1.53]. The slope between altitudeD-N and MRT-A score did not significantly differ from zero for BH, b = 1.26, t(26.00) = 0.36, $p_{\mathrm{adj}}$ = .72, $\eta_p^2$ = .00, $\mathrm{CI}_{.95}$ = [−5.27, 7.78], or NBH pilots, b = −7.26, t(26.00) = −1.79, $p_{\mathrm{adj}}$ = .26, $\eta_p^2$ = .11, $\mathrm{CI}_{.95}$ = [−14.90, 0.38]. Overall, altitudeD-N was not significantly related to MRT-A score, b = 7.58, t(28.00) = 1.98, $p_{\mathrm{adj}}$ = .06, $\eta_p^2$ = .12, $\mathrm{CI}_{.95}$ = [0.09, 15.07].
Figure 8. Mean altitudeD-N as a function of MRT-A score in the BH (green) and NBH (purple) groups at Starting Distances of 3.5 (circle) and 5 NM (triangle). The solid lines are linear regression lines with confidence intervals indicated by shaded areas. The horizontal dashed lines show zero error.
Figure 9. Mean glideslope error as a function of MRT-A score for the BH (green) and NBH (purple) groups for the 3.5 (circle) and 5 NM (triangle) Starting Distances. The solid lines are linear regression lines with confidence intervals indicated by shaded areas. The horizontal dashed lines show zero error.

Analysis of Glideslope Error as a Function of MRT-A Score and Observer Group.
Given the absence of MRT-A effects on glideslope error for the BH and NBH groups, we investigated the relationship between MRT-A score and error independent of Observer Group. A mixed-effects model regressing glideslope error onto MRT-A score and Time-of-Day (with random slopes for Time-of-Day) showed that the relationship between MRT-A score and glideslope error was similar between the Night and Day conditions, b = −0.12, t(28.00) = −1.80, $p_{\mathrm{adj}}$ = .33, $\eta_p^2$ = .10, $\mathrm{CI}_{.95}$ = [−0.25, 0.01]. Further, neither the Day, b = −0.02, t(28.00) = −0.39, $p_{\mathrm{adj}}$ = .71, $\eta_p^2$ = .00, $\mathrm{CI}_{.95}$ = [−0.11, 0.07], nor the Night, b = 0.10, t(28.00) = 1.58, $p_{\mathrm{adj}}$ = .37, $\eta_p^2$ = .08, $\mathrm{CI}_{.95}$ = [−0.02, 0.22], condition had a slope significantly different from zero. There was no relationship between MRT-A score and altitudeD-N when excluding Observer Group as a predictor, b = 7.58, t(28.00) = 1.98, $p_{\mathrm{adj}}$ = .06, $\eta_p^2$ = .12, $\mathrm{CI}_{.95}$ = [0.09, 15.07].
Pilots who experienced black hole disorientation scored lower on the MRT-A (mental rotation) test than those who did not, suggesting a link between spatial ability and susceptibility to the illusion. Since MRT-A scores differed by gender, with women scoring lower than men, and women also showed more negative altitudeD-N, we investigated whether mental rotation ability mediated the relationship between gender and altitudeD-N. To test this, we conducted a statistical causal mediation analysis (Appendix C). In brief, we found no support for the proposal that mental rotation mediates the relationship.
Discussion
We found that most trainee pilots produced a shallow glide path during the final approach, consistent with the spatial disorientation expected from the black hole illusion (Figure 3). Thus, our study confirmed that it is possible to reproduce the black hole disorientation in a fixed aviation training simulator (H1). This is an effective, low-cost solution for implementing simulated visual illusion scenarios in aviation training and research. Interestingly, in postflight interviews, some pilots stated they adopted strategies in their approaches (e.g., flying at higher altitudes) to counteract the illusion. Despite this, half of these pilots still demonstrated a shallow approach. Thus, just being aware of the illusion is insufficient to mitigate its impact. Additionally, neither total flight experience nor night flying experience was associated with performance in the black hole scenario (Appendix B). This further suggests that general flight experience (including night flights) does not help aviators avoid the effects of this illusion. From these results, we conclude that training should not only convey the effects of the illusion but also focus on the specific flight strategies necessary to counteract the spatial disorientation and land safely. For example, black hole illusion countermeasures and staged visual approach training (i.e., starting with pure nighttime runway conditions and gradually adding distractions) have been proposed to improve pilots’ spatial-judgment skills (Curtis et al., 2009; Patterson et al., 2021). Incorporating black hole illusion exercises in simulators could provide low-risk practice and help trainees build resistance to related visual errors in simulated real-world flight conditions.
Baseline performance did not differ between men and women; they had equivalent flight skills without the influence of the illusion. However, women’s performance was more affected by the illusion (H3), with more negative glideslope error in the Night compared to Day, and more negative altitudeD-N than men (Figure 7). Overall, men tended to have higher mental rotation scores than women (Figure 6, right). Therefore, we evaluated whether differences in mental rotation ability contributed to these gender differences in flight performance using a causal mediation analysis. However, we found mental rotation scores did not mediate the relationship between gender and negative altitudeD-N.
The fact that those who experienced black hole disorientation had lower mental rotation scores suggests that mental rotation skills play a role in performance in the black hole scenario to some degree (H2). This is understandable given that the shape and scale of the runway lights were the only information available to complete the task. To successfully achieve a 3° glideslope, pilots had to maintain the position and shape of the runway during the entire approach, which is akin to maintaining the orientation of a slanted plane. Short-term visuospatial training could improve scores on such standardized spatial tests (Harris et al., 2013; Rehfeld, 2006). If exposure to the black hole illusion trains the same transformations measured by the MRT-A, it could potentially improve MRT scores. Since we measured MRT-A only at baseline, we cannot assess whether experience with the illusion causally affects mental rotation ability; however, this is an avenue for future research.
In postflight interviews, most pilots reported that they found the black hole scenario difficult and felt uncertain about their altitude. This uncertainty may have elicited more random visual scanning behavior (Allsop & Gray, 2014), impaired decision making (Causse et al., 2011), or increased cognitive load during landing (Li & Lajoie, 2021). In the face of this additional uncertainty, participants may have relied more on mental rotation-based strategies. However, this remains speculative, and future studies are needed to evaluate the causal nature of this relationship.
Overall, we have shown that low-cost aviation simulators can be used to elicit the spatial disorientation associated with the black hole illusion in trainee pilots. Given the often-catastrophic consequences of spatial disorientation, exposure to these potentially dangerous scenarios in a safe, controlled environment is invaluable, especially when opportunities to learn to counter visual illusions during training are limited. Further, the influence of mental rotation skills and gender on flight performance should be considered when designing and assessing simulated flight scenarios during training. Although the black hole illusion is specific to aviation, the underlying mechanisms (e.g., spatial perception, mental rotation, and reliance on visual cues) reflect general principles of human perception. Similar perceptual challenges may occur in other operational settings, such as driving, maritime navigation, and remote vehicle operation, suggesting that these findings may inform broader human factors approaches to training and spatial performance under degraded visual conditions.
Key Points
A relatively low-cost aviation simulator used in pilot training is effective in eliciting the common black hole visual illusion
Training should include strategies to counteract the illusion
Women trainee pilots were more susceptible to the illusion than men
Mental rotation skills play a role in mitigating black hole disorientation
Footnotes
Acknowledgments
Special thanks to Kamal Ben and Rafael Pastorin Repato for their feedback on our flight scenarios and management of the flight simulator, Carolyn Machan and Allison Lynch for coordinating participant recruitment, and the Waterloo Institute for Sustainable Aeronautics (WISA) team for their collaboration and access to their training and research facility.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by the Canada First Research Excellence Fund (CFREF): Vision Sciences to Applications (VISTA).
