Abstract
A laboratory experiment was conducted to examine the impact of different lighting conditions on melatonin derived from saliva samples, alertness as measured through reaction time (RT) to an auditory stimulus and self-reported sleepiness. This experiment replicated previous work but with the inclusion of an extreme condition to test the null findings of that previous work. There were four lighting conditions as defined by illuminance at eye level and spectral power distribution. Three conditions, having photopic illuminances of 0.5 lx to 8 lx (melanopic equivalent daylight illuminance (EDI) values of 0.5 lx to 10.4 lx) repeated the range used in previous work: the fourth condition extended this to 83 lx (melanopic EDI approximately 100 lx), which is extreme compared to those conditions typical of road lighting. The time period over which measurements were conducted was intended to represent pedestrian activity in the evening. The results revealed a significant reduction in RT and significant decreases in melatonin and subjective sleepiness only with the extreme condition, but did not suggest that lighting conditions typically used for road lighting had a significant effect on any of the dependent variables.
1. Introduction
This study focuses on road lighting for pedestrians. According to the CIE 1 the three primary objectives of road lighting are (1) to allow all road users, including operators of motor vehicles, motor cycles, pedal cycles and animal drawn vehicles, to proceed safely; (2) to allow pedestrians to see hazards, orientate themselves, recognise other pedestrians and give them a sense of security; and (3) to improve the day-time and night-time appearance of the environment. These purposes primarily relate to aspects of visual performance and visual perception, the image-forming aspects of vision. However, in recent years, there has been an increasing emphasis on investigating the non-image-forming (NIF) impacts of lighting.2,3 One NIF effect is alertness, also referred to as vigilance, arousal or sustained attention. 4
Alertness refers to the activation level of the cerebral cortex and impacts the ability to process information. A decrease in alertness directly correlates with reduced performance in tasks that demand attention over a prolonged period. 5 Ensuring sufficient alertness is of importance for pedestrians, as impaired alertness (of pedestrians and/or other road users) has been linked to an elevated risk of tripping or falling incidents 6 and an increased likelihood of being involved in road traffic collisions. 7 Alertness follows a 24-h circadian rhythm, being typically higher in the morning and lower in the evening. 8 Light can influence alertness both indirectly, by modulating circadian rhythms, and directly, through acute effects. 2
To measure the effect of lighting on alertness requires definition of the variable(s) associated with that effect. Prior to the discovery of NIF photoreception pathways, studies investigating the NIF effects of light commonly quantified exposure using photopic illuminance.9,10 Photopic illuminance characterises light based on its effectiveness to support visual performance, primarily reflecting the spectral sensitivity of cone photoreceptors. However, this measure alone is insufficient for evaluating NIF effects as it does not account for the spectral sensitivity of other photoreceptor systems which feed into NIF responses.11,12
The discovery of intrinsically photosensitive retinal ganglion cells (ipRGCs), which contain the photopigment melanopsin, marked a significant advancement in understanding the NIF effects of light. The ipRGCs can influence NIF responses including alertness.13,14 Optimising lighting conditions to effectively stimulate ipRGCs has the potential to enhance alertness. 15 Such conditions provide enhanced levels of light in the short-wavelength region.16,17 However, while ipRGCs play a central role in mediating NIF responses, increasing evidence suggests that rods and cones also contribute to circadian photoreception, particularly under conditions of low irradiance or short-duration light exposure.2,11,12,18–20 In recognition of the complex contributions of multiple photoreceptors, Lucas et al. 12 introduced five spectral sensitivity functions, corresponding to each of the five known photoreceptor classes and their respective photopigments. Among these, the melanopic action spectrum has become the most widely used in research, 11 due to its strong relevance to ipRGC-mediated NIF responses. The effectiveness of lighting at stimulating the ipRGCs can be modelled using melanopic equivalent daylight illuminance (EDI) in lux. 11 Brown et al. 21 suggest that ‘alerting effects produced by light of varying spectral composition are certainly better predicted by melanopic irradiance than other available metrics’, and this proposal is supported by the CIE. 22
We therefore use photopic illuminance along with the alpha-opic equivalent daylight (D65) illuminances, including melanopic EDI, to characterise the influence of lighting on alertness.
Previous studies23–25 have demonstrated that lighting with higher levels of short-wavelength content can enhance alertness, as assessed using measures including self-report of sleepiness or reaction time (RT). These controlled laboratory studies,23–25 conducted at night, also measured melatonin levels. While performance on an RT test directly reflects the level of alertness, and self-report evaluates perceived alertness, melatonin is an indirect marker of circadian alertness status, related to the regulation of the sleep–wake cycle.26,27 In addition to those studies showing that melatonin suppression is associated with a faster RT, a similar effect is noted in studies measuring electroencephalography rather than RT,28–31 and the ingestion of melatonin in oral form impairs task performance.32,33 We note, however, that suppression of melatonin by light targeting the melanopic system does not automatically translate to acutely altered levels of vigilance or sleepiness. 34 This means that even if light suppresses melatonin, it would not necessarily result in shorter RTs or a reduction in subjective sleepiness.
One limitation of the previous studies2,23–25,35–37 on nocturnal NIF effects of light is that they used a protocol in which participants were first adapted to dim lighting and then exposed to relatively brighter conditions. In contrast, pedestrians are typically first exposed to relatively bright lighting in office or home settings and then to relatively dim road lighting.
Gibbons and Bhagavathula conducted two studies that better represented the typical scenario of evening exposure to road lighting.38,39 The range of lighting conditions used is shown in Table 1. In the experiment reported by Bhagavathula et al., 38 participants were instructed to drive on a closed loop road for 2 h (01.00 to 03.00) after a 2-h adaptation phase (23.00 to 01.00) under normal indoor lighting conditions (photopic illuminance 200 lx, melanopic EDI 87 lx, at eye level). The test track was illuminated with five different lighting conditions, similar to those commonly used in road lighting: correlated colour temperatures (CCTs) of 2100 K and 4000 K, and luminances of 0.7 cd m−2, 1.0 cd m−2 and 1.5 cd m−2, these combinations giving melanopic EDI ranging approximately from 0.3 lx to 0.8 lx.
Table 1. Photopic illuminance and melanopic EDI (in lx) used in previous studies which used test conditions resembling pedestrian experience in the evening
From the study by Gibbons et al., 39 we refer here to that part where participants were seated on chairs on a closed road for 4 h (22.00 to 02.00) following a 2-h adaptation phase (20.00 to 22.00). The adaptation phase used the same lighting condition as did the adaptation phase in Bhagavathula et al. 38 For the test phase, there were six different lighting conditions, with varying CCTs (ranging from 2100 K to 5000 K) at a luminance of 1.0 cd m−2. The melanopic EDIs of these conditions ranged from approximately 1.5 lx to 5.7 lx.
In these two studies,38,39 alertness was measured using subjective reports of sleepiness and performance on a visual RT test, along with melatonin from saliva samples. In neither study was the change in lighting suggested to have a significant effect on any of these measures.
In a previous experiment, 40 we tested the null findings of Gibbons and Bhagavathula by extending the upper level of melanopic EDI to 10.7 lx, which is above the highest levels of 0.8 lx and 5.7 lx used in Bhagavathula et al. 38 and Gibbons et al., 39 respectively. This upper limit was selected because a melanopic EDI of 10 lx is the maximum recommended for unavoidable activities for at least 3 h before bedtime to avoid melatonin suppression, which would affect sleep quality; 21 it is therefore the maximum that might be considered acceptable for practical application. This was a laboratory study conducted over a 3-h period in the evening (21.00 to 00.00). In the first 2 h, participants adapted to lighting that resembled typical indoor lighting, with vertical photopic illuminance at the eye of 25 lx, a melanopic EDI of 10.7 lx and a CCT of 2700 K. After the adaptation phase, participants were exposed to one of four light conditions characterised by variations in melanopic EDI and CCT: <0.5 lx, 3.4 lx and 10.7 lx with a CCT of 2700 K, and 10.4 lx with a CCT of 5800 K. The dependent variables were participants’ reaction to an auditory detection task, melatonin levels determined from saliva samples, self-reported levels of sleepiness and skin temperature. Similar to the studies of Bhagavathula et al. 38 and Gibbons et al., 39 the findings of this experiment 40 did not reveal any significant differences between the four lighting conditions.
We suggest two reasons why none of these previous studies38–40 revealed an effect of lighting on measures of alertness or melatonin. Firstly, the lighting conditions used may have been insufficient to reveal an effect. Instead of using conditions that are suitable for real-world application, it would be useful to use an extreme value sufficient to reveal an effect, 41 thereby demonstrating the ability of the experiments to reveal an effect if such an effect exists. In other words, using an extreme value in the experiment would enable confirmation that the previous findings of no effect due to lighting were not a result of inappropriate experimental design. Extreme here means a magnitude higher than that which would be conventionally used: such a value would not be used in practical application but is used in the experiment only to induce an effect, thereby demonstrating that the experiment is capable of revealing one.
The second reason is related to the level of physical activity. Higher levels of physical activity induce a dual-task detriment, placing greater demands on attentional resources than lower levels of activity. 42 Different levels of activity might therefore affect participants’ responses in an RT task.
In the pedestrian-focused studies of Gibbons et al. 39 and Alshdaifat et al., 40 test participants were either seated or walking slowly. In the Alshdaifat et al. study, after being seated for the adaptation phase, participants were either seated or walking slowly for the test phase, and the results did not suggest a significant difference in alertness between the walking and seated participants. The walking speed was self-selected by participants, and these speeds, ranging from 1.2 km h−1 to 2 km h−1, are slower than the typical walking speed of 4.5 km h−1.43,44 Walking instead at a normal pace demands greater cognitive attention to maintain balance,45,46 and if this demands sufficient cognitive resource it may lead to an effect on alertness.
Previous studies43,44 suggest that the typical walking speed for adults aged 20 years to 29 years is around 4.5 km h−1 to 5.2 km h−1, while the median walking speed for individuals aged 17 years to 65 years is around 4.5 km h−1. A walking speed in the range of 4.0 km h−1 to 6.8 km h−1 is considered moderate-intensity exercise according to the Compendium of Physical Activities (activity code 1717). 47 Moderate-intensity exercise is defined as that evoking a heart rate which is 50% to 70% of the maximum heart rate (HRmax), 48 where HRmax is defined as the participant’s age in years subtracted from 220. 49 Thus, heart rate provides a measure of the participant’s level of physical activity in an experiment.
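The heart-rate criterion described above can be illustrated with a minimal sketch. The function names are hypothetical and chosen only for this illustration; the formula (220 minus age) and the 50% to 70% band are those stated in the text.

```python
from typing import Tuple


def hr_max(age_years: float) -> float:
    """Age-predicted maximum heart rate in beats per minute (220 minus age)."""
    return 220.0 - age_years


def moderate_intensity_range(age_years: float,
                             lower: float = 0.50,
                             upper: float = 0.70) -> Tuple[float, float]:
    """Heart-rate band (bpm) for moderate-intensity exercise as fractions of HRmax."""
    peak = hr_max(age_years)
    return lower * peak, upper * peak


def is_moderate_intensity(heart_rate_bpm: float, age_years: float) -> bool:
    """True when the measured heart rate lies within the 50% to 70% band."""
    low, high = moderate_intensity_range(age_years)
    return low <= heart_rate_bpm <= high


# Example for a 21-year-old: HRmax is 199 bpm, so the band is roughly 99.5 to 139.3 bpm.
band_low, band_high = moderate_intensity_range(21)
```

For a 21-year-old participant, a measured heart rate of 120 bpm would fall inside this band, whereas 90 bpm would not.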
We report here the findings from a second laboratory experiment conducted with the aim of revealing an effect of lighting on alertness and melatonin. This repeated the experiment of Alshdaifat et al. 40 but with two changes: it included a lighting condition providing much higher levels of photopic illuminance and melanopic EDI, and walking speed during the test lighting phase of the experiment was increased to deliver a heart rate of between 50% and 70% of each participant’s maximum heart rate.
2. Method
The effects of change in light level and light spectrum on alertness were investigated in a laboratory study in which the lighting conditions and test participant activity were selected to resemble walking after dark.
2.1 Apparatus
The experiment was conducted in a laboratory (Figure 1), a room measuring 3.45 m in length, 2.43 m in width and 2.8 m in height. The walls were white, with a reflectance of approximately 0.81. During trials the room lighting was switched off and windows were screened to block external light: the experiment was conducted from 21st October to 5th December 2022 from 20.00 onwards, which is after the end of civil twilight, so there was no daylight. The test environment was lit using a pair of LED arrays (THOUSLITE LEDCube-I14 (R27)), these having multiple different primary sources allowing the spectral power distribution (SPD) to be finely tuned. The context of this experiment was a person seated at home for 2 h (the adaptation phase) followed by a 1-h test phase representing a walk outdoors. In each test session, there were two test participants. Both remained seated for the adaptation phase; for the 1-h test phase, both walked on a treadmill to resemble the physical exertion and balance control of a pedestrian.

Figure 1. Plan layout of the test environment
2.2 Independent variables
There was one independent variable, the lighting condition, for which there were four levels defined by variations in illuminance and SPD. Table 2 and Figure 2 show the lighting conditions. The photopic illuminances were recorded vertically at a height of 1.5 m above the floor, in the direction of the participant’s view. Participants were seated during the adaptation phase, and the height of these seats was adjustable, ensuring that participants’ eye heights remained around 1.5 m above the floor level. The lighting condition used for the adaptation phase was intended to represent a typical residential environment.
Table 2. Light conditions (illuminance and SPD-derived metrics) used in the adaptation and test phases of the experiment
Vertical photopic illuminance at eye level (1.5 m above the floor).
Alpha-opic and melanopic equivalent daylight illuminances calculated using luox from Spitschan et al. 50

Figure 2. SPDs of the four light conditions, shown in absolute units (top) and normalised to a peak response of unity (bottom) (top: Note the curve for light condition L1 is close to 0.0 W m−2 nm−1 at all wavelengths, bottom: Note that the curves for light conditions L1 and L2 overlap)
After a 2-h adaptation phase, the participants were exposed to one of four test conditions for a duration of 1 h (the test phase). The first lighting condition (L1) provided a vertical photopic illuminance <0.5 lx (melanopic EDI <0.5 lx) at the eye and the same SPD as in the adaptation phase. Measurements of vertical illuminance on a small sample of minor roads revealed a range of <0.5 lx to 20 lx, which suggests that L1 represents the lower end of the P-class.
The SPD for L2 was the same as that used during the adaptation phase, but with the vertical photopic illuminance set at 8 lx (melanopic EDI 3.4 lx). The third test condition, L3, utilised the same illuminance as L2, but the SPD was varied (i.e. the CCT was increased from 2700 K to 5800 K) to increase the melanopic EDI from 3.4 lx to 10.4 lx.
The fourth test condition, L4, used the same SPD as L3, but increased the photopic illuminance to 83 lx (melanopic EDI = 98.8 lx). This is the extreme condition as suggested by Veitch et al. 41 and was chosen for two reasons. Firstly, it offers an increase in illuminance (whether photopic or melanopic EDI) of approximately one log unit above that provided under L3. The melanopic EDI of L3, approximately 10 lx, is the recommended 21 limit for unavoidable activities before bedtime: it is therefore the maximum that might be considered acceptable for practical application. Secondly, previous research by Nowozin et al. 51 has shown that a melanopic illuminance of 100 lx (a melanopic EDI of approximately 91 lx) is the threshold at which melatonin secretion becomes suppressed after 30 min of light exposure in the evening, regardless of the differential effects from prior light history and physical activity during the day of the experiment. We describe condition L4 as extreme because it is much higher than that used for road lighting: we do not suggest that it be considered for road lighting, but used it to test the null findings39,40 obtained using conditions representative of those used for road lighting.
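The size of the step from L3 to L4 can be checked with a short calculation. This is only an illustrative sketch using the illuminance values quoted in the text; the function name is hypothetical.

```python
import math


def log_unit_step(higher_lx: float, lower_lx: float) -> float:
    """Difference between two illuminances expressed in log10 units."""
    return math.log10(higher_lx / lower_lx)


# L4 relative to L3, using the values given for the two conditions.
photopic_step = log_unit_step(83.0, 8.0)      # about 1.02 log units
melanopic_step = log_unit_step(98.8, 10.4)    # about 0.98 log units
```

Both ratios are close to a factor of ten, which is the sense in which L4 lies one log unit above L3.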
2.3 Dependent variables
Two variables were measured to indicate the impact of changes in lighting on alertness: RT to an audible stimulus and self-reported sleepiness. Saliva samples were collected to determine the melatonin level.
Saliva samples were collected using salivettes at intervals of approximately 30 min to 50 min during both the adaptation phase and the test phase (see Figure 3). Participants were instructed to chew on a cotton swab for a duration of 1 min to 2 min and then place it into a tube. The samples were initially stored locally at −20 °C, and then transferred at weekly intervals to the University’s biorepository where they were stored at −80 °C. Following the completion of all trials, the samples were packaged with dry ice to reduce degradation and transported to the Chrono@Work lab at the University of Groningen in the Netherlands for analysis using radioimmunoassay.52,53

Figure 3. Overview of the test procedure
Alertness was assessed through an auditory psychomotor vigilance test (PVT), which measures the time taken to react to the onset of an auditory stimulus by pressing a response button placed upon the desk. The stimulus was a 1000 Hz tone, delivered through headphones. The stimulus was played for 0.5 s at inter-stimulus intervals randomly chosen from the range of 2 s to 6 s. In order to maximise differences in RT between the various experimental conditions, the loudness of the tone was set to be near the audibility threshold of each participant, as established in the preparation time (see Section 2.4). Test participants attended in pairs, and each participant received a personally randomised stimulus pattern to prevent his/her reaction (pressing the button) from serving as a cue for the other participant.
Alertness was also assessed by subjective evaluation using the Karolinska Sleepiness Scale (KSS). 54 This is a 9-point rating scale, here ranging from 1 (very sleepy) to 9 (extremely alert). Note that in this work the KSS is reversed compared with its conventional use 54 so that a higher rating means a greater level of perceived alertness. The participants were asked to report their level of sleepiness at every interval during the 3-h experiment (see Figure 3).
We did not retain skin temperature as a dependent variable in this experiment, following Cajochen et al., 16 who concluded that it has similar sensitivity at different wavelengths.
2.4 Procedure
The participants arrived at the laboratory at least 45 min before the start of the adaptation phase to allow for preparation. The preparation was carried out under the same lighting as then used in the adaptation phase. The adaptation phase started at 21.00; this time was chosen because it was around 3 h prior to the habitual bedtime of the recruited participants. The participants wore their normal clothing and were advised to bring paper-based reading material to occupy themselves in the time between tests.
Two examinations were conducted during the adaptation phase to confirm normal vision. A Landolt C chart was used to check visual acuity, ensuring an acuity of not less than 6/12 with their normal corrective lenses, this threshold being chosen because it is the minimum visual acuity required for driving in the United Kingdom. 55 Colour vision was evaluated using the Ishihara colour plates illuminated by a D65-simulating source.
The speed of the treadmill, used by participants during the test phase, was adjusted so that their heart rate reached the lower bound of the target range (50% ± 5% HRmax). The treadmill gradient was set to 0%, representing a horizontal surface. Heart rate was recorded continuously throughout the experiment session at 1 s intervals using a Polar Vantage M2 Smartwatch, the validity of which was confirmed in previous work.56,57 Subsequent analyses of these data confirmed that participants’ heart rates were between 50% and 70% of HRmax, thus confirming moderate-intensity exercise throughout the test phase.
To establish the treadmill speed for each participant, the speed was first set to 2.5 km h−1 for 2 min, a period sufficient to reach target exercise intensity. 58 Following the protocol of Soga et al., 59 when the heart rate (HR) of a participant did not reach the target range, the treadmill speed was increased each minute in increments of 0.5 km h−1. If the HR exceeded the target range, the speed of the treadmill was decreased each minute by 0.1 km h−1. Participants tended to reach the target HR range after about 6 min of walking. After doing so, participants were seated for at least 20 min before the adaptation phase started to allow enough time for them to rest. 60
To determine the hearing threshold of each participant, a range of tones of different loudness was played in random order through headphones, and participants were instructed to press a button whenever they heard a tone. The threshold level for hearing was determined by identifying the loudness level that corresponded with a 50% detection rate. Tone volume for the PVT test was set to that individual’s estimated hearing threshold plus an additional 10 dB, resulting in a perceived loudness approximately twice that of the threshold tone. 61 The hearing threshold was determined twice, while the participants were seated and while they were walking at the determined walking speed, and the respective threshold was used in subsequent trials.
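The speed-adjustment rule described above can be sketched as a simple loop. This is an illustration rather than the software used in the experiment: the function names are hypothetical, the heart-rate readings are invented example values, and the target band is assumed here to be 45% to 55% of HRmax, which is one reading of "50% ± 5% HRmax".

```python
START_SPEED_KMH = 2.5      # initial treadmill speed stated in the text
INCREASE_STEP_KMH = 0.5    # applied each minute while HR is below the target band
DECREASE_STEP_KMH = 0.1    # applied each minute while HR is above the target band


def next_speed(speed_kmh, heart_rate_bpm, target_low_bpm, target_high_bpm):
    """Return the treadmill speed for the next minute given the current heart rate."""
    if heart_rate_bpm < target_low_bpm:
        return round(speed_kmh + INCREASE_STEP_KMH, 1)
    if heart_rate_bpm > target_high_bpm:
        return round(speed_kmh - DECREASE_STEP_KMH, 1)
    return speed_kmh


def calibrate_speed(readings_bpm, age_years, tolerance=0.05):
    """Step through successive heart-rate readings until one falls inside the band.

    Assumption: the band is (0.50 - tolerance) to (0.50 + tolerance) of HRmax,
    with HRmax taken as 220 minus age. Returns None if the band is never reached.
    """
    peak = 220 - age_years
    low, high = (0.50 - tolerance) * peak, (0.50 + tolerance) * peak
    speed = START_SPEED_KMH
    for heart_rate in readings_bpm:
        adjusted = next_speed(speed, heart_rate, low, high)
        if adjusted == speed:
            return speed
        speed = adjusted
    return None


# Hypothetical readings for a 21-year-old: two increases, one decrease, then on target.
final_speed = calibrate_speed([80, 85, 112, 105], age_years=21)   # 3.4 km/h
```

The asymmetric step sizes mean the speed rises quickly towards the band and is then trimmed back in small steps if it overshoots.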
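A minimal sketch of how a 50% detection threshold and the resulting PVT tone level might be derived is given below. The text does not state how the 50% point was identified, so linear interpolation between adjacent presentation levels is an assumption made for this illustration, as are the function names and the example detection rates.

```python
def threshold_db(detection_by_level, criterion=0.5):
    """Estimate the level (dB) at which the detection rate equals the criterion.

    detection_by_level maps presentation level in dB to the proportion of tones
    detected at that level. Linear interpolation is an assumption of this sketch.
    """
    levels = sorted(detection_by_level)
    for lower, upper in zip(levels, levels[1:]):
        p_lower = detection_by_level[lower]
        p_upper = detection_by_level[upper]
        if p_lower <= criterion <= p_upper and p_upper > p_lower:
            fraction = (criterion - p_lower) / (p_upper - p_lower)
            return lower + fraction * (upper - lower)
    raise ValueError("criterion is not bracketed by the measured levels")


def pvt_tone_level_db(detection_by_level, offset_db=10.0):
    """PVT tone level: estimated threshold plus the 10 dB offset stated in the text."""
    return threshold_db(detection_by_level) + offset_db


# Hypothetical detection rates at four presentation levels.
example = {10: 0.0, 15: 0.25, 20: 0.75, 25: 1.0}
estimated_threshold = threshold_db(example)     # 17.5 dB
tone_level = pvt_tone_level_db(example)         # 27.5 dB
```

Because the threshold was determined separately for sitting and walking, such a calculation would be run once per posture for each participant.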
The adaptation and test phases of the experiment lasted for 3 h. During this period, the dependent variables (saliva samples, PVT and KSS) were recorded at intervals of approximately 30 min, with measurements centred on minutes 5, 30, 60, 90 and 110 in the adaptation phase and minutes 130, 150 and 180 in the test phase. For minutes 30 and 90 in the adaptation phase, only the KSS was recorded.
The PVT test at each interval consisted of two blocks of 3 min each. The first block was conducted immediately prior to, and the second immediately after, the interval point at which KSS and saliva samples were taken. The combined results from both PVT blocks were analysed as one block of 6 min, having responses to approximately 60 stimuli. The PVT data were cleaned by omitting assumed errors of omission (RT greater than twice the participant’s median RT) and assumed errors of commission (RT <100 ms). The cleaned data displayed a non-normal distribution; thus, the RT for each test interval was characterised using the median of responses to the 60 stimuli. This reduced the RT data to 40 responses (one per participant) at each of the six test intervals.
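The cleaning rule described above can be illustrated as follows. This sketch assumes that the participant's median is taken over the raw responses in the same 6-min block, which the text does not specify; the function names and example values are hypothetical.

```python
from statistics import median


def clean_reaction_times(rts_ms, min_rt_ms=100.0, max_factor=2.0):
    """Drop assumed errors of commission (< 100 ms) and omission (> 2 x median)."""
    upper_limit = max_factor * median(rts_ms)   # assumption: median of the raw block
    return [rt for rt in rts_ms if min_rt_ms <= rt <= upper_limit]


def interval_rt(rts_ms):
    """Characterise one participant's test interval by the median of the cleaned RTs."""
    return median(clean_reaction_times(rts_ms))


# Hypothetical block: 80 ms is treated as a commission error, 900 ms as an omission.
block = [350, 360, 80, 340, 900, 370]
cleaned = clean_reaction_times(block)    # [350, 360, 340, 370]
summary = interval_rt(block)             # 355.0
```

Using the median as the summary statistic keeps the result insensitive to any remaining slow responses, consistent with the non-normal distribution noted in the text.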
After the adaptation phase, the light condition changed to one of the four test settings (Table 2), and both participants changed from being seated to walking on the treadmill. The treadmill was set to the walking speed that was established in the preparation time, and they walked for the whole hour at that same speed, including whilst giving saliva samples and performing the PVT test.
2.5 Sample
This study recruited participants through emails posted to volunteer recruitment lists of university staff and students, with the following inclusion criteria: aged 18 years to 30 years, healthy (assessed by self-report of no short or long-term medication use, non-smoking and no history of health issues), a habitual bedtime before or at midnight, no recent overnight work (for the preceding one-year period) or travel over a time zone in the last three months. None of the participants had taken part in our earlier study, 40 ensuring that all were new to the experiment and avoiding potential bias or previous learning effects. Forty participants were recruited, with ten (five males and five females) allocated to each of the four test conditions. Their median age was 21 years, ranging from 18 years to 29 years.
Participants were asked to keep a steady sleep–wake schedule for the seven days prior to the experiment. A daily email was sent to remind participants to maintain the sleep–wake schedule and this was confirmed through a self-reported sleep–wake diary for that period. To avoid a possible influence on the melatonin analysis, on the day of their experiment participants were asked to not eat bananas or chocolate during the day, nor take any medication, to avoid consuming substances after midday which contain alcohol or caffeine and to refrain from napping. During the experiment, orange juice, nuts and water were provided for participants as refreshment. Upon finishing the experiment, participants received remuneration of £40.
Ethical approval for this experiment was received from the University of Sheffield Research Ethics Committee on 21 September 2022 (reference number 042711). In accordance with this, informed consent was obtained from all test participants and all recorded data were anonymised.
3. Results
3.1 Data normality
The data gathered for each dependent variable (percentage melatonin suppression, RT and KSS) were tested to determine whether they were drawn from populations with a normal distribution. This was done using four methods of analysis: measures of distribution shape (skewness and kurtosis), statistical tests (Shapiro–Wilk and Kolmogorov–Smirnov tests), comparing measures of central tendency, and graphical representations (histogram and box plot). The results did not suggest that any of the dependent variables were normally distributed and thus analyses were conducted using non-parametric tests.
3.2 Psychomotor vigilance test
Figure 4 shows the median RT at the six test intervals where this was measured. Over the 3-h experiment, the median RT progressively decreased, suggested by the Friedman test to be a significant change across the intervals (p < 0.0001). A subsequent series of pairwise comparisons was conducted using post hoc Wilcoxon tests. To limit the risk of Type I and Type II errors, post hoc pairwise comparisons were corrected using Holm–Bonferroni. 62 This indicated a longer RT at the first measurement interval (5 min; median RT = 379 ms) than at the other test intervals (p < 0.05 in each case). The RT at the final test interval (180 min; median RT = 329 ms) was significantly shorter than at 150 min and 130 min.
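The Holm–Bonferroni step-down correction applied to the pairwise comparisons can be sketched generically as below. This is a textbook form of the procedure, not the software used for the reported analysis, and the p-values in the example are invented.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected.

    P-values are tested from smallest to largest against alpha / (m - rank);
    testing stops at the first p-value that fails its threshold.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, index in enumerate(order):
        if p_values[index] <= alpha / (m - rank):
            reject[index] = True
        else:
            break
    return reject


# Hypothetical p-values from four pairwise Wilcoxon comparisons.
decisions = holm_bonferroni([0.01, 0.04, 0.03, 0.005])
# -> [True, False, False, True]
```

Compared with a plain Bonferroni correction, the thresholds relax as each hypothesis is rejected, which retains control of the family-wise Type I error rate while losing less power.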

Figure 4. Median RTs at each test interval as measured using the acoustic PVT. Error bars show the interquartile range (IQR); shading distinguishes between the adaptation and test phases
For the adaptation phase, all participants were exposed to the same lighting condition; thus, no differences were expected between the groups subsequently allocated to the four test lighting conditions. Comparing the differences between the groups at each interval in the adaptation phase (5 min, 60 min and 110 min) therefore tests whether participants were fairly assigned to each group. The Kruskal–Wallis test did not suggest any differences to be significant (p ≥ 0.35 in each case).
Figure 5 shows the median RT under each light condition for the three intervals of the test phase. The aim is to determine whether RT (and similarly melatonin and KSS score) in the test phase differed from that in the adaptation phase according to the different lighting conditions. In some studies,63,64 there is only one measure of the dependent variable to characterise each of the adaptation and the test phases: for example, Brainard et al. 64 measured melatonin at the end of their 2 h adaptation phase and the end of the 90 min test phase. In the current work, the dependent variables were measured at several intervals within each phase to allow the change to be monitored, for example, to show that the expected increase in melatonin level was revealed. For the RT analysis, we omitted data from the first adaptation interval (5 min) to offset the apparent learning effect and used the average of the final two adaptation intervals (60 min and 110 min), which were not suggested to be significantly different. Melatonin levels and KSS scores changed significantly as expected during the adaptation phase: a progressive increase in melatonin and a progressive decrease in KSS. Following previous work, for these two variables we included only measurements from the final adaptation interval.35,65,66

Figure 5. Median RTs at the adaptation phase (average of RTs at 60 min and 110 min) and at each test interval during the test phase according to the light condition during the test phase. Error bars show the IQR. Lighting conditions defined here by melanopic EDI because the photopic illuminances for L2 and L3 are identical (8 lx)
For each light condition, the change in RT over successive intervals was tested using the Friedman test. For light conditions L1, L2 and L3, the Friedman test did not suggest any differences in RT between the measurement intervals to be significant (p > 0.126 in each case). In other words, the transition from lighting in the adaptation phase to lighting in the test phase did not change RT significantly under light conditions L1, L2 or L3.
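For readers unfamiliar with the Friedman test, the statistic can be computed as in the following sketch. It is a plain implementation of the standard rank-sum formula without the correction for ties, not the software used for the reported analysis; the function names and RT values are hypothetical. Each row holds one participant's RTs at the intervals being compared.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upwards, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1   # positions are 0-based, ranks 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def friedman_statistic(rows):
    """Friedman chi-square statistic for n participants (rows) by k intervals."""
    n = len(rows)
    k = len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for column, rank in enumerate(average_ranks(row)):
            rank_sums[column] += rank
    sum_of_squares = sum(total * total for total in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_of_squares - 3.0 * n * (k + 1)


# Hypothetical RTs (ms) for three participants at four intervals, all decreasing.
example_rows = [
    [380, 360, 350, 330],
    [400, 390, 370, 350],
    [360, 355, 340, 320],
]
statistic = friedman_statistic(example_rows)   # 9.0, the maximum for n = 3, k = 4
```

Under the null hypothesis the statistic is compared with a chi-square distribution with k − 1 degrees of freedom, although this approximation is coarse for small groups.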
For lighting condition L4, the Friedman test suggested a significant effect (p = 0.001). Pairwise comparisons using the Wilcoxon test suggested that RTs at the third (180 min) interval in the test phase were significantly shorter than the average RT in the adaptation phase (p < 0.05). Also, the RT at 180 min was significantly shorter than those at 130 min and 150 min (p < 0.05), and the RT at 150 min was significantly shorter than that at 130 min (p < 0.05). In other words, under L4 the RT tended to decrease with time through the test phase, whereas for L1, L2 and L3 there was no change.
The Friedman test compares changes in responses within each light condition group across the measurement intervals in the test phase, a within-subjects analysis. An alternative approach is to compare responses between the different groups at the same measurement interval, a between-subjects analysis. This was done using the Kruskal–Wallis test: no differences were suggested to be significant (p ≥ 0.31 in each case).
3.3 Melatonin
Figure 6 shows the median melatonin levels of the 40 participants at each interval. The melatonin levels progressively increased as the measurement interval approached habitual bedtimes. The Friedman test indicated a statistically significant difference in melatonin levels with time (p < 0.0001). Pairwise tests using Wilcoxon suggested that differences in melatonin between all the intervals were significant (p < 0.05).

Figure 6. Median melatonin levels derived from saliva samples collected at each test interval. Error bars show the IQR; shading distinguishes between the adaptation and test phases
For the adaptation phase, the Kruskal–Wallis test was used to determine group differences at the same interval. There were no significant differences (p ≥ 0.73 in each case), which suggests a fair distribution of participants across the four lighting conditions.
Following previous studies35,65,66 the effect of light on melatonin suppression at intervals in the test phase was analysed by calculating the percentage change in melatonin levels relative to the final interval of the adaptation phase. This is shown in Figure 7.
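The normalisation can be sketched as follows. The melatonin values and their unit are hypothetical, the function name is illustrative, and the exact form of the normalisation (change relative to the 110 min level, expressed as a percentage of that level) is an assumption about how the percentage was computed.

```python
from statistics import median

def percent_change(level, baseline):
    """Percentage change of a melatonin level relative to the baseline level."""
    return 100.0 * (level - baseline) / baseline

# Hypothetical salivary melatonin (pg/mL) keyed by interval in minutes.
# The 110 min value is the final interval of the adaptation phase.
participants = [
    {110: 4.0, 130: 5.0, 150: 6.0, 180: 8.0},
    {110: 10.0, 130: 11.0, 150: 12.0, 180: 15.0},
    {110: 5.0, 130: 5.0, 150: 7.0, 180: 9.0},
]

# Median across participants of each participant's own percentage change.
median_change = {
    t: median(percent_change(p[t], p[110]) for p in participants)
    for t in (130, 150, 180)
}
print(median_change)
```

Normalising each participant to their own baseline removes differences in absolute melatonin level between individuals before the groups are compared.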

Median percentage melatonin relative to the last interval in the adaptation phase (110 min) at each interval during the test phase for each light condition. Error bars show the IQR. Lighting conditions defined here by melanopic EDI because the photopic illuminances for L2 and L3 are identical (8 lx)
Analyses were conducted using the Friedman test. For light conditions L1, L2 and L3, the Friedman test suggested significant changes across the measurement intervals (p < 0.005 in each case). Pairwise comparisons using the Wilcoxon test indicated that there were significant increases in melatonin under lighting conditions L1, L2 and L3 (p < 0.05), with the highest percentage reached at 180 min.
For light condition L4, the Friedman test again suggested a significant change in percentage melatonin suppression across the measurement intervals (p = 0.05). Pairwise comparisons using the Wilcoxon test indicated that there was a near-significant reduction in melatonin at interval 180 min compared to the 150 min and 130 min intervals. In other words, exposure to L4 led to suppression of melatonin but only after an exposure of between 30 min and 1 h. This is in line with the result of Nowozin et al. 51
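A pairwise comparison of this kind can be sketched with the Wilcoxon signed-rank test. The percentage values below are hypothetical, the function name is illustrative, zero differences are dropped (one common convention, assumed here), and the exact two-sided p-value is obtained by enumerating all sign assignments, which is only practical for small groups.

```python
from itertools import product

def average_ranks(values):
    """Rank values 1..n; tied values share the mean of the ranks they span."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def wilcoxon_signed_rank(before, after):
    """Return (statistic, exact two-sided p) for paired samples."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    stat = min(w_plus, w_minus)
    total = sum(ranks)
    # Under the null hypothesis every sign pattern is equally likely.
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        wp = sum(r for r, s in zip(ranks, signs) if s)
        if min(wp, total - wp) <= stat:
            extreme += 1
    return stat, extreme / 2 ** len(ranks)

# Hypothetical percentage melatonin for six participants at 150 min and 180 min.
pct_150 = [110, 95, 120, 100, 130, 90]
pct_180 = [98, 87, 105, 103, 110, 84]

stat, p_value = wilcoxon_signed_rank(pct_150, pct_180)
print(stat, p_value)
```

With six pairs and only one participant moving against the trend, the exact p-value is just above 0.05, which illustrates how a consistent reduction in a small group can remain near-significant.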
For the test phase, the Kruskal–Wallis test suggested a significant difference between the groups at the 150 min and 180 min intervals (i.e. from 30 min after the start of the test phase; p = 0.001 and p = 0.004, respectively).
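The between-subjects comparison at a single interval can be sketched as below. The percentage values are hypothetical and do not reproduce the study data, the function names are illustrative, and no tie correction is applied. The constant 7.815 is the 5% critical value of the chi-square distribution with 3 degrees of freedom (four groups), used here as an asymptotic reference only.

```python
def average_ranks(values):
    """Rank values 1..n; tied values share the mean of the ranks they span."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H for a list of independent groups (no tie correction)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    h, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * h - 3.0 * (n_total + 1)

CHI2_CRIT_DF3_05 = 7.815  # 5% critical value, 3 degrees of freedom

# Hypothetical percentage change in melatonin at one test-phase interval.
groups = {
    "L1": [10, 35, 40, 65],
    "L2": [15, 30, 45, 60],
    "L3": [20, 25, 50, 55],
    "L4": [-30, -20, -10, 0],
}

h_stat = kruskal_wallis_h(list(groups.values()))
print(h_stat, h_stat > CHI2_CRIT_DF3_05)
```

In this hypothetical example the three lower-level groups overlap completely in rank while the fourth sits below all of them, which is enough for H to exceed the critical value.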
Here, we analysed melatonin suppression as the percentage change in melatonin following previous work.35,65,66 Papamichael et al. 35 analysed absolute melatonin levels in addition to percentage melatonin suppression. Repeating analysis of the current data using absolute melatonin levels reached the same conclusions except for one change. Using the Kruskal–Wallis test to examine between-subject differences at a given time interval suggested a significant difference at only the 180 min interval (p = 0.009), whereas analysis of percentage melatonin suppression suggested a significant difference at the 150 min and 180 min intervals. Pairwise comparisons using the Mann–Whitney test suggested significantly higher melatonin suppression under light condition L4 than under the other light conditions.
3.4 Self-reported sleepiness
Figure 8 shows the median KSS scores at each test interval. The Friedman test indicated a statistically significant (p < 0.0001) change in KSS scores across the measurement intervals. Overall, there is a progressive decrease in the KSS score as the measurement interval nears habitual bedtimes, reflecting a tendency to report feeling more sleepy at these times. Pairwise Wilcoxon tests suggested that the KSS scores at the 5 min and 30 min intervals were significantly higher (feeling more alert) than at all other intervals (p < 0.05 in each case); the KSS score at the 5 min interval was significantly higher than at 30 min (p < 0.05); and the KSS score at the 60 min interval was significantly higher than at 110 min (p < 0.05).

Median KSS scores reported at each test interval. Error bars show the IQR. Note for KSS score: 1 = very sleepy, 9 = extremely alert. Shading distinguishes between the adaptation and test phases
There is a significant increase (p < 0.05) of about one unit of the KSS score between measurements at 110 min and 130 min, which coincides with the participants’ transition from being seated to walking on the treadmill. This decrease in perceived sleepiness is an expected result of the physical activity undertaken. 67 Subsequently, at intervals 150 min and 180 min, the KSS score returned to the pre-walking sleepiness level of 110 min, with pairwise differences between 110 min and either 150 min or 180 min not suggested to be significant (p > 0.05).
The Kruskal–Wallis test was used to determine group differences at the same interval. For the adaptation phase, there were no significant differences (p ≥ 0.45 in each case), which suggests a fair distribution of participants across the four lighting conditions.
The effect of light condition on KSS was tested using the Friedman test, comparing KSS scores in the final interval in the adaptation phase (110 min) with the three intervals in the test phase (Figure 9).

Median KSS scores reported at the last interval in the adaptation phase (110 min) and at each test interval during the test phase according to the light condition during the test phase. Error bars show the IQR. KSS score: 1 = very sleepy, 9 = extremely alert. Lighting conditions defined here by melanopic EDI because the photopic illuminances for L2 and L3 are identical (8 lx)
For light conditions L1 and L4, the Friedman test suggested significant changes across the measurement intervals (p < 0.044): pairwise comparisons using the Wilcoxon test revealed only one significant effect, an increase in KSS score (i.e. feeling more alert) between the 110 min and 180 min intervals (p < 0.05) under light condition L4. For lighting condition L3 the changes were suggested by the Friedman test to be near significant (p = 0.07), but pairwise tests did not reveal any significant differences; for L2, the Friedman test did not suggest any significant differences (p = 0.62). Within the test phase, the Kruskal–Wallis test did not suggest significant differences between the groups (p ≥ 0.18 in each case).
The KSS data therefore again show that the lighting condition with the highest melanopic EDI (L4, approximately 100 lx) led to an effect on an alertness measure, perceived sleepiness, but only after an exposure of somewhere between 30 min and 60 min. Rating scale responses are notoriously noisy, 68 and this may explain why differences between measurement intervals revealed within the overall dataset were less prominent when analysing the smaller samples of the individual lighting condition groups.
4. Discussion
An experiment was conducted to investigate the effect of lighting on alertness and melatonin in a context simulating a typical pattern of pedestrian exposure to lighting in the evening. Three dependent variables were measured: RT to an acoustic stimulus, melatonin derived from saliva and self-reported sleepiness. This extended previous work by using a pattern of light exposure and activity level better resembling pedestrian activity in the evening and by using a control (extreme) light condition to confirm the null findings of previous work.
Overall, none of the lighting conditions L1, L2 and L3 had a significant effect on melatonin, RT to an auditory stimulus or self-reported sleepiness. Those conditions had photopic illuminances of <0.5 lx to 8 lx (melanopic EDIs of up to 10.4 lx). When this was increased to 83 lx (melanopic EDI of 98.8 lx, lighting condition L4), the experiment revealed significant effects, specifically a reduction in RT, suppression of the increase in melatonin and an increase in perceived alertness. This implies that, under an evening, moderate-walking protocol with exposure for about 1 h, the threshold at which NIF responses are triggered lies at a photopic illuminance somewhere between 8 lx and 83 lx (a melanopic EDI somewhere between 10.4 lx and 98.8 lx). Further work would be needed to locate that threshold more precisely.
Across the 3 h duration of the experiment, melatonin levels increased and KSS scores decreased, which are trends in the expected directions as habitual bedtime is approached.69,70 For the PVT data, there were notable reductions in RT from the first to subsequent intervals (which can be ascribed to a learning effect 71 ) and from the penultimate to final intervals, which can be ascribed to participants being aware that the experiment was nearly finished, an end-spurt effect where performance declines with time-on-task and then improves as the task approaches completion.72,73
That lighting conditions L1, L2 and L3 did not lead to significant differences in the dependent variables confirms the results reported in previous studies,38–40 in which the photopic illuminances ranged from <0.5 lx to 8 lx (and melanopic EDI ranged from 0.5 lx to 10.4 lx). However, the condition (L4) labelled as extreme in the current work, having a photopic illuminance of 83 lx (melanopic EDI of 98.8 lx), resulted in significant reductions in RT and a near-significant suppression of melatonin after 30 min of exposure (the 150 min interval). These findings suggest that exposure to 100 lx melanopic EDI for 30 min would be required to suppress an increase in melatonin and also to enhance alertness. This is not a proposal that such an extreme lighting condition be used in road lighting design: it was included as an experimental control to show that the experiment could induce an effect, and thus to validate the null findings of previous work.
We repeated the previous study 40 with the addition of an extreme condition following the recommendation to do so by Veitch et al. 41 An extreme condition is one which, according to the literature, will undoubtedly reveal an effect if the experiment is correctly designed. The inclusion of an extreme condition is therefore one means for testing the experimental design, and supports the conclusion of no effect when less extreme test conditions are used.
In this work, we characterised the lighting conditions using photopic illuminance and melanopic EDI, the latter being the recommended approach.21,22 Melanopic EDI assumes that ipRGC-driven responses dominate NIF outcomes; however, literature suggests a more complex interplay involving cones and rods, particularly at lower light levels.18,20 Therefore, we also report (Table 2) the other alpha-opic values for each light condition to enable further analysis by others, as recommended by Knoop et al. 74 For any of these values, light condition L4 has the largest magnitude and would therefore be described as extreme using any of the values.
A limitation of this study is that only young participants were included, aged between 18 years and 30 years. Older people are expected to have different responses to light. With increasing age, there are changes in the central visual pathways,75,76 which can influence NIF responses; older people may have different sleep patterns and circadian rhythms than younger people; 77 finally, there is an age-related decline in cognitive performance, where older individuals can experience reduced cognitive performance compared to younger individuals. 78 Future research should consider incorporating a more diverse age range to better understand the potential variations in the non-visual effects of road lighting on alertness across different age groups.
A further limitation is that day-time light exposure was not controlled; although participants reported their time spent outdoors on the day of the experiment, this was not standardised and may have influenced the outcomes.
The experiment was designed to maintain a walking speed offering moderate-intensity exercise. Further work is required to determine whether walking speed varies between daylight and after dark or between times of day when there may be a natural variation in alertness.
5. Conclusion
An experiment was carried out to examine how lighting influences alertness and melatonin in a context resembling pedestrian exposure in the evening. Three dependent variables were measured: RT to an auditory stimulus, melatonin derived from saliva samples and subjective sleepiness. For photopic illuminances up to 8 lx (melanopic EDI of up to 10 lx), these data do not suggest an effect on alertness or melatonin, confirming the findings of previous studies using a similar melanopic EDI upper limit.38–40
However, increasing the photopic illuminance to 83 lx (melanopic EDI to 98.8 lx: lighting condition L4) revealed a reduction in RT and subjective sleepiness and a decrease in melatonin. The recommended average horizontal photopic illuminance for road lighting for pedestrians ranges from 2 lx to 15 lx. 1 Lighting condition L4 used a photopic illuminance of 83 lx at the eye with a CCT of 5800 K: to reach the same melanopic EDI with a CCT of 2700 K would require a photopic illuminance at the eye of 230 lx. Thus, while lighting can affect the alertness of pedestrians, the results of this study suggest the conditions required are unlikely to be found in road lighting applications. This is not a suggestion that light levels for pedestrians should be raised, but rather that variation of lighting conditions within those conventionally used is unlikely to have a significant effect on NIF responses. Moreover, current road lighting design increasingly emphasises reducing light pollution and, where possible, avoids illuminating vertical surfaces at eye level, which further limits the likelihood of such conditions occurring in practice.
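The relationship between the 83 lx and 230 lx figures can be illustrated through the melanopic daylight efficacy ratio (melanopic DER), taken here as melanopic EDI divided by photopic illuminance. In the sketch below the function names are illustrative, and the ratio for the 2700 K source is back-calculated from the 230 lx figure quoted above rather than measured, so it is an assumption.

```python
def melanopic_der(melanopic_edi_lx, photopic_lx):
    """Melanopic DER: melanopic EDI per unit of photopic illuminance."""
    return melanopic_edi_lx / photopic_lx

def photopic_for_target_edi(target_edi_lx, der):
    """Photopic illuminance (lx) needed at the eye to reach a target melanopic EDI."""
    return target_edi_lx / der

TARGET_EDI_LX = 98.8  # melanopic EDI of lighting condition L4

der_5800k = melanopic_der(TARGET_EDI_LX, 83.0)   # L4 as tested
der_2700k = melanopic_der(TARGET_EDI_LX, 230.0)  # implied by the 230 lx figure (assumption)

print(round(der_5800k, 2), round(der_2700k, 2))
print(round(photopic_for_target_edi(TARGET_EDI_LX, der_2700k)))
```

The lower ratio of the warmer source is the reason that nearly three times the photopic illuminance would be needed at the eye to deliver the same melanopic EDI.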
Footnotes
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was conducted within the LightCAP project which received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. 860613.
