Abstract
A bicycle bell is an important safety add-on and is used by the vast majority of bicycle users. Traditional bicycle bells make a single sound based on the construction of the bell, ranging from the gentle ring of a bell to the honk of a horn. A new wave of digital bells is now appearing, which can make a range of sounds. One of the challenges for a digital bell is deciding what sorts of sounds should be made, how they will be responded to, and whether they are perceived as suitable for use in a digital bell. In this paper we present a study on the localizability of a set of digital bicycle bell sounds, and how they are perceived along a range of relevant perceptual and aesthetic dimensions.
Background
Bicycle bells are seen as an important safety measure by bicycle riders and pedestrians alike. In some parts of the world, including the EU and some US states, it is legally mandated that bicycles should possess bells as a safety measure. The range of bicycle bells available is enormous, ranging from the traditional bicycle bell, to the more spherical and pleasant-sounding bell, usually with a very long-lasting ring, to more klaxon- and horn-like bells which have more aggressive and often louder sounds. Some bells are designed specifically for riding in the countryside and on trails, where the ringer can be set to stay on until the rider turns it off. The common feature of all of these types of bells is that each is an analogue device that tends to make only one sound, and the sound it makes is linked to the structure of the bell. The gentle sound made by a half-spherical, metallic bell is very different to that made by a long, horn-shaped structure designed expressly to make the loudest sound possible with a relatively small device.
It is now possible to create digital bicycle bells. These bells can produce more than one sound, and the sound used might be selected by the rider from a range of alternatives. While the resulting sound will be somewhat determined by the structure of the bell, particularly its frequency response if it relies on a loudspeaker, the use of digital sound files in these devices means that it is now possible to enhance traditional sounds, as well as move away from traditional sounds and create new and potentially effective sounds.
Riders purchase bicycle bells for a number of reasons, although safety is typically the primary one. What a rider looks for in terms of a bell’s potential performance as a safety device is not clearly understood, though bell loudness is an important selling point. Some bells available for purchase have potential dB outputs well above permissible and legal public loudness levels in most parts of Europe, for example. While safety, which often translates to loudness, is one important purchasing consideration, people also buy bicycle bells because they are appealing visually, auditorily, or both. The sounds that come from a person’s bicycle bell could be seen as sending out a message about the rider: the rider is cool, has a sense of humor, is concerned about safety but doesn’t want to frighten you to death, and so on. All of these aspects of bicycle bell use and purchase come to the fore when considering the development of sounds for a digital bicycle bell. In this paper we present a laboratory study and a survey looking at several of these aspects.
The Project
This project is linked to a project to develop a smart digital bell. The smart bell is intended to play a range of different sounds, which will be selected by the rider prior to riding. Because bicycles are ridden in a range of different environments, the sounds are divided into three categories: ‘Polite’, which might be used in a suburban environment, or where there is no immediate threat of a collision, or where the rider wants to be cautious; ‘Urgent’, intended for use when there is immediate threat of a collision, and/or in a city or other busy environment; and ‘Ambient’, intended for use in the countryside or on a forest trail or similar. Of course, the rider will be at liberty to use the sounds as they wish, but these three categories are useful divisions for thinking about different types of sounds, what kinds of reactions are expected to them, and how those sounds should be designed in order to achieve the aims of the bell.
The sounds were designed by a sound designer on the basis of a remit from the sponsor of the research, working in tandem with the research team above.
Experiment 1: Localizability
The big advantage of developing a digital, smart bell is that the device can hold several sounds, and these sounds do not need to be exactly the same as those of existing bicycle bells. However, there are some important features of traditional bells which need to carry forward to smart bells. The main purpose of a bicycle bell is to serve as a safety measure, drawing the attention of other riders, pedestrians, and if possible car drivers to the proximity of a cycle. Thus in the first instance a bicycle bell needs to attract attention. Bells can attract attention by being loud, as a loud sound will remain audible farther away than a quieter one; in a free field, sound level falls by about 6 dB with every doubling of distance. However, loud sounds can also startle people, so there is likely to be a point beyond which a bicycle bell does not need to be any louder, and might even be unnecessarily loud. There is much evidence of loud alarms being turned off before people attend to the problem, particularly in clinical environments where alarms are used extensively (Edworthy & Hellier, 2006). An alternative important factor in determining the location of a nearby bicycle might be localizability; that is, the ability to determine the precise location of a bicycle (for example, whether it is behind, in front, to the left or to the right of the observer). Being able to localize the position of a bicycle quickly (and thus get out of the way) is likely to be a contributing factor to avoiding an accident. The localizability of a sound is determined by the relationship between the structure of the sound and some basic features of human hearing and physiology (such as the Head-Related Transfer Function, HRTF; Xie, 2013). Localizability is also helped by using broadband sound or noise (sound with many frequency components over a large range), and by having some relatively low-frequency components in the sound (Blauert, 1997; Vaillancourt et al., 2013).
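The 6 dB figure mentioned above follows from the inverse-square law for a point source in a free field. The following is a minimal sketch of that arithmetic only, assuming an idealized point source with no reflections, absorption, or background noise; the function name, the 1 m reference distance, and the example level are illustrative assumptions and are not taken from the study.

```python
import math

def level_at_distance(ref_level_db: float, ref_distance_m: float, distance_m: float) -> float:
    """Sound level at distance_m, given the level measured at ref_distance_m.

    Assumes an ideal point source in a free field (inverse-square law).
    """
    if ref_distance_m <= 0 or distance_m <= 0:
        raise ValueError("distances must be positive")
    return ref_level_db - 20.0 * math.log10(distance_m / ref_distance_m)

# Hypothetical example: a bell producing 70 dB at 1 m.
for d in (1, 2, 4, 8):
    print(f"{d} m: {level_at_distance(70.0, 1.0, d):.1f} dB")
```

Each doubling of distance lowers the level by 20·log10(2), roughly 6.02 dB, which is why additional loudness buys progressively less extra range.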
One of the key challenges in the development of the smart bell is to develop sounds which might facilitate good localization, as well as to develop a device which would replicate the sounds well enough to conserve this move towards more effective and efficient localization. In the study presented here, we test the localizability of 16 sounds designed for potential inclusion in the smart bell, using a paradigm developed for testing the relative localizability of clinical alarms (Edworthy et al., 2017; Edworthy et al., 2018). These earlier studies showed that alarm sounds deliberately designed to possess many harmonics over a large range (in this case auditory icons, which are complex real-world sounds) result in better localizability than traditional alarms which were harmonically less complex.
Design and Sounds
Two sets of eight bicycle bell sounds were designed by a sound designer, labelled ‘Polite’ and ‘Urgent’. The ‘Polite’ sounds are intended for use in non-critical situations and less busy environments, where the rider wants to indicate that they are in the vicinity and it would be beneficial for the observer to move out of the way or take some kind of non-urgent action. The ‘Urgent’ set is intended for critical use, where the danger of collision is imminent, or where it is very busy or crowded.
The designer generated eight sounds for each of the two categories (reduced from a larger number of draft sounds through a three-way process of selection and elimination between the designer, ourselves, and the sponsor), using as a basis either traditional bicycle bells, other types of bell, or wholly digitally-generated sound. Without manipulating loudness, the designer also imbued a sense of politeness in the Polite sounds and a more urgent sense into the Urgent sounds using factors such as pitch, speed, inharmoniousness and so on. The sounds did not systematically vary from one another, and there were not equal numbers of each type. The eight Polite sounds were labelled as follows: HP1-5, OP1-2, and WP1. The eight Urgent sounds were labelled HU1, OU1-3, and WU1-4.
Method
Participants: 31 participants took part in the study. 25 identified as female, 5 as male, and one as non-binary. All but two participants were 25 years of age or under. All were members of the Plymouth School of Psychology participant pool. None had uncorrected hearing loss.
Procedure: A set of 8 Anker Soundcore minispeakers was arranged in a circle around the participant, spaced 40 cm apart, 140 cm away from the participant at a height of 118 cm (Figure 1). Each of the 8 sounds in a set was played 8 times, once from each speaker, and the participant was required to identify which of the speakers had sounded by clicking on the corresponding speaker in a layout mimicking the speakers’ positions, shown on a Microsoft Surface tablet placed in front of them. They were given 8 seconds to respond, after which they were timed out and their response recorded as incorrect. There was an interval of 2 seconds between trials. The sounds were played at a level of 70 dB at the speaker. There were 64 trials in one run, each run lasting approximately 15 minutes. Participants carried out two runs for each of the two sets of sounds, counterbalanced across participants. The whole procedure lasted approximately one hour.

Figure 1. Localizability setup.
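The trial structure described in the procedure (8 sounds crossed with 8 speaker positions, with an 8-second response window) can be sketched as follows. This is an illustrative reconstruction rather than the software used in the study: the random ordering of trials, the integer indices for sounds and speakers, and the function names are all assumptions.

```python
import random
from typing import List, Optional, Tuple

N_SOUNDS = 8      # sounds per set, as in the procedure
N_SPEAKERS = 8    # loudspeaker positions
TIMEOUT_S = 8.0   # responses slower than this are recorded as incorrect

def build_run(rng: random.Random) -> List[Tuple[int, int]]:
    """Return one run: every (sound, speaker) pair once, in shuffled order.

    The shuffling is an assumption; the source does not state the trial order.
    """
    trials = [(s, k) for s in range(N_SOUNDS) for k in range(N_SPEAKERS)]
    rng.shuffle(trials)
    return trials

def is_correct(target_speaker: int, chosen_speaker: Optional[int], rt_s: float) -> bool:
    """Score a trial; a missing or timed-out response counts as incorrect."""
    if chosen_speaker is None or rt_s > TIMEOUT_S:
        return False
    return chosen_speaker == target_speaker

run = build_run(random.Random(0))
print(len(run))  # 64 trials per run
```

Accuracy for a given sound is then the proportion of its 8 trials in a run that are scored as correct.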
Results
Figure 2 shows boxplots of the accuracy scores for each of the 16 sounds. The results show that localizability of all of the sounds was good, but was better for some than for others. Reaction Time data (not shown here) tended to follow the same pattern, with some sounds producing both higher accuracy scores and lower reaction time scores. There was no evidence of a speed-accuracy trade-off.

Figure 2. Box plots of accuracy scores.
A one-way analysis of variance (ANOVA) of the Polite sounds showed a statistically significant main effect for sound, with a medium effect size [F(7,154) = 0.007, p < 0.001, η2 = 0.07]. The same analysis was carried out for the Urgent sounds, showing a significant effect for sound [F(7,154) = 3.85, p < 0.001, η2 = 0.07]. For the Polite sounds, post hoc tests (Tukey’s HSD) showed that sound OP1 was localized more accurately than most of the other sounds. For the Urgent set, OU3 produced lower accuracy than some other sounds.
The Reaction Time data (not shown here) also showed significant differences between some of the sounds. Most notably, reaction times to WU1 were considerably slower than those to all of the other sounds.
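For readers unfamiliar with the statistics reported above, the following sketch shows how an F ratio and η2 are obtained in the simplest, between-groups form of a one-way ANOVA. It is an assumption-laden illustration: the study's sounds were all heard by the same participants, so the analysis actually run may have been a repeated-measures variant, the software used is not stated, and the scores below are made up rather than taken from the data.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of groups.

    Between-groups form only; a repeated-measures design would also
    partition out participant variance.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_sq = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_sq

# Made-up accuracy scores for three hypothetical sounds (not study data).
scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df1, df2, eta_sq = one_way_anova(scores)
print(f"F({df1},{df2}) = {f_stat:.2f}, eta^2 = {eta_sq:.2f}")
```

Here η2 is the share of total variance attributable to differences between sounds, which is the sense in which an effect size is described as small, medium, or large.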
Experiment 2: Survey
Although safety is the key reason why cyclists purchase and use bicycle bells, it is not the only reason. The cyclist’s impressions of the sounds and what they say about the rider, as well as whether or not they like the sounds, are also important and will have some influence on whether a bell is first purchased and then used. People’s subjective impressions of the bells can also tell us a lot about the efficacy of the design – for example, subjective ratings of urgency and pleasantness can provide useful feedback as to whether or not the remit to design ‘Polite’ and ‘Urgent’ sounds has been met. In this second experiment we describe a survey in which we collected people’s perceptions of some important subjective aspects of the same bell sounds.
Method
Participants: 92 participants took part in the survey. 38 identified as female, 52 as male and 2 as non-binary. 24 students were recruited from the University of Plymouth participant pool and 68 participants were recruited from the sponsor’s survey pool. 68 of the participants indicated that they were regular cyclists. 23 participants were aged 18-24, 28 were aged 25-34, 19 were aged 35-44, 13 were aged 45-54, 9 were aged 55-64, and one was 65+.
Procedure
The same 16 sounds were used as in the Localizability study. A survey was set up on Qualtrics (2022), an online platform used by researchers and other investigators for setting up surveys and collecting data from them, which participants completed online. After logging in to the study and confirming their consent, participants heard each of the 16 sounds, presented in a different random order for each participant. For each sound, they were asked to answer each of the following questions on a 1-9 Likert scale, with 1 = ‘not at all’ and 9 = ‘very much’:
How urgent do you perceive this sound to be?
How pleasant do you find this sound?
How ‘cool’ or fun do you find this sound to be?
To what extent would you expect to hear this sound coming from a bicycle?
Participants were also asked demographic questions and some specific questions about their cycling activities.
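The ratings collected in this way lend themselves to a simple per-sound, per-measure summary. The sketch below shows one way such means could be computed; the record layout, the measure names used as keys, and the example responses are assumptions for illustration and do not reproduce the Qualtrics export or the study data.

```python
from collections import defaultdict
from statistics import mean

MEASURES = ("urgency", "pleasantness", "coolness", "appropriateness")

def mean_ratings(responses):
    """Average 1-9 ratings per (sound, measure).

    responses: iterable of (participant_id, sound_label, measure, rating).
    """
    buckets = defaultdict(list)
    for _pid, sound, measure, rating in responses:
        if measure not in MEASURES:
            raise ValueError(f"unknown measure: {measure}")
        if not 1 <= rating <= 9:
            raise ValueError(f"rating out of range: {rating}")
        buckets[(sound, measure)].append(rating)
    return {key: mean(vals) for key, vals in buckets.items()}

# Made-up responses for illustration only (not study data).
demo = [
    ("p1", "OP1", "urgency", 3), ("p2", "OP1", "urgency", 5),
    ("p1", "WU1", "urgency", 8), ("p2", "WU1", "urgency", 9),
]
summary = mean_ratings(demo)
print(summary[("OP1", "urgency")], summary[("WU1", "urgency")])
```

Keying the summary on (sound, measure) keeps the four dimensions separate, so the Polite and Urgent sets can be compared on each dimension independently.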
Results
The mean scores for each sound on each measure are shown in Figure 3.

Figure 3. Mean scores for each sound on each measure.
The survey shows some interesting findings which are useful for design purposes and for progressing the project. First, the data for both Pleasantness and Urgency show that the design remit for both of these sets of sounds has been met in broad principle. The Polite sounds generally obtained high or middle ratings of Pleasantness, scoring higher on this dimension than all of the eight Urgent sounds. The same is true for the ‘Urgent’ judgment in that all of the Urgent sounds produced higher ratings than any of the Polite sounds. In addition, some of the sounds within the sets rate higher than others. Two of the Urgent sounds score very high on urgency, and on further listening to these two sounds, the general feeling was that they are possibly too urgent. Any sound scoring more than about 7.5-8 is possibly too urgent even for the purpose of a bicycle bell. There are no similarly high scores for the Pleasantness measure, though it is hard to imagine a sound which is ‘too pleasant’.
The findings for Appropriateness and Coolness also show some interesting trends. Most interesting is that the Urgent sounds, particularly sounds WU1-4, which are all digitally-generated sounds similar to urgent alarms used in other environments, score low on both Coolness and Appropriateness. This suggests that the potential use of those sounds in the smart bell itself might need to be sparing, or cautious at least. In terms of appropriateness, it should be noted that the sounds in the Polite group were more similar to current bicycle bell sounds, which may account for the higher scores obtained for these sounds. However, what seems most apparent from the data for the Urgent sounds is that urgent sounds are not cool. Polite sounds can be cool, and the ‘best’ sounds can be polite, non-urgent, and cool.
Discussion
The two studies provide a first formal test of sixteen potential bicycle bell sounds which might be used in a future smart bell. The localization data show that each of the bells is highly localizable within a controlled laboratory setting. The localizability data also show that there are differences in the localizability of the sounds. One particularly localizable sound was Polite sound OP1, a sound based on a cowbell with a large frequency component at around 300 Hz. It is likely that this component helped with the localizability of this sound (Blauert, 1997; Vaillancourt et al., 2013). This was not true of all the most localizable sounds, however. For example, HU2 performed equally well but did not possess the same low-frequency component. These findings are useful for design purposes and allow the designs to be revisited for further refinement, as well as for generating comparison metrics. The high level of performance obtained in the laboratory will likely fall as the sounds are subject to more rigorous field tests, where the limits of performance aspects such as audibility and localizability will be gauged in a real environment. In terms of generating possible metrics, the data show that participants almost always correctly localized most of the bells, which suggests that localization is at least 8 times more accurate with a bell than with no bell. This metric can be used to emphasize why bicycle bells should be used and are an important safety measure, and can also be further refined during field testing.
One of the key features of a smart bell is that it is capable of making more than one sound. Indeed, one of the key intentions of the smart bell is that it can make sounds which might be suitable in a relatively relaxed cycling setting (such as a forest or a suburban environment) as well as in a critical, time-sensitive one. These sounds will differ in quality from one another in several different ways, and the rider would expect the sounds to differ from one another. Our survey shows that sounds intended for different environments and circumstances differ considerably in several different ways. For example, all of the sounds intended for polite use are judged as sounding more polite than those intended for urgent use. The sounds intended for use in more urgent, critical situations and environments are all rated as more urgent than the polite ones. The two factors also very clearly covary; sounds which are polite are not urgent, and sounds which are urgent are not polite. Thus the demarcation of these two categories should be very clear in the final product.
There are some other issues about the bells which are clear from the other two measures of ‘Coolness’ and ‘Appropriateness’. Most strikingly, the urgent sounds score very low on ‘Coolness’, and the more urgent they are, the less cool they appear to be. Thus some caution needs to be exercised when implementing the most urgent of the alarms in the final product. ‘Coolness’ and ‘Appropriateness’ seem to be very clearly correlated, which is an interesting finding. We suggest that this may be partly because the sounds with the lowest coolness ratings are also sounds which listeners would not have heard in bicycle bells before (they are similar to digital alarms such as might be found in a control room or similar), whereas many of the other sounds (particularly most of the Polite sounds) are based on existing bicycle bell sounds.
The Appropriateness ratings, particularly for the Urgent sounds, are low, but we would expect this with new sounds as discussed above. We did not ask participants directly whether they ‘liked’ the sounds or not, because it is well known that people’s liking for stimuli varies with exposure, with people tending to like a stimulus the more they are exposed to it. Thus measures of a listener’s first response to a new sound are not that useful for market research purposes. For complex, unusual stimuli, the number of exposures needed before peak liking is reached is quite large (Berlyne, 1974). As would be expected, bells which sound like current bicycle bells (particularly the first five of the Polite sounds) are judged more acceptable than newer, more adventurous but potentially effective and cool sounds. Particularly successful among these new sounds was OP1, a sound based on a cowbell which scored relatively high on Pleasantness, low on Urgency, relatively high on ‘Coolness’ but lower on Appropriateness, as well as having high Localizability scores. With exposure, the appropriateness values should also rise.
In this paper we report only part of the project. In another part of the project we tested another set of 16 sounds. Of these, eight are intended for use in the countryside and on forest trails, so are different again from the Polite sounds (we refer to them as Ambient, and they may also play continuously rather than only intermittently). We also tested eight other sounds, mostly taken from existing bicycle bells, for comparison with the newly designed sounds.
Practical Takeaways
Localizability is a measurable and useful metric for demonstrating the comparative safety of bicycle bell sounds.
The level of pleasantness and urgency of a bicycle bell tone can be significantly varied, with one largely the inverse of the other. ‘Polite’ and ‘Urgent’ sounds can be used in different locations and riding situations.
In terms of ‘Coolness’, an important marketing metric, urgent sounds tend not to be seen as cool, whereas friendly, musical, and unexpected natural sounds can all be judged as cool.
Acknowledgements
This research was sponsored by Trek Bikes, Waterloo, WI, USA.
