Abstract
Humans use active touch to gain behaviourally relevant information from their environment, including information about contained objects. Although interacting with containers is very common, its perceptual basis remains largely unexplored. The first aim of this study was to determine how accurately people can sense, by touch only, the location of a contained rolling object. Experiment 1 used tubes containing physical balls and demonstrated a considerable degree of accuracy in estimating the rolled distance. The second aim was to identify the relative effectiveness of the various available physical cues. Experiment 2 employed virtual reality technology to present, in isolation and in various combinations, the constituent haptic cues produced by a rolling ball, namely the mechanical noise during rolling, the jolts from an impact with an internal wall, and the intensity and timing of the jolts resulting from elastic bounces. The rolling noise was of primary importance to the perceptual estimation task, suggesting that the implementation of the laws of motion is based on an analysis of the ball’s movement velocity. Although estimates became more accurate when the rolling and impact cues were combined, they were not necessarily more precise. The presence of elastic bounces did not affect performance.
Humans are expert at gaining behaviourally relevant information from their environment through the sense of touch or haptic perception (see Grunwald, 2008). To do so, humans possess a large repertoire of actions, such as the exploratory procedures documented by Lederman and Klatzky (1987). We lift and wield objects to obtain knowledge about their weight; we compress and bend objects to obtain knowledge about their material properties; and we enclose objects in our hands to obtain knowledge about their shape and volume (Carello & Turvey, 2017; Lederman & Klatzky, 1987). Such active engagements with objects have been the topic of scholarly discussion going back to at least the late 18th century (Wagner, 2016). They are typically referred to as active touch (Gibson, 1966) or dynamic touch (Turvey, 1996) and involve an instrumental contribution of motor effort and its consequent sensory signals. When wielding an object, various stimuli impinge on the receptors in the skin, tendons, and muscles; the stimulation, in turn, is a function of our movements and the inertial properties of the object. Active touch is capable of many perceptual feats (Gibson, 1966; Katz, 1989), including estimating the length of a rod only from wielding it (Solomon & Turvey, 1988; Turvey, 1996) or estimating the location of impact on a handheld tool (Miller et al., 2017; Okazaki & Kajimoto, 2014).
Sensing content by touch
Here, we extend our interest from solid objects to contained objects. The pertinent feature of containers is their ability to provide sensory signals related to their content. This is evident in common daily experiences such as when shaking a juice can or rattling a box of chocolate sprinkles to estimate whether there is enough left to drink or to eat (Hirota & Sekiguchi, 2009; Plaisier & Smeets, 2017; Tanaka & Hirota, 2012). These types of everyday experiences have inspired engineers to develop devices that mimic containers with virtual contents, such as balls in a box (Linjama et al., 2005; Sekiguchi et al., 2005; Williamson et al., 2007), balls in a cup (Minamizawa et al., 2012), liquids in a bottle (e.g., Koshiyama et al., 2015), or balls inside a tube (Yao & Hayward, 2006).
Balls in a box
Previous studies investigated people’s ability to obtain information about content in natural settings. Plaisier and Smeets (2017) described an experiment to test the hypothesis that people can estimate an exact number of objects (i.e., wooden balls) inside a box, provided that the number of objects is small enough. The number of balls placed inside a box could be surmised from either hearing or feeling the consequences of shaking the box. Their numerosity was estimated from the collisions of the balls with the container walls, from the collisions between the balls themselves, and from the rolling movement of the balls. Participants in the experiment verbally estimated the number of balls inside a cardboard box after having manipulated it for 5 s. They performed this estimation under two conditions, one in which they received auditory and haptic information from handling the box and one in which auditory recordings from that handling were passively played back. The results showed that participants could perform the perceptual task accurately when the number of balls was between one and three but tended to underestimate their number when it was larger. The main difference between the two conditions was that performance was considerably more variable in the auditory-only condition. The authors attributed this difference to the involvement of a multisensory cue integration process, which produces more precise sensory estimates when participants are provided with congruent sensory information across different senses (Ernst & Bülthoff, 2004).
Our ability to accurately estimate numerosity when the number of balls is small is also reported by Sekiguchi et al. (2005). These investigators employed augmented reality techniques to present participants with a consecutive pair of real boxes each containing several virtual objects (between one and five). The participants’ task was to indicate which of the two boxes contained the most objects. Participants had a nearly perfect ability to report the difference between a box with one single object and a box with more than one object (100% correct), and a near-perfect ability to report the difference between a box with two objects and a box with more than two objects (better than 96%). Performance dropped when discriminating between three objects and four or five objects (92% and 80%, respectively). When four objects were compared with five objects, performance was at chance level (56%).
The container studies of Plaisier and Smeets (2017) and Sekiguchi et al. (2005) demonstrated that people can achieve a considerable level of accuracy in estimating the number of objects inside a container. There were, however, key differences between having only one ball in the container and having more than one ball. With only one ball, there can only be one impact with a wall at any given instant, and the rolling cues can only originate from that single ball. The addition of a second ball, or more, introduces ambiguous information given the possibility of concurrent impacts, mutual collisions between balls, and superposed rolling vibrations.
One ball in a tube
The studies summarised above clearly reveal that wielding a container produces appropriate haptic feedback that allows observers to make inferences about the container’s content. These studies, however, were not designed to reveal the strategies that their participants employed in making their perceptual inferences. A virtual ball apparatus introduced by Yao and Hayward (2006) enabled the investigation of perceptual strategies, as the cues relevant to the task of sensing the location of a freely moving object in a container were generated artificially, thus simplifying the complicated stimulus arrangement of previous box studies. The apparatus comprised a single rigid tube housing a vibrotactile transducer that could be programmed to reproduce key components of the mechanics of a ball rolling inside a tube. These components were (1) the pattern of vibration caused by a ball rolling down the rough inner surface of a tube, (2) the intensity of the impact felt when a ball encountered an internal wall, and (3) the timing of the elastic rebound following impact against an internal wall. Importantly, the apparatus could render these cues virtually, in any desired combination and with any desired properties (see Figure 1a). What the apparatus did not reproduce was (4) the variation of torque as a function of the distance of the ball from the fulcrum.

Figure 1. Apparatuses used in the experiments. (a) A handle contained a recoil actuator and an accelerometer. (b) In Experiment 1, the handle was connected to interchangeable tubes with internal walls and containing a metal ball, with the actuator disengaged. In Experiment 2, the handle could also be connected to a tube without a ball. An external computer (in panel a) simulated and synthesised vibration signals for rolling, impact, and bounce that were transduced by the recoil actuator.
Yao and Hayward (2006) found preliminary evidence that human participants could, through touch, spontaneously perceive the location of a ball rolling inside the cavity of the handheld tube. Without any preliminary training, participants were informed that there were three tubes of equal length but with inner cavities of three different lengths: 18, 24, or 60 cm. In separate conditions, the apparatus rendered either the rolling vibration or the impact of the ball against inner walls. Participants were divided equally between the two conditions and were asked to estimate the length of the cavity by tilting the tube in a controlled two-phase motion: first tipping downwards, then lifting upwards. Following a three-alternative forced-choice paradigm, participants reported their answer by pointing to one of three markings on the tube. Overall, participants could solve the perceptual problem. They gave the correct answer on most trials, although there were two peculiarities in the results. In the rolling vibration condition, the 24-cm virtual length tended to be underestimated as 18 cm (63% underestimations compared with 28% correct identifications). In the impact condition, the 18-cm virtual length was overestimated as 24 cm (63% compared with 20% correct identifications). Interestingly, according to self-reports, participants performed the task by imagining the movement of the virtual ball with their mind’s eye and, indeed, seemed to spontaneously “track” the virtual ball with their gaze during trials.
The virtual simulations provided sufficient information for the participants to perform the task with reasonable success. The between-subject experiment design, however, provided an incomplete picture of the way the participants utilised the sensory information available to them to sense the location of the ball. The experiment did not have a condition where the rolling vibration and the intensity of impact were available together. The elastic rebound following impact was also absent from the testing conditions.
Present study
This study has two aims. The first is to determine how accurately people can sense, by touch only, the location of an unseen rolling object. The second is to identify the relative effectiveness of the various cues available in the natural world by controlling the access to the different sources of information (Cabe, 2010). The experiments reported here were conducted according to the guidelines set out in the Declaration of Helsinki and were approved by the McGill University research ethics board.
Experiment 1: accuracy of locating unseen rolling balls
Untrained participants were asked to move a tube containing a real ball using either small or large angles. The proximal source of sensory stimulation was vibrations impinging on the hand holding the tube.
The two angles were included to preclude a trivial interpretation of the results, namely that participants performed the task using a simple timing heuristic. There is a simple relationship between distance and travel time: longer distances require more time than shorter distances. If participants relied on this heuristic, then estimated distances for large angles should be smaller than estimated distances for small angles because the ball rolls faster with large angles and therefore rolls for a shorter period. If, however, participants were genuinely able to perceive the motion of the ball, the movement angle should be inconsequential to the distance estimates.
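The timing heuristic can be made concrete with a short calculation. The sketch below is an illustration only: it assumes a solid sphere rolling from rest without slipping down a tube held at a constant tilt, for which the acceleration along the tube is (5/7) g sin θ. The function name and the two example angles are hypothetical and are not values from the experiment.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def roll_time(distance_m: float, tilt_deg: float) -> float:
    """Time for a solid sphere to roll distance_m from rest down an incline.

    Assumes rolling without slipping and a constant tilt angle, so the
    acceleration along the incline is (5/7) * g * sin(tilt).
    """
    accel = (5.0 / 7.0) * G * math.sin(math.radians(tilt_deg))
    return math.sqrt(2.0 * distance_m / accel)


# Hypothetical small and large tilt angles (degrees), not measured values.
for distance_cm in (20, 30, 40, 50):
    t_small = roll_time(distance_cm / 100, 10)
    t_large = roll_time(distance_cm / 100, 30)
    print(f"{distance_cm} cm: {t_small:.2f} s at 10 deg, {t_large:.2f} s at 30 deg")
```

Under these assumptions the roll time for any given distance is shorter at the larger tilt, so an estimate based on duration alone would shrink as the movement angle grows.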
Method
Apparatus
Four black, opaque fibreglass tubes, each 60 cm long, were used. Each tube contained a physical ball and had an internal wall set at 20, 30, 40, or 50 cm from the base. The tubes were hidden from the participants’ view behind a partition. An accelerometer and an actuator were concealed inside a detachable handle (see Figure 1), which was attached, on every trial, and in plain view of the participant, to whatever tube was handed to them next. Participants wore headphones playing white noise that effectively masked the sound of the rolling ball.
Procedure
Nine participants (age range: 19–29 years; five females) were handed one of the four tubes with the handle attached and asked to move the tube in a prescribed manner: up to five movements were allowed in the fronto-parallel plane. The tube was always handed to the participant’s dominant hand, and at an upward angle so that the ball was at the base, near the hand. Thus, the typical movement sequence was down-up-down-up-down, although participants were allowed to provide their answer as soon as they felt ready.
Participants indicated the final resting location of the ball by sliding a rubber hair band along the surface of the tube from its initial position at the tube’s base, near the participant’s hand. The participant then handed the tube back to the experimenter who measured the position of the hair band against a measuring tape affixed to the desk in front of them. The position of the band was read off to the nearest 5 mm after which the band was reset to its initial position. The answer was entered into a custom computer program that then gave the next randomised condition.
The task was performed under two conditions. Participants made either large or small angles to create different rolling velocities of the ball. They were instructed that any movement angle was fine as long as there was a noticeable difference between the two conditions. The actual amount of the tilt was calculated after the fact from recordings from the accelerometer in the handle. Prearranged hand signals were used to tell the participant what size angle was required on any given trial. Each combination of target distance and movement angle was tested four times for a total of 32 completely randomised trials. Short breaks between trials were allowed whenever the participants requested them.
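The trial count follows directly from the design. As a minimal sketch, a fully randomised trial list could be generated as follows; this is not the custom program used in the study, and the function name and seed are hypothetical.

```python
import itertools
import random

DISTANCES_CM = (20, 30, 40, 50)   # wall positions given in the Apparatus section
ANGLES = ("small", "large")       # the two movement-angle conditions
REPETITIONS = 4


def make_trials(seed=None):
    """Return a fully shuffled list of (distance_cm, angle) trials."""
    rng = random.Random(seed)
    trials = list(itertools.product(DISTANCES_CM, ANGLES)) * REPETITIONS
    rng.shuffle(trials)
    return trials


trials = make_trials(seed=1)  # the seed is arbitrary, for reproducibility only
print(len(trials))            # 4 distances x 2 angles x 4 repetitions = 32
```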
Before actual testing started, participants were allowed to familiarise themselves briefly with the feel of the ball rolling through the tube, without explicit instruction about the angle of movement. During this familiarisation, participants were shown the actual distance the ball had rolled (by means of a marker on the outside of the tube, in this case, at 30 cm). This lasted only 30 s or 10 movements, whichever came first.
Results and discussion
The results demonstrated that participants were remarkably adept at spontaneously differentiating and estimating the various rolling distances (Figure 2a), albeit with a tendency to underestimate distance as rolling duration increased. All participants produced significantly different angles. Paired t-tests, one for each participant, showed that all t-values were >10.0 and all p-values <.001 (Figure 2b). Despite the significantly different angles produced, there was no difference in the distance estimates between the small and large angle conditions; repeated-measures analysis of variance (ANOVA): F(1, 8) ~ 0, MSE = 0.347, p = .993,

Figure 2. Results of Experiment 1. (a) Average response of all participants: estimated distance versus actual distance. Error bars represent 95% empirical bootstrap confidence intervals based on 10,000 bootstrap samples. (b) Mean angles produced by the participants. Error bars represent 2 SDs. (c) Individual length estimates made using large angles versus those made using small angles. Each symbol represents the average across the four replications of the corresponding condition for one of the participants.
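For readers unfamiliar with the error bars in the caption above, the following sketch shows one way to compute a 95% bootstrap confidence interval for a mean from 10,000 resamples. It assumes that "empirical bootstrap" refers to the basic variant, which inverts the quantiles of the resampled deviations from the observed mean; the article does not spell out the exact procedure, and the data values are made up.

```python
import random
import statistics


def empirical_bootstrap_ci(data, n_boot=10_000, level=0.95, seed=0):
    """Empirical (basic) bootstrap confidence interval for the mean.

    Resamples the data with replacement, collects the deviations of each
    resampled mean from the observed mean, and inverts their quantiles.
    """
    rng = random.Random(seed)
    observed = statistics.fmean(data)
    deltas = sorted(
        statistics.fmean(rng.choices(data, k=len(data))) - observed
        for _ in range(n_boot)
    )
    alpha = 1.0 - level
    lo_delta = deltas[int((alpha / 2) * n_boot)]
    hi_delta = deltas[int((1 - alpha / 2) * n_boot) - 1]
    # Basic bootstrap: subtract the upper deviation to get the lower bound.
    return observed - hi_delta, observed - lo_delta


# Hypothetical distance estimates (cm) from nine participants for one tube.
estimates_cm = [27.5, 31.0, 29.0, 33.5, 26.0, 30.5, 28.0, 32.0, 29.5]
print(empirical_bootstrap_ci(estimates_cm))
```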
A second experiment was designed to uncover the different types of tactile information potentially contributing to the perceptual estimation of the location of rolling objects. To provide a basis for this design, we next discuss an analysis of the tactile signals available from this kind of object. It is instructive to see sensory cues as being associated with invariant relationships between sensory signals or with invariant relationships between motor commands and sensory signals. As with other sensory modalities, tactile invariants can arise from the organisation of the sensory system, from inherent mathematical properties, or from the natural behaviour of objects (Hayward, 2008). The macroscopic mechanical invariants identified in the analysis arise from the law of energy conservation.
Movement duration
Galileo discovered that rolling objects on planes inclined by an angle,
Movement velocity
The rolling vibrations provide information to estimate movement velocity, even if no assumption is made about the action of gravity. These vibrations
Speed at the instant of impact
Collisions are mechanical events in which kinetic energy is lost during a brief instant. In 1656, some 30 years before Newton’s Principia, Christiaan Huygens used collisions to formulate the aforementioned law of conservation of momentum, which applies even when kinetic energy is lost during impact. A concept that follows from the conservation of momentum in collisions is the coefficient of restitution,
In summary, solving the perceptual problem requires some or all of the available signals as well as some prior knowledge. The analysis above identified the available signals to be the inclination,
Experiment 2: relative effectiveness of available cues
Method
Apparatus
The same four tubes and handle from Experiment 1 were used. A fifth tube without any ball was added for a virtual ball condition, which consisted of virtual haptics technology very similar to the one described by Yao and Hayward (2006) and illustrated in Figure 1.
To a good approximation, the accelerometer returned a measurement that was directly proportional to the acceleration of a virtual ball. A computer simulated the key aspects of the physics described earlier and in Figure 1. In a real-time loop, the computer read the virtual acceleration,
To synthesise the rolling noise, a source waveform made of a rectified sine wave provided a generating function,
Procedure
Eight new participants (age range: 19–27 years; five females) were tested using the same task as in Experiment 1 except for two small procedural differences. First, explicit instruction about movement angles was no longer given. Second, the familiarisation phase now consisted of feeling two real rolling balls (a short and a long distance) and one virtual one, this time without any corrective feedback. As before, participants wore headphones playing white noise that effectively masked the sound of the rolling ball.
The task was performed under six conditions, five of which involved a virtual ball that rendered (1) impacts only; (2) rolling only; (3) the combination of impacts and rolling; (4) the combination of impacts and bounces; and (5) the combination of impacts, rolling, and bounces. A sixth condition, using real balls, served as a replication of Experiment 1 as well as a reference for assessing performance in the virtual ball conditions. Each combination of ball condition and the four target distances was tested four times for a total of 96 completely randomised trials.
At no point before or during the experiment were the participants informed about any simulation or virtual balls. In other words, as far as the participants were concerned, the entire experiment was conducted with physical balls. In fact, in informal debriefs after the experiment, once they were informed that they had experienced a simulation, participants typically expressed surprise and, occasionally, disbelief. They were genuinely under the impression of having manipulated a real ball and they frequently asked for the permission to inspect the inside of the tube (for similar participant responses, see the study by Yao & Hayward, 2006).
Results and discussion
Inspection of Figure 3a reveals that performance with real balls was in close agreement with that of Experiment 1. Participants were remarkably good at estimating the rolled distance, and again there was a tendency to underestimate the distance with increasing pre-set length.

Figure 3. Results of Experiment 2. For all panels, filled markers indicate conditions that include bouncing. (a) Average response of all participants: estimated distance versus actual distance. Error bars were omitted for legibility. (Legend also applies for panel b.) (b) From left to right, the mean of estimates normalised relative to performance with the real ball, the overall mean across the four target distances, the individual overall means. Error bars represent 95% empirical bootstrap confidence intervals based on 10,000 bootstrap samples.
Data were reduced to address the experiment’s main question—which tactile information was utilised in the estimations. First, for each participant, estimates were normalised with respect to their performance with the real ball. This step effectively detrended the estimates with respect to rolling distance (see Figure 3b, leftmost panel). Second, the normalised estimates were collapsed across rolling distance to obtain an overall measure of task performance (see Figure 3b, middle panel).
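The two reduction steps can be sketched in a few lines. The article does not state the exact normalisation, so this example assumes a per-participant ratio of each virtual-ball estimate to the real-ball estimate at the same target distance; the condition labels and all numbers are hypothetical.

```python
from statistics import fmean

DISTANCES_CM = (20, 30, 40, 50)

# Hypothetical mean estimates (cm) for ONE participant, keyed by condition
# and listed in the order of DISTANCES_CM. These numbers are made up.
estimates = {
    "real": [19.0, 27.0, 34.0, 41.0],
    "impact": [10.0, 13.0, 17.0, 20.0],
    "rolling": [16.0, 22.0, 28.0, 33.0],
    "rolling+impact": [17.0, 24.0, 31.0, 37.0],
}


def normalise(estimates, reference="real"):
    """Divide each condition's estimates by the real-ball estimate at the same distance."""
    ref = estimates[reference]
    return {
        cond: [value / r for value, r in zip(values, ref)]
        for cond, values in estimates.items()
        if cond != reference
    }


def collapse(normalised):
    """Average the normalised estimates across target distances."""
    return {cond: fmean(values) for cond, values in normalised.items()}


overall = collapse(normalise(estimates))
for cond, score in overall.items():
    print(f"{cond}: {score:.2f}")
```

With a ratio-based normalisation, a score of 1 would mean that a virtual condition matched the participant's real-ball performance, and scores below 1 would indicate relative underestimation.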
The virtual ball conditions revealed that impacts alone are inadequate: rolling distance was grossly underestimated and, for most participants, this was the worst-performing condition (see the green and red markers in the panel for individual performance of Figure 3b for exceptions). It also had the largest inter-individual variance. In contrast, participants performed well with only rolling information, and even better when rolling and impacts were both available. The effect of bounce was small (a 5% improvement) and non-significant, F(1, 7) = 4.13, MSE = 0.006, p = .08,
A repeated-measures ANOVA showed a significant effect of the type of cue—F(2, 14) = 8.39, MSE = 0.360, p = .019,
It seems that participants primarily used the rolling information, suggesting that the implementation of the laws of motion is based on an analysis of the ball’s movement velocity. As the impact cue was largely ineffective on its own, we propose that it contributed by adding energy to the sensory signal that supported the perception of roll velocity. Next, the addition of bounces did not substantially affect performance, which suggests that we can rule out the use of the principle of conservation of energy as effective stimulus information. Finally, even in the condition with all information available, there is an underestimation of length with respect to the real ball condition. This could be construed as indirect evidence for the effectiveness of the Archimedean torque, which was necessarily missing from the virtual ball conditions.
General discussion
This study set out to determine how accurately humans can perceive the motion of a ball rolling inside a tube and to tease out the effective stimulus information they employed. The work extends the results of the earlier rolling ball study by Yao and Hayward (2006) in that its more ecologically valid experimental approach yielded a better quantitative estimate of perceptual accuracy as well as a better notion of the relative efficacy of the various kinds of stimulation. The results demonstrate a reliable ability to perceive the motion of a real ball inside a tube. On average, participants demonstrated a considerable degree of accuracy in estimating the rolled distance, although there was a tendency to underestimate as distance increased. The results appear to be in line with the level of accuracy observed in the balls-in-a-box experiments (Plaisier & Smeets, 2017; Sekiguchi et al., 2005) as well as with the tendency to underestimate (Plaisier & Smeets, 2017).
The study provides several hints as to the perceptual strategies employed by the participants in solving the perceptual task. It appears that the most important information is obtained from the rolling noise. The next most important information comes from impact. The bounce, on the other hand, had effectively no role to play. In terms of the potential perceptual strategy underlying the observed performance, the results are consistent with a perceptual system that uses a combination of two strategies: roll duration and indirect access to object velocity, with the latter carrying the most weight. Although there is little evidence for a strategy that relies on the conservation of momentum, there remains the logical possibility that a strategy based on the laws of levers is at play.
This study also underscores the dominant role of vibratory signals in the perception of objects concealed in containers. It has been known for a long time that several types of mechanoreceptors could subserve the detection of the vibrations of handheld objects. Earlier studies highlight the likely role of the Pacinian corpuscles that have been repeatedly found to be exquisitely responsive to vibrations transmitted by objects in contact with the hand (Cauna & Mannan, 1958; Goodwin et al., 1981; Johansson & Vallbo, 1979). Mechanoreceptor populations possibly also include far-flung muscle spindle receptors (Brisben et al., 1999; Libouton et al., 2012) as well as receptors normally associated with static contact (Gottschaldt & Vahle-Hinz, 1981). Recently, the mechanical waves propagating in the tissues of the upper extremity have been implicated in the transmission of tactile signals (Delhaye et al., 2012; Manfredi et al., 2012; Shao et al., 2016). The central brain processes are largely unknown for the task at hand but are likely to involve a network of sub-cortical structures that include at least the cuneate nucleus (Jörntell et al., 2014), the cerebellum (Blakemore et al., 1999), basal ganglia, and thalamic nuclei, resulting in the activation of a neo-cortical network that presumably includes posterior parietal cortex areas (Ehrsson et al., 2003; Miller et al., 2019; Van Boven et al., 2005), in addition to the primary somatosensory and motor areas (Kaas, 2012). The perceptual task, per se, is likely to involve a complex network, including visual (Ricciardi et al., 2007) and pre-frontal areas (Pleger et al., 2006).
The demonstration that people can accurately perceive the motion of a real ball inside a tube opens several avenues for future research on the haptic perception of container content. First, it is noteworthy that the ability did not require dedicated task training because participants were given no more than a very brief familiarisation period. Apparently, there is enough information in the stimulus to accomplish this uncommon task and participants came to it with all the necessary perceptual tools already in place. Still, people did possess a lifetime of experience with handling container objects, which raises questions about how people learn to obtain information from containers.
Second, behaviour is conducted in relation to a specific goal, which regularly requires adapting to changing circumstances by means of switching to different anatomical components and their related coordination patterns (e.g., switching hands) (Wagman & Hajnal, 2014). As a way of linking the various sensory components (receptors in the skin, joints, muscles, and connective tissues) with the body’s compression and tension elements (forces due to muscular contraction and from resistance to external loads), Turvey and Fonseca (2014) conceptualised the medium for the haptic perceptual system as a kind of tensegrity system. Such a system would be sensitive to invariants in the world, which, in turn, would support the perception of the objects from which those invariants originate. Empirical support for this comes from a study by Wagman and Hajnal (2014), who investigated the anatomical independence of the ability to perceive whether an inclined surface could be stood on. Across a series of experiments, participants explored an inclined surface with an object that could be held in various ways, including with the preferred or nonpreferred hand, with both hands, and with the feet. The results confirmed that perception was virtually unaffected by whatever configurations of anatomical components were used. For our ball-in-a-tube scenario, a testable implication is that participants should be able to accurately perceive the ball’s motion even if they were to wield a tube with another part of the body, say with their feet (Hajnal et al., 2007).
Third, the virtual ball display affords the investigation of how the perception of the ball changes if there are conflicting (or contradictory) cues. Specifically, what perception would manifest from a stimulus array in which the rolling and the impact components specify different balls? We could suspect several perceptual outcomes. First, we perceive two concurrent balls. Second, we perceive a single ball that is informed by one of the components and ignorant of the other. Third, we perceive a single ball that constitutes a compromise between the two components. Several of these outcomes can come to light depending on the characteristics of the discrepancies. Such work could be a complement to haptics research from the perspective of cue integration, which treats sensory information as a constellation of cues. For instance, when moving a finger over a bumpy surface, there are position cues from following the surface geometry and force cues from running into the slopes of the surface (Robles-de-la-Torre & Hayward, 2001). One of the central questions in cue integration pertains to the relative weighting of the available cues, which is typically addressed by injecting sensory noise into the cues so as to manipulate their reliability and impose conflicts between the cues (Drewing & Ernst, 2006; Ernst & Bülthoff, 2004).
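The relative weighting of cues is commonly modelled as reliability-weighted averaging, the maximum-likelihood account discussed by Ernst and Bülthoff (2004). The sketch below implements that standard model for independent Gaussian cues; it is not an analysis performed in this study, and the cue values and variances are hypothetical.

```python
def combine_cues(estimates, variances):
    """Reliability-weighted average of independent Gaussian cue estimates.

    Each cue is weighted by its inverse variance; the combined variance is
    the inverse of the summed reliabilities.
    """
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]
    combined = sum(w * e for w, e in zip(weights, estimates))
    return combined, 1.0 / total, weights


# Hypothetical conflict: the rolling cue specifies 40 cm (variance 4) and
# the impact cue specifies 30 cm (variance 16).
combined, variance, weights = combine_cues([40.0, 30.0], [4.0, 16.0])
print(combined, variance, weights)
```

In this model, the "compromise" outcome corresponds to intermediate weights, whereas a percept driven by a single component corresponds to one weight approaching 1. The combined variance is never larger than that of the most reliable cue.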
Finally, while the present results suggest that participants primarily use the movement velocity strategy, which does not necessarily require assumptions about gravity, they do not exclude the possibility that participants used knowledge about gravity. The possibility is reasonable given research suggesting that people have internal models of the laws of motion that they employ in perceiving and acting in the world (Lacquaniti et al., 2015). For instance, Ceccarelli et al. (2018) employed virtual reality to test the hypothesis that visual judgements of the naturalness of a ball rolling down an incline are consistent with Newtonian physics. Participants viewed a sphere rolling down an incline, while either the slope of the incline or the acceleration of the sphere was manipulated to be either consistent or inconsistent with terrestrial physics. The results demonstrated that participants were accurate at adjusting either the slope or acceleration so as to be consistent with Newtonian mechanics. Similarly, the virtual ball display allows for manipulations of terrestrial physics, specifically the gravitational constant, thus allowing for direct experimental tests of the role of gravity in the perceptual estimation of rolling ball motion and location.
Conclusion
This study is one of only very few container perception studies and shows that people can and do accurately track motion and position of a rolling ball simply from handling its container. This appears to be the first formal demonstration of the perception of the motion of a ball in a container and as such extends the human haptic system’s impressive repertoire of capabilities.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: All authors were supported by grants from the Natural Sciences and Engineering Research Council of Canada (NSERC). Vincent Hayward received additional funding from the Sorbonne Université, Paris.
Data availability statement
Materials and analysis code for this study are available by emailing the corresponding author.
