Abstract
This study investigates the influence of smartphone use on the embodied experiences of pedestrians in urban public spaces. Participants in this study engaged in leisure walks in a bustling urban environment. The study employed a multi-sensory, multi-modal data collection approach, which incorporated mobile eye tracking, screen capture, think-aloud, and participant data review. Findings revealed dynamic relationships between the urban pedestrians’ embodied experiences (“streets as experienced through the body”), digital content (“streets on the screen”), and spatial knowledge (“streets in the mind”). The study explores the “unfolding” practice between multiple versions of the surrounding environment and sheds light on the complex interplay of cognitive, experiential, and digital inputs in shaping pedestrian actions. Moreover, the study uncovers the paradoxical effects of smartphone usage, which introduces both serendipity and familiarity into the pedestrian's journey through public spaces. Further, the implications of this research highlight the need for mobile media studies to embrace “messy” and “noisy” data for a more comprehensive understanding of the interconnections between minds, bodies, places, and mobilities.
Walking in a public space is a fundamental way of place-sensing and place-making, especially in a walkable city environment: “The act of walking is to the urban system what the speech act is to language or to the statements uttered” (de Certeau, 1984, p. 27). Walking is essential for developing a sense of place (Adams, 2001; Wunderlich, 2008). Adams argued that walking affords the multi-sensory experience necessary for developing an organic sense of place, which he calls the peripatetic sense of place, that could not be developed through media (Adams, 2001). However, Laurier et al. (2016) noted that walking has historically been a mediated practice that involves various forms of media, such as maps and guidebooks. Due to the widespread use of mobile phones, digitally mediated walking practices have emerged over the past two decades. Initially centered around phone calls and text messaging, these practices have expanded over the last decade to include a broader range of mediated activities with smartphones and location-aware applications. These new practices are gradually shaping city dwellers’ sense of place in nuanced ways. Research has found that the proportion of pedestrians whose attention is fixated on a smartphone screen while walking in public spaces has increased in recent years (Argin et al., 2020a; Fernández et al., 2020).
Meanwhile, scholars have become aware that urban spaces have been transformed into hybrid spaces that consist of multiple layers of physical spaces, socially constructed spaces, social relationships, and digital data (Bentley et al., 2012; de Souza e Silva, 2006; Gordon & de Souza e Silva, 2011). In the mobile media literature, the interest in the relationship between mobile media, spaces, and places peaked in the past decade, with the emergence of extensive studies on mobile locative media. Researchers found that mobile locative media were associated with activity coordination (Frith, 2014; Humphreys, 2007; Licoppe, 2014; Xie, 2020), identity expression (Cramer et al., 2011; Evans, 2015; Rost et al., 2013), chance encounters (Licoppe, 2016; Licoppe et al., 2016), improving spatial knowledge (Bentley et al., 2012, 2015; Farrelly, 2012; Özkul, 2015), navigating through unfamiliar urban spaces (Kim & Lingel, 2016), and fostering a sense of place (Farrelly, 2014; Schwartz, 2015). A few concerns have also been raised about these smartphone-mediated socio-spatial practices. Some scholars have argued that algorithms embedded in mobile applications may reduce urban chance encounters and serendipity (Foth, 2016; Zuckerman, 2011). Others have maintained that user-generated content reinforces social biases about local communities (Frith, 2017; Zukin et al., 2017).
The above-mentioned opportunities and challenges that emerged from smartphone use may have a profound impact on users’ socio-spatial practices and their sense of place. Although existing research is abundant in reporting the general uses of smartphone apps in relation to place-making and place-sensing, more empirical research is needed to understand how these apps are used in situ.
Theoretical framework
Mediated social interactions in public spaces
Erving Goffman's (1963, 1971) works on social interactions in public spaces have been the basis of many research studies of mobile media use in public spaces. One specific research focus is related to the reconfiguration of co-presence in public spaces. Goffman perceived two types of individuals in public spaces, the “Withs” and the “Singles.” In Goffman's time, Withs and Singles were defined by the physical co-location of companions. In early ethnographic research on mobile phone use in public, Humphreys (2005) found that Withs can be transformed into Singles when one person in a co-located dyad answers the phone. Lasen (2006) found that Singles in public spaces who used their mobile phones did not simply disengage from their immediate surroundings. Instead, they and co-located others (strangers or companions) often collaboratively maintained interactions of various orders. On the other hand, researchers noticed that mobile media could create mediated Withs, thus turning Singles into Withs and creating multiple fronts of social interactions (Hampton et al., 2015; Hampton & Gupta, 2008; Humphreys & Hardeman, 2021).
In addition, Goffman (1971) differentiated between two things a person in a public space could be, a vehicular unit and a participation unit. Whereas much more attention has been paid to the individual-as-a-participation-unit in his writing (e.g., the Singles and the Withs), the less discussed individual-as-vehicular-unit is indeed crucial in mobile media and mobilities research (Jensen, 2006; Murtagh, 2001). An individual-as-a-vehicular-unit is a pedestrian who participates in the flow of street traffic. They do not simply walk. The pedestrian navigates through the public space by following the unspoken traffic “codes” and by constantly scanning their surroundings not only to avoid collision, but also to manage social encounters. Goffman also recognized that the pedestrian-as-a-vehicular-unit often partakes in other activities; that is, getting from point A to point B is seldom the sole purpose (Jensen, 2006). Jensen (2010) conceptualized that individuals traversing through urban spaces frequently slip in and out of “mobile withs” through “temporary congregations” with other co-present individuals. Further, he recognized that mobile media extends this notion of mobile withs to include remote others with whom the mobile user could interact while moving.
Smartphone apps, especially locative media apps, further complicate the meaning of co-presence in public spaces. Mobile media users are no longer interacting only with other acquainted users, but also with unacquainted users who share(d) the same space. Licoppe (2016, 2017) argued that mobile locative media could provide the affordance of chance encounters with nearby strangers on smartphone screens. He called these encounters pseudonymous strangers in contrast to the nameless strangers whom people could physically encounter in public spaces. As Goffman (1971) argued, a pedestrian in the public space constantly monitors their surroundings and performs various rituals to exhibit their indifference to strangers occupying the same space. Smartphone apps, especially those with locative features, can help the user to access the digital fronts of local places and form mobile withs with pseudonymous strangers on the screens. By placing the individual both “here” and “there” at the same time, this use of the smartphone may complicate the ways in which a pedestrian manages their encounters with others and local places.
Walking with smartphone as a multi-sensory, multi-modal experience
Walking in a public space is a multi-sensory, embodied experience. The pedestrian takes in information about their immediate surroundings through sight, hearing, and smell, among other senses. Farman (2012) argued that, with mobile media, our embodied engagement with the world takes place simultaneously through the sensory mode and the mode of inscription. Argin et al. (2020b) found that, when performing a flânerie-type, walking-wandering task in an urban public space, participants who used their smartphones exhibited different bodily rhythms and visual attention to their surroundings than their non-smartphone-user counterparts. By observing the use of a mobile map app by a pair of tourists, Laurier et al. (2016) found that the map app did not simply provide the users a “you-are-here dot.” The use and ongoing reconfiguration of the map app was reflexively tied to the tourists’ walking actions. Drawing on assemblage theory, Holton (2019) examined the mobility-technology assemblage that emerged in a city tour aided by a digital app. His findings revealed a complex web of relationships between the walkers’ mobilities, the mobile app, the environment, co-present others, and the embodied experience.
To understand the complex relationship between the pedestrian's embodied experience, digital information, the built environment, and social encounters in public spaces, this study draws on Christian Licoppe's concepts of the reflexive hybrid ecology and the unfolding practice afforded by mobile media (Licoppe, 2016; Licoppe & Figeac, 2018). These concepts stem from the notions of seams and hybrid ecology from the field of human–computer interaction (HCI). These two concepts are used to describe the way in which various sociotechnical infrastructures can create distinct domains of activities, as opposed to the notion of “seamless” integration of technologies in the lived experience. Licoppe (2016) argued that a core affordance of mobile locative media is that the “surroundings become simultaneously accessible through the embodied sensorium and through the mobile terminal, with both versions being somehow referred to an ego-centered ‘here and now’ ” (p. 101). A reflexive hybrid ecology, thus, is a sociotechnical environment in which “the digital experience and the live embodied experience of place are reflexively tied through proximity or location awareness” (p. 105). He further argued that, when using mobile locative media, the user's direct sense of the place and their digitally augmented sensing of proximal entities are often “unfolded” (similar to “unfolding” a paper map), in that the mismatches between the two articulated versions of the surroundings become salient to the user. This conceptualization offers methodological clarity for studying smartphone use in public spaces. It offers a feasible way to analyze the messy, interconnected nature of the complex sociotechnical system that is associated with mobile phone use in public spaces.
In Licoppe's theorization, unfolding is associated with the mismatch between two versions of the surrounding environment – one on the screen and the other through embodied experience. However, the everyday socio-spatial practices of urban pedestrians are often embedded in a complex web of sociotechnical systems, which may involve the street grid, signage, digital displays, co-located people, and so on, as well as the mobile user's past experience and existing knowledge of the public space. As such, the goal of this study is to explore the unfolding practice between multiple versions of the surrounding environment.
Informed by this theoretical framework, this study aims to answer the following question. How does the use of a smartphone transform the urban pedestrian as a vehicular unit, in relation to making sense of their surroundings, as well as managing nearby encounters?
Research design
A multi-sensory, multi-modal data collection approach
The theoretical framework detailed in the previous sections requires a research design that captures mobile media users’ embodied experience and situated actions. Therefore, an ethnomethodology-informed field study was conducted in which participants engaged in a flânerie-type leisure walk in a public space. (A detailed description of the research procedure is in the next section.) An ethnomethodology-informed approach allows for a moment-by-moment examination of how pedestrian/mobile users’ actions are integrated with elements in the surroundings and on the screen. Ethnomethodological studies of movements often involve closely examining video/audio recordings to understand how participants’ actions are shaped in the situation by pondering the “why that (action) here?” type of questions (Laurier et al., 2016).
This methodological approach requires the collection of detailed data on the participants’ sensory experiences, smartphone use, and social encounters. Therefore, I used a method similar to subjective evidence-based ethnography (SEBE; Lahlou, 2011; Lahlou et al., 2015), in which I combined mobile eye tracking (MET), screen capture, think-aloud, and participants’ own reflections. SEBE is a research method that involves two key steps. First, researchers capture video data from a first-person perspective using a wearable camera that is positioned at eye level. In this study, Pupil Labs mobile eye-trackers (Kassner et al., 2014) were used to capture the first-person-perspective videos and eye gazes. Wearable cameras and eye-trackers have been used in mobilities and mobile media studies that examined micro-level behaviors in various contexts (e.g., Argin et al., 2020b; Figeac & Chaulet, 2018; Guntarik et al., 2018; Heitmayer, 2021; Kiefer et al., 2014; Laurier et al., 2016; Licoppe & Figeac, 2018; Wilhoit & Kisselburgh, 2016). Licoppe and Figeac (2018) argued that video-recorded data are essential for making sense of situated technological practices. However, they warned that the video data obtained from the wearable cameras are only an approximation of what the participant actually sees. As Figure 1 shows, the participant wore the Pupil Labs eye-tracking headset (a) that was connected to a MacBook Air laptop (d) in a backpack, which they carried during the walk. The second key step in SEBE is a “replay interview” in which participants watch the recorded video and provide explanations. In this study, the replay interview was conducted immediately after the video data were collected. The combination of these two steps allows researchers to gather both objective data from the video footage and subjective insights from the participants.

Figure 1. A participant in this study walking in a shopping mall. The image of the eye-tracker (a) was taken from Pupil Labs’ official website.
Participants used their own smartphones in this study. The smartphone was connected to the research laptop through a USB cable (see Figure 1 (c)). During the study, all screen activities on the mobile phones were recorded through QuickTime (iPhones) or Android Debug Bridge (ADB) shell scripts (Android phones). At the time of the study, these methods of screen recording were the most unobtrusive and efficient, because they did not require the participants to install any third-party app on their phones. In a previous pilot study, I discovered that participants’ awareness of screen capture could influence their behavior and cause them to use their smartphones more frequently to “contribute” data for the study. To avoid this priming effect, I emphasized that the participants should use their smartphones as they normally would and reminded them that an artificial increase in their smartphone usage would not produce the data that I needed for my research.
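For readers who wish to set up a comparable Android capture, the following is a minimal sketch of how ADB-based screen recording could be scripted. It is not the script used in this study, which is not reproduced here. It assumes the standard `adb shell screenrecord` and `adb pull` commands; the file names, the segment count, and the helper `build_capture_commands` are hypothetical. Because `screenrecord` is documented as limiting each recording to 180 seconds, a recording that spans an entire walk would need to be captured as consecutive segments.

```python
from typing import List

MAX_SEGMENT_SECONDS = 180  # documented screenrecord limit per recording


def build_capture_commands(n_segments: int, prefix: str = "walk") -> List[List[str]]:
    """Return adb argument lists that record and then pull n_segments clips.

    The commands are only built here, not executed. Each list could be passed
    to subprocess.run(...) on a machine with adb installed and a phone attached.
    """
    commands = []
    for i in range(n_segments):
        remote = f"/sdcard/{prefix}_{i:03d}.mp4"  # hypothetical file name
        # Record one segment on the phone, up to the per-recording limit.
        commands.append(["adb", "shell", "screenrecord",
                         "--time-limit", str(MAX_SEGMENT_SECONDS), remote])
        # Copy the finished segment to the laptop.
        commands.append(["adb", "pull", remote, f"{prefix}_{i:03d}.mp4"])
    return commands


if __name__ == "__main__":
    for cmd in build_capture_commands(2):
        print(" ".join(cmd))
```

In this simplified form, each segment is pulled before the next recording starts, which would leave short gaps between segments; a real setup would need to handle that, for example by deferring the transfers until the walk is over.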
I used the “think-aloud” method (Ericsson & Simon, 1993) to elicit the participants’ thought processes. Participants wore a lavalier microphone that was connected to the research laptop (Figure 1 (b)); they were instructed to say their thoughts aloud while walking. The microphone captured not only the participants’ in-situ reflections on their actions, but also the external audio from the environment. Lastly, throughout the study process, I shadowed each participant as they walked around in the public space. The main purpose was to assist the participants when necessary and to reduce their anxiety about walking in a public space as “singles” with research equipment. This shadowing protocol was also an important data source, because it allowed me to gain first-hand, immersed experience of the participants’ mobilities (Lingel, 2013).
Participants and research procedure
This study was a part of a research project that examined city residents’ use of digital media and place-making practices in the city of Philadelphia. All participants were recruited through door-to-door visits in selected neighborhoods. Neighborhoods of various racial compositions and newcomer/long-term-resident ratios were selected to diversify the pool of participants. A subset of the participants completed the field study portion of the research. The findings in this report are based on data collected from 17 research sessions. The participants in this study came from different socioeconomic backgrounds and lived in neighborhoods of varying racial and cultural characteristics. As Table 1 indicates, the demographic characteristics of these participants were diverse.
Table 1. List of participants, ordered by date of the study sessions.
Many previous field studies of the spatial behaviors of pedestrians have preferred experimental methods that rely on participants who do not have familiarity with the research site (e.g., Bertel et al., 2017; Kiefer et al., 2014; Münzer et al., 2006; Willis et al., 2009). Even in studies in which researchers conducted naturalistic observations (e.g., Holton, 2019; Laurier et al., 2016), researchers chose to study tourists, because tourists were more likely to use mobile media to explore their surroundings. However, research has shown that participants who are unfamiliar with the research location tend to look more at their surroundings and use their smartphones more for exploration (Argin et al., 2020b). In other words, using participants who are unfamiliar with the research location may produce data that reflect the perspective of a tourist or outsider. In contrast, the current study involved local residents, whose actions were driven by their everyday interests, needs, and socio-spatial practices as actual residents of the city.
Each participant met with the researcher near Rittenhouse Square in downtown Philadelphia. I selected this space as the research site, because it is known to be a pedestrian-friendly area that is often filled with diverse people, activities, and places of interest. Figure 2 shows a map of the core part of the research site, as well as selected street views taken from the video data. At the center is the Rittenhouse Square Park, often described by locals as the place for “people-watching.” To the north of the park is a busy, commercial area with tall buildings, big-brand stores, and a high volume of pedestrian traffic. To the south of the park is a quiet, residential area with small stores and restaurants.

Figure 2. A map of the research site, with screenshots from the video data, showing the views and traffic volume of different areas. Map tiles by Stamen Design (maps.stamen.com), licensed under CC-BY 3.0. Data by OpenStreetMap, licensed under CC-BY-SA.
Each participant was then given a hypothetical scenario in which they had moved to a nearby neighborhood: “On a Sunday afternoon, you have some time to kill and decide to check out the surroundings.” They were instructed to freely explore the surroundings in any way they wished and to terminate the walk at their own discretion. This instruction was designed to give the participant an exploratory walking experience in a public space. This task was supplemented by two purposive walking tasks. All participants were instructed to visit two places before they completed their leisure walk: a T-Mobile store and a Chinese restaurant. Participants were instructed to act as if they had to run some errands on the same day. The names and addresses of the two places were sent via text messages to the participants’ mobile phones. The locations of these two places are shown in Figure 2. The two places were selected for this study for two reasons. First, they are generic places of which the participants, despite being locals, were unaware prior to this study. Second, their locations guided the participants to traverse varying city spaces (both the busy/commercial areas and the quiet/residential areas). After the walk was completed, I sat down with the participant in a nearby coffee shop to review the video data together. The participant provided detailed comments about their actions observed in the video.
Data analysis
For each research session, eye-gaze visualization, first-person-perspective video, and audio recording were analyzed in Pupil Labs’ Pupil Player computer program. Smartphone use was analyzed by matching each on-screen activity to the corresponding video footage. In addition, the replay interview recording was transcribed and matched to the eye-tracking video data and the smartphone screen recordings. Data analysis in this study used an ethnomethodological approach and focused on the sequential relationships between a participant's in-situ embodied actions, sensory inputs, cognitive reflections, and mobile media use.
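The matching of on-screen activities to video footage described above can be illustrated with a small sketch. It assumes, hypothetically, that each on-screen activity has been coded with start and end times on the screen recording's clock, that the offset between the screen recording and the eye-tracking video is known, and that gaze samples carry sorted timestamps on the video clock. The names `ScreenEvent` and `align_events` are illustrative only; they are not part of Pupil Player or of the workflow actually used in this study.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ScreenEvent:
    label: str    # coded on-screen activity, e.g. "open map app"
    start: float  # seconds on the screen-recording clock
    end: float


def align_events(events: List[ScreenEvent], screen_offset: float,
                 gaze_times: List[float]) -> List[Tuple[str, float, float, int]]:
    """Shift each event onto the video clock and count gaze samples inside it.

    screen_offset is the video time (in seconds) at which the screen
    recording started; gaze_times must be sorted in ascending order.
    """
    aligned = []
    for ev in events:
        start, end = ev.start + screen_offset, ev.end + screen_offset
        first = bisect_left(gaze_times, start)  # first sample at or after start
        last = bisect_right(gaze_times, end)    # one past the last sample at or before end
        aligned.append((ev.label, start, end, last - first))
    return aligned


if __name__ == "__main__":
    gaze = [i * 0.5 for i in range(20)]  # made-up samples every 0.5 s
    events = [ScreenEvent("open map app", 1.0, 3.0)]
    print(align_events(events, screen_offset=2.0, gaze_times=gaze))
```

Once events sit on the video clock, the corresponding footage, gaze samples, and transcript segments can be pulled up together for the kind of sequential, moment-by-moment analysis the ethnomethodological approach calls for.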
Data validity
Initially, concerns emerged about two factors that could possibly compromise the quality of the data:
1. The participants might rush to complete the study by visiting the two required destinations as quickly as possible (completing the purposive walking with no regard to the exploratory walking instruction).
2. The eye-tracking device might cause the participants to act unnaturally in public.
However, the results were encouraging. The shortest walk took approximately 28 minutes, and the longest approximately 115 minutes. In contrast, if a participant wished to end the study as quickly as possible, it would take only 10 to 15 minutes to visit both places (approximately 900 meters of walking distance in total). As can be seen from the aggregated walking paths shown in Figure 3, many participants wandered quite far from where they began.
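As a rough check on the 10-to-15-minute estimate, the sketch below converts the approximately 900-meter route into walking time. The speeds of 1.0 and 1.5 meters per second are assumed values for a slow and a brisk walking pace; they are not reported in this study.

```python
def walking_minutes(distance_m: float, speed_m_per_s: float) -> float:
    """Time in minutes needed to cover distance_m at a constant speed."""
    return distance_m / speed_m_per_s / 60.0


ROUTE_M = 900.0         # approximate distance reported for visiting both places
BRISK, SLOW = 1.5, 1.0  # assumed walking speeds in m/s (not from the study)

if __name__ == "__main__":
    print(walking_minutes(ROUTE_M, BRISK))  # 10.0
    print(walking_minutes(ROUTE_M, SLOW))   # 15.0
```

Under these assumptions, even the shortest observed walk (approximately 28 minutes) lasted about two to three times as long as a direct errand run would have taken.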

Figure 3. Aggregated paths from all research sessions.
The research equipment did not appear to cause the participants to feel awkward or behave unnaturally. Each participant was asked how they felt about wearing the eye-tracking headset in public. All participants maintained that they no longer minded it after walking for only a few minutes. These claims were supported by what many participants did during the walks. They felt comfortable entering shops and restaurants, making purchases, or talking with people on the street and in the stores. During all research sessions, there were only three instances in which other people (one pedestrian and two store employees) took notice of and briefly inquired about the eye-tracking device.
It should be noted that some participants reported that the awareness that their smartphone screen was being recorded affected (but not to a great extent) their use of interpersonal communication on the phone, such as reading/sending text messages and checking social media. However, all participants said that other types of smartphone uses were consistent with what they would do in their everyday lives.
Findings
Overview
During this study, most participants used their mobile phones for a variety of activities. Table 2 provides a summary of the activities that the participants performed on their phones. This table excludes the use of the mobile phone for checking the addresses of the two destinations sent to the participants through a text message. Most participants used mobile maps (Google Maps or Apple Maps) for the purposive walking tasks. Fourteen participants used map apps at least once. More than half of the participants used their phones for other purposes, such as exploring local places on Yelp, taking pictures, searching on Google, and taking notes.
Table 2. Participants’ use of their mobile phone during the walk.
Streets in the eyes, streets on the screen, and streets in the mind
The purposive walking component revealed dynamic relationships between the visual input (“streets in the eyes”), the digital content (“streets on the screen”), and participants’ existing spatial knowledge (“streets in the mind”), among other situated factors.
Four participants (Leah, Matt, Chris, and Todd) used the navigation feature in the map apps to guide them to at least one of the two destinations. Using the turn-by-turn navigation app demanded that the participants monitor and compare the information on their phones, the immediate surroundings, and their existing knowledge of the area. Leah frequently checked her phone to make sure that she did not deviate from the instructed path. For example, when she was walking toward the park, her visual attention was initially on the buildings and store fronts (Figure 4, panes 1 & 2). Upon hearing the voice instruction given by the navigation app, she held up her phone and began to compare what she saw as the park (3, visual attention on the trees in the park) and what she saw on the screen (4). She said: “OK. Looks like I’m going to walk straight through … Rittenhouse (Square).” She then followed the navigation app's instruction and walked through the park. Throughout the navigation-guided part of the walk, she had to frequently perform this type of unfolding to compare what she was seeing, what the app was instructing her to do, and her knowledge of the area.

Figure 4. Leah approaching the park, following the instructions from the navigation app. Blue circles highlight where the eye gazes (red crosses) concentrate, for visual clarity.
Participants not only “unfolded” their mobile apps to compare digital information with the immediate surroundings, but also frequently “unfolded” their survey knowledge of the area. This occasionally caused frustration for some participants. Several times, Chris had to stop at various intersections to manage the gaps between the instructions given by the navigation app, the surroundings, and his memory. Figure 5 shows one instance in which he was trying, in frustration, to figure out the right direction at an intersection. When asked about this during the replay interview, Chris explained: “I was confused about which way to go, because the GPS was telling me to turn, but I knew, just from experience, that I could go straight. So, I just took that chance and took Google Maps’ way and kept that turn. . . . ’cause I’m like, it might save me a minute. It might be the best route for me, even though I had my own route.”

Figure 5. Chris getting confused at an intersection. The arrow indicates north. The pedestrian icon indicates moving (walking). Eye gazes highlighted with blue circles. Some gaze points are out of bounds (e.g., when he looks down at the phone).
The majority of the participants used map apps only briefly to check the locations of the destinations. Four participants used no map app at all. This was because the city's easy-to-navigate street grid made wayfinding easier for local residents who understood the rules of the street numbers. Nevertheless, street addresses alone did not give the participants sufficient information to pinpoint the two locations accurately. The map app became useful in these cases. Figure 6 shows a series of on-screen actions performed by Ashley in an attempt to locate the Chinese restaurant. Moving continuously through the park, she looked down at the text messages and said: “So the first is … Szechuan Hunan Chinese Restaurant. I don’t know where that one is. I don’t know where (No.) 271 is on 20th Street. So I’m gonna go ahead and Google the restaurant.” She opened the Google app on her phone and typed in “schezchuan hunan.” The app gave her a few autocomplete suggestions. She tapped on “Szechuan hunan restaurant phildelphia pa,” which showed her the information about the restaurant. She clicked on the map to bring up a full-screen view of the map. She then zoomed in on the restaurant. The screen showed the restaurant's location as well as her current location (blue dot): “Oh! It's on Spruce. I kind of went the wrong way. I always walk this way because I live in this direction. So I guess I just started going this way anyway,” she said when gesturing in the direction in which she was walking. She then put her phone away and continued her walk to find the restaurant. Note that figuring out the route to the restaurant was not the result of seeing the blue dot on the screen, but it came after the realization that the restaurant was at the intersection of Spruce Street and 20th Street (“Oh! It's on Spruce”).

Figure 6. Ashley trying to locate the Chinese restaurant while walking in the park.
Although most participants planned the routes in their heads rather than in the map app, the quick glances at the map interface may have had an effect on their paths. When using mobile mapping apps, the users were exposed to information that allowed them to obtain a quick preview of the local places they might encounter. For example, when Chad searched for the Chinese restaurant in Apple Maps, he noticed the icons displayed on the map that represented businesses along South 20th Street (Figure 7). Glancing at the map had an effect on his later behaviors, as he explained during the data review: “I just quickly glanced down a list of places I was going to pass, saw all the juiceries, and I guess the Food & Friends. That's what got to me. It definitely made me aware of what was along the road. And I guess, have them being listed vertically here and the way the street was – I kind of targeted my approach to walk down the street that way instead of coming down from the side, where there's really nothing on the screen there.”

Figure 7. Screenshot of Chad checking out the location of the Chinese restaurant.
The mobile store front: various ways to interact with local places
In the instances in which the participants used their phones for exploratory purposes, they often used smartphone apps to break through physical barriers in order to interact with local places. As Goffman (1963, 1971) noted, communication boundaries in public spaces are maintained by various physical and social barriers. Physical barriers, such as walls and closed doors, often prevent access. Glass windows allow partial access. Although many places in the public space are accessible, a pedestrian does not randomly enter every space that is open to them. For example, during this study, multiple participants entered local shops and restaurants. However, they were more likely to enter large shops or grocery stores simply to browse goods. And when they entered small shops where they had to interact with the employees, there were often clear purposes for doing so. For example, when Lindsey noticed a small apothecary, she only looked into the store through the glass, but only a few moments later she walked into a nearby grocery store and browsed the goods (Figure 8). Similarly, Natalie looked into a bakery through the glass, but did not enter the store. Later, she entered an optical shop to inquire about prescription glasses and entered a pizza shop and a coffee shop to inquire about and purchase foods. In these instances, she had conversations with the store employees, while sustaining eye contact with them. This shows that public places differ in their levels of “openness.” As such, shops and restaurants often have window displays or menus posted at the entrance to allow for limited engagement.

Figure 8. Two participants engaging with various local stores. Eye-gaze areas highlighted with blue circles for clarity.
Smartphones allow alternative ways to access local places. Two types of smartphone uses for exploratory purposes were noted. The first was to use the smartphone to explore immediate places without physically accessing them. In one instance, Ral, a young student, walked past a bakery/restaurant called Spread Bagelry (Figure 9, 1). Approximately 16 seconds later, he paused his walking and began to search for “spread bagelry” in Google Search. He not only read the reviews in the “critic reviews” section of the result page but also clicked on the link that opened the menu on the bakery's website. When reviewing the data, he explained his actions: “If I walked by something, I might check the ratings and reviews and make a mental note. There are a lot of food and restaurant places in the city. I’ll make a note if I’m interested.” During his 50-minute walk, Ral checked the information on two other restaurants after he walked past them.

Ral looking up Spread Bagelry on his phone.
This was a common practice performed by many participants. When walking across an intersection, Dan made some sniffing noises with his nose, saying: “Something smells really nice!” He then noticed a restaurant on the street corner (Figure 10, 1). After crossing the street, he took out his phone and opened the Yelp app to look up information about this restaurant (2). He kept reading detailed information about the restaurant while walking the entire block, approximately 150 meters in distance (3–6). During this time, he only occasionally looked up from his phone to avoid collision with people in the street (4 & 6). In this instance, mobile window shopping was triggered not by visual cues, but by olfactory cues in the surroundings. The instance also shows how mobile window shopping could take place while the individual was constantly moving, managing his interactions both with people around him and the store front on the screen. In Goffman's terminology, the individual simultaneously acted as a mobile, vehicular unit through the physical space and a part of a participation unit on the screen.

Dan looking up information about a restaurant without stopping. Eye-gaze areas highlighted with blue circles.
In addition to using smartphones to access nearby places, some participants used their smartphones to explore remote places and events. For example, when Ral walked past a 7-Eleven convenience store, he glanced at the store (Figure 11, 1) but kept walking (2). Although he did not stop walking, he quickly pulled out his phone and searched “wawa near me” in Google (3). The app showed him a map and a list of three Wawa convenience stores. He scrolled down to check all three locations. He later explained this sequence of actions: “So, I have passed a 7-Eleven. I’m not a huge fan of 7-Eleven. So, I was just curious to see if there's any Wawas nearby—for reasons I grew up back home with a Wawa.”

Ral searching for Wawa. Eye gazes are highlighted with blue circles.
In a similar case, when Lindsey was walking in a busy, commercial area, she took out her phone and searched on Google for “running shoes.” The app showed several images of Nike running shoes and some ads. Dissatisfied with the result, she added “rittenhouse square” to the search keywords. This time, her phone showed a map and a list of relevant shops nearby, which included “Philadelphia Runner,” “New Balance,” and “Lululemon.” She tapped on “Philadelphia Runner” and carefully examined the description of the store while continuing to walk. Later, she explained her behavior in the interview: “Because I’ve seen Modell's (a sporting-goods store) at the corner of my eye. So, I wanted to see where there was a legitimate runner store . . . . Really, I was trying to remember the name of the store. I knew specifically there was a runner store, like a technical runner store. But I don’t remember the name of the store. I remember the general vicinity of the store.”
Figure 12 shows this sequence of actions. Lindsey saw the sporting-goods store (1), walked for three seconds (2), and then decided to search for the running shoes store while walking (3).

Lindsey searching for running shoes stores. Eye gazes are highlighted using blue circles.
As these instances suggest, using the smartphone to access nearby or remote store fronts was not simply triggered by environmental cues. The individual's personal interest and experience played important roles. This is evident in an instance that is illustrated in Figure 13. Chad, a medical student, first walked past “Yogorino,” a frozen yogurt shop (1). Seeing that the shop was not open, he kept on walking, during which he visited both the Chinese restaurant and T-Mobile (2). Sixteen minutes later, standing next to T-Mobile, he noticed “Sweet Café” across the street (3a). He said: “It says ‘Sweet Café,’ which might have frozen yogurt.” He took out his phone and searched “frozen yogurt” in Yelp (3b). Both Sweet Café and Yogorino appeared on the screen. He read a few reviews of both places while saying: “Yogorino … Oh! It's got pretty good reviews … Looks pretty good … ‘Hands down the best frozen yogurt I’ve ever had.’ I plan to head back there later.” Later during the data review session, he reflected on this sequence of actions: “I had seen the other frozen yogurt place (Yogorino) earlier and I was going to compare the two and see if there were any other places around that might be different. And the Sweet Café, the aesthetic of it was kind of generic and didn't seem to be like something I would be interested in. It was more like a commercial frozen yogurt, not really pertinent to the neighborhood. Whereas Yogorino was more unique, artisanal, which appealed to me. So, I guess I just wanted to see what else there was around before committing to just walk into a random place.”

Chad looking for frozen yogurt.
Although Sweet Café was a serendipitous discovery that pertained to Chad's interest, using the smartphone to compare it to a remote place that he had previously encountered reduced the possibility of his engagement with a “generic” dessert shop, which—to him—did not fit into the ideal image of the Rittenhouse Square neighborhood. In the previous three cases, the use of a smartphone was paradoxically connected to a city resident embracing one serendipitous discovery but rejecting another.
Discussion and conclusion
Drawing on Goffman (1963, 1971) and existing theories of mobile locative media (Farman, 2012; Holton, 2019; Licoppe, 2016), this study examined the various ways smartphone apps are associated with reconfiguring the space–time relationships between a pedestrian and their surroundings. Findings from this study show that the in-situ actions of participants were shaped by a complex, contingent, and intertwined set of cognitive, habitual, experiential, environmental, and digital inputs. Various uses of smartphone apps could enable the individual to traverse the urban space while simultaneously engaging with local places. This was achieved through various “unfolding” practices (Licoppe, 2016). Further, findings from this study suggest that individuals do not simply unfold their smartphone apps to compare two different versions of the “here and now.” Rather, they need to manage multiple versions of the “here and now” that are shaped by various sensory inputs, memories/lived experiences, spatial knowledge, urban design, habits, personal interests, and preferences, as well as digital information. The individual, therefore, often needs to make sense of the discrepancies between these different versions of the surroundings. The study also uncovered the paradoxical effect of smartphone use, wherein it simultaneously introduced elements of serendipity and uncertainty as well as familiarity and certainty into a pedestrian's journey through public spaces. These findings extend Licoppe's (2016) observation that digital mobilities reshaped the way in which city pedestrians managed engagement, disengagement, and detachment (“the right to preserve one's tranquility”, p. 114) in urban public spaces.
In the past decade, mainstream mobile media research has largely abandoned the “zero-sum” view of mobile-phone-mediated interactions and actions in public spaces. Instead, the present paradigm recognizes the new opportunities that smartphones have introduced to existing actions and interactions pertaining to local places and spaces (Campbell, 2019). This study contributes to this research program by providing a comprehensive analysis of smartphone uses within an intricate setting, closely mirroring the real-life environments urban dwellers encounter in their daily lives. The insights gleaned from this investigation suggest that mobile media research may stand to gain from looking beyond the interplay between on-screen and off-screen elements. Moores (2017) posited that media uses represent merely a single facet of place-making practices, coexisting among a multitude of others. Consequently, he championed a non-media-centric approach to media studies, encompassing a holistic perspective on the interconnections between media and space. Similarly, although it may appear counterintuitive to decenter mobile media within mobile media research, this approach could potentially unveil a more intricate interplay between minds, bodies, places, and mobilities.
A related methodological implication of this study is that mobile media research could benefit from embracing “messy” and “noisy” data generated in less controlled field studies. In the current study, many “noises” were permitted into the data by giving the participants freedom to determine how or whether they should use their phones. However, this research design yielded data of high ecological validity. This study also captured multi-sensory inputs ranging from eye gazes and audio to think-aloud data and participants’ own reflections on their actions. Many ethnographic studies in the mobile media literature have been influenced by phenomenology and ethnomethodology, in which the embodied, multi-sensory experience is emphasized. However, video data collected using body cameras or eye-glass cameras (which are often used in these studies) may not offer as accurate insights into the participants’ perspective as eye-tracking devices would. Although far from perfect, the think-aloud method provided this study with valuable data regarding sensory inputs beyond sight and sound. The amount of information and noise embedded in the dataset presented tremendous challenges for both data collection and data analysis, but it also yielded exciting findings.
Footnotes
Declaration of conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
Author biography
Weixu Lu is an assistant professor in the Department of Communication Studies at the University of Wisconsin–La Crosse. His research interest is digital media use and social networks.
