Abstract
Research on police stop and search has traditionally focused on who is targeted, with less attention to how encounters are conducted and experienced. Drawing on procedural justice theory, this study examines the quality of police behaviour during stop and search encounters and the extent to which such behaviour varies across contexts and audiences. Using systematic analysis of 140 stop and searches captured on police body-worn video in England, the study makes two key contributions. First, it compares levels of procedural justice – as assessed by police raters – across high-crime and low-crime areas. Second, it offers a rare direct comparison of how police supervisors and members of the public evaluate the same encounters. The findings show that officers in low-crime areas exhibit significantly higher levels of procedural justice, particularly in relation to neutrality, respect, and trustworthy motives, while no significant difference is observed for voice. In addition, police supervisors rate encounters more critically than members of the public, especially on voice and trustworthy motives. This suggests that different groups apply distinct evaluative frameworks when judging police behaviour. However, both groups produced similar ratings on respect and neutrality, indicating agreement on these dimensions. These findings reveal procedural justice as a context-dependent and strategically enacted feature of police–citizen encounters.
Introduction
The study examines the extent to which police officers adhere to procedural justice principles during stop and search encounters. While reforms have strengthened regulatory frameworks and begun to address disproportionality in police use of stop and search (Home Office, 2021), far less attention has been paid to the quality of these encounters and to the behaviours officers display when interacting with the public (Sherman, 2013). This gap is consequential: research consistently shows that the character of police–citizen interactions is a principal mechanism through which trust and legitimacy are built (Tyler, 2006). Stop and search interactions, therefore, represent not only moments of coercive authority (Epp et al., 2014) but also important sites in which legitimacy is enacted, negotiated, and potentially undermined.
Procedural justice provides a framework for understanding these dynamics. The theory posits that people are more likely to view the police as legitimate when they are treated with dignity and respect, when decisions are based on facts rather than prejudice, when police demonstrate trustworthy motives, and when individuals are offered a meaningful opportunity to express their views (Sunshine and Tyler, 2003; Tyler, 2006). Previous research shows that police procedural justice influences trust in the police, willingness to comply with the law, and willingness to cooperate with the police (Bradford, 2017; Tyler et al., 2014). Research on procedural justice is moving beyond attitudinal surveys to examine how these principles are enacted in real-world encounters, through systematic social observations (Jonathan-Zamir et al., 2015; McCluskey et al., 2019; Worden and McLean, 2017) and the analysis of body-worn video (BWV) footage (see Barber and Tankebe, 2024; Nawaz and Tankebe, 2018; Voigt et al., 2017). These approaches have provided important insights into the behavioural aspects of procedural justice. However, important gaps remain.
First, there is a limited understanding of how procedural justice varies across different policing contexts. Encounters do not occur in a vacuum: officers operate under varying levels of risk, time pressure, and situational constraints. It remains unclear whether and how these contextual factors shape the ability or willingness of officers to enact procedural justice. In particular, high-crime environments, often characterised by heightened threat, frequent exposure to disorder, and greater demands for control, may systematically alter the nature of police–citizen interactions. Yet empirical evidence on variation in procedural justice remains scarce (see Barber and Tankebe, 2024; McCluskey et al., 2019; Worden and McLean, 2017).
Second, existing research has paid insufficient attention to the relationship between observed police behaviour and its interpretation by different audiences. While studies have examined public perceptions (see McCluskey et al., 2019; Worden and McLean, 2017) and, separately, officer conduct, far fewer have directly compared how the same encounters are evaluated by different groups (Waddington et al., 2015), including comparisons of the perceptions of the same encounters by police officers versus the public (Bates et al., 2015). This is an important omission because police and citizens may apply distinct evaluative frameworks: officers may prioritise legality, control, and safety while members of the public may place greater weight on fairness, respect, and voice. Divergences between these frameworks may generate systematic misunderstandings with implications for trust and legitimacy (Bottoms and Tankebe, 2012; Jonathan-Zamir and Harpaz, 2014).
This study addresses these gaps using data from 140 stop and search encounters recorded on police body-worn cameras in England. It makes two primary contributions. First, it compares the level of procedural justice exhibited by officers across high-crime and low-crime contexts. This provides new evidence on the situational contingencies of procedurally just policing. Second, it offers a rare direct comparison of how police supervisors and members of the public evaluate the same encounters. This enables an empirical assessment of whether and how these groups differ in their judgements of police behaviour. By combining systematic virtual observation with comparative evaluation, the study advances understanding of procedural justice as both a behavioural practice and a socially interpreted phenomenon. In doing so, we shed new light on the conditions under which procedurally just policing is more or less likely to occur, and on the extent to which police and public expectations of “fair” policing align or diverge.
Procedural justice theory
Procedural justice theory provides a foundational framework for understanding how people evaluate the exercise of authority. At its core, the theory argues that individuals care deeply about the quality of their interactions with police officers. While they care about the outcome of these interactions, they pay particular attention to matters that shape their views of police legitimacy. These include whether officers treat them with dignity and respect, make unbiased and transparent decisions, demonstrate trustworthy motives, and allow individuals meaningful opportunities to express their views (Tyler, 2006; Tyler and Meares, 2019). These evaluations influence feelings of obligation to obey, which, in turn, shape legal behaviours such as reporting crimes to the police and complying with the law (Sunshine and Tyler, 2003; Tyler, 1998). Procedural justice is a multidimensional judgement that comprises four interconnected elements: respect, trustworthy motives, neutrality, and voice (Tyler, 2017).
The voice element concerns people’s participation in police decision-making. People expect police officers to listen to what they have to say, even if this does not sway the outcome of the interaction. By providing opportunities for people to tell their side of the story, police officers do more than merely listen. They convey symbolic messages about group membership and recognition of the individual. Hence, there is evidence that people expect active rather than passive listening (Waddington et al., 2015). Positive interactions could include an officer asking someone to explain events before searching them, asking if they have any questions, or making requests rather than ordering or instructing them to do something (Jonathan-Zamir et al., 2015).
Treating people with respect is another dimension of procedural justice. Research consistently finds that the public reacts negatively when officers are dismissive or demeaning (Murphy and Barkworth, 2014). Negative indicators include using a loud voice, interrupting, and making critical or condescending comments (Jonathan-Zamir et al., 2015). By contrast, treating people respectfully conveys a sense of value and social worth (Tyler, 2004), which is especially important when using a power as intrusive as stop–search.
Trustworthy motives concern police care for the wellbeing of citizens. According to Tyler and Meares (2019: 74–75), people want to believe that the police are ‘sincere and benevolent, focused on the needs and concerns of the public, and willing to acknowledge and address people’s concerns’. Examples include being patient and comforting during a stressful encounter, such as a stop and search. Trustworthy motives also concern how police justify their power to civilians. For example, suppose the police increase stop and search in an area due to rising knife crimes. In that case, officers should be forthright, explaining the reasons for the increased enforcement and emphasising that removing knives from the streets is in people’s best interests.
By acting procedurally justly, police officers convey a sense of belonging and self-worth to individuals (Tyler, 2006), fostering values that motivate people to cooperate and obey the law. Conversely, unjust interactions may reinforce distrust or generate feelings of exclusion and resentment which may delegitimise the police (Tyler, 2011).
A growing number of procedural justice studies have moved beyond survey-based data to systematic observations. Thus, Jonathan-Zamir et al. (2015) assigned trained observers to patrol with officers to observe their interactions with the public. Between June and December 2011, four observers watched over 200 police–public interactions in a small American city. They then calculated a procedural justice score for each encounter. While not all encounters involved searches, the findings supported previous research, showing that procedural justice increases satisfaction and cooperation with the police.
While direct observation provides a more reliable measure of officer behaviour, the Hawthorne effect (the observer’s presence causing individuals to adjust their behaviour) is a significant limitation (Adair, 1984). Instead, more recent research has used officer BWV to observe stop–search encounters virtually. Nawaz and Tankebe (2018) tracked the four critical dimensions of procedural justice across 100 searches recorded on BWV in Greater Manchester (UK). While most officers scored highly on dignity and respect, they failed to display trustworthy motives. The value of this study lies in the fact that, for the first time, it enabled researchers to accurately measure procedural justice elements without being physically present and potentially influencing the interaction.
In an innovative contribution to the literature, Waddington et al. (2015) showed the same video of a real-life encounter between officers and a suspected car thief to focus groups. Some groups rated the officers positively, while others found them insulting and demeaning. Similarly, Worden and McLean (2017) observed officers interacting with the public and measured their adherence to procedural justice. When they compared the scores with the public’s perception of their treatment (measured using a survey), they found only a weak relationship between how people felt they were treated and how officers behaved. This lack of alignment is problematic for the police, as it seems there is no simple recipe for developing goodwill and legitimacy within all communities.
In addition to individual differences, social groups and population segments may also hold contradictory views. These differences are especially relevant to police culture, as studies have shown that officers often perceive themselves as a distinct and powerful social group, which can influence how they interpret events and interact with other groups (Herbert, 2006). For example, Meares et al. (2015) contrast the ways in which the police and the public interpret procedural justice. They showed participants real-life videos of officers interacting with the public. The public judged the officers based on how fair and respectful they were. By contrast, police officers tend to be more concerned with legal issues, such as the grounds for the search and whether the force used was proportionate (Worden and McLean, 2014).
Method
The data for the study were stop–search encounters captured on police BWV. Body-worn cameras are small devices worn on the officer’s chest that capture high-quality video and audio. Although not intended for research, they allow researchers to observe officers in a natural setting. Studies have used BWV to examine issues such as police use of force (Ariel et al., 2015) and procedural justice of stop and search interactions (Barber and Tankebe, 2024). The data for the present study come from the Dorset Police Constabulary in the United Kingdom. Dorset introduced cameras in 2016, with a policy that requires officers to record all encounters where they exercise policing powers. Therefore, as soon as an officer decides to search an individual, they must begin filming. This could be when they first detain a person or partway through an encounter. The footage is automatically uploaded to the Digital Evidence Management System (DEMS) cloud database. Supervisors regularly check that officers are using their cameras.
The Force data-sharing agreement only permitted the release of footage relating to the search, not footage from before or after. For example, it was permissible to show the officer explaining the grounds for the search, but not the subsequent events if the officer arrested the person. Therefore, the video clips were edited from the moment the officer informed the person they were being detained until they were released. This is a potential limitation, as the end of the search may not be the end of the encounter, and consequently, assessors may miss elements of procedural justice.
The sample was limited to stop–searches carried out in Dorset from 1 January 2021 to 31 December 2021. This gave a total of 848 stop–search video clips, comprising 227 from high-crime areas and 621 from low-crime areas. To check reliability, assessors watched two searches per officer. Therefore, the sample was filtered to retain only officers who had conducted two or more searches. This resulted in 104 officers from the low-crime area and 41 from the high-crime area. Next, 35 officers were randomly selected from each area. Two searches per officer were then randomly selected to obtain a final sample of 140 videos: 70 from high-crime areas and 70 from low-crime areas. To avoid contamination, both lists were checked to ensure that no officers appeared in both groups.
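The two-stage selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `draw_sample`, the officer and clip identifiers, the number of clips per officer, and the random seeds are all hypothetical, and the study does not report which software was used for the random selection.

```python
import random

def draw_sample(searches_by_officer, n_officers=35, n_searches=2, seed=0):
    """Two-stage random selection within one crime area.

    searches_by_officer maps a (hypothetical) officer ID to that
    officer's list of clip IDs.
    """
    rng = random.Random(seed)
    # Stage 0: keep only officers with at least n_searches recorded searches.
    eligible = sorted(
        officer for officer, clips in searches_by_officer.items()
        if len(clips) >= n_searches
    )
    # Stage 1: randomly select officers.
    officers = rng.sample(eligible, n_officers)
    # Stage 2: randomly select searches for each selected officer.
    return {o: rng.sample(searches_by_officer[o], n_searches) for o in officers}

# Synthetic stand-in data: 104 low-crime and 41 high-crime officers,
# each with an invented number of clips (at least two).
low = {f"L{i:03d}": [f"L{i:03d}-clip{j}" for j in range(2 + i % 4)] for i in range(104)}
high = {f"H{i:03d}": [f"H{i:03d}-clip{j}" for j in range(2 + i % 3)] for i in range(41)}

low_sample = draw_sample(low, seed=1)
high_sample = draw_sample(high, seed=2)

# Contamination check: no officer may appear in both groups.
overlap = set(low_sample) & set(high_sample)
total_clips = sum(len(c) for c in low_sample.values()) + sum(len(c) for c in high_sample.values())
print(len(overlap), total_clips)
```

Fixing the seed makes the draw reproducible, which matters if the sample list has to be audited later.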
Most searches were of males (85.7%), and the most common reason was drugs (62.2%). These findings align with statistics for England and Wales in 2021 (90.5% males; 68.9% drug-related; Home Office, 2021). As in the national data, only a minority of searches led to an arrest (20.7%), with the majority (53.6%) resulting in no further action; nationally, 11% of stop–searches lead to arrests, while 77% result in no further action (Home Office, 2021). Reflecting Dorset Police’s gender representation (70.0% male and 30.0% female), male officers conducted most searches (64.4%). Most searches also occurred at night (56.4%).
Key measures
Procedural justice
Assessors watched each search and then scored the officer using a standardised procedural justice scoring scheme developed by Nawaz and Tankebe (2018). The scheme comprises 25 questions that cover the four dimensions of procedural justice. Marks for each dimension are then combined to produce an overall score ranging from 0 to 25. Assessors also had the opportunity to record general feedback about the interaction. The advantages of the scoring scheme are that it is a validated measure (content validity), is easy to administer (reliability and replicability), and is the same measure used in previous studies (Barber and Tankebe, 2024; Nawaz and Tankebe, 2018), allowing direct comparisons between studies. Table 1 shows the scoring scheme.
Procedural justice scoring scheme.
Location of the search
The research compared procedural justice ratings between high- and low-crime areas. One of the key objectives of stop–search is to tackle public place violence and knife crime. Dorset has 452 lower layer super output areas (LSOAs) – small geographical areas of similar population size – and crime figures for the past 5 years show that violent crime concentrates in just three LSOAs (two in Bournemouth and one in Weymouth). As these areas consistently account for most violence, they were treated as the high-crime areas. Low-crime areas were the remaining 449 LSOAs (Table 2).
Frequency of variables within the sample (N = 140).
Coders
Two groups of coders coded the videos for procedural justice. The first group comprised four police supervisors, who between them coded searches from both the high- and low-crime areas (140 searches in total). Each search lasted about 20 minutes, making this a substantial commitment. The supervisors were volunteers recruited via an internal advert. They had between 10 and 21 years of service. Initially, to avoid any preconceptions or bias, supervisors were to be excluded if they knew any of the officers in the sample. Dorset, however, is a small Force, making this impossible. Therefore, the criterion was changed to exclude only supervisors who directly managed one of the searching officers. The supervisors were asked to read a briefing document outlining the study’s purpose, an overview of stop and search, a description of the four elements of procedural justice, and how to use the scoring scheme. They then scored their 70 searches independently, yielding two scores per search. The mean of these scores was the overall procedural justice rating.
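As a rough illustration of how two independent ratings become a single score per search, the sketch below sums one rater's four dimension scores into an overall score out of 25 and then averages the two raters. The dimension maxima (7, 9, 5, and 4) are the scale ranges reported in the findings; the dictionary keys, function names, and example ratings are hypothetical and are not taken from the scoring scheme itself.

```python
from statistics import mean

# Maximum marks per dimension, as reported in the findings (7 + 9 + 5 + 4 = 25).
MAX_SCORES = {
    "voice": 7,
    "neutrality": 9,
    "dignity_respect": 5,
    "trustworthy_motives": 4,
}

def overall_score(rating):
    """Sum one rater's four dimension scores into a 0-25 overall score."""
    for dimension, top in MAX_SCORES.items():
        if not 0 <= rating[dimension] <= top:
            raise ValueError(f"{dimension} score must be between 0 and {top}")
    return sum(rating[dimension] for dimension in MAX_SCORES)

def search_rating(first_rater, second_rater):
    """Average the two raters' overall scores for the same search."""
    return mean([overall_score(first_rater), overall_score(second_rater)])

# Invented ratings for a single search, for illustration only.
rater_a = {"voice": 6, "neutrality": 8, "dignity_respect": 5, "trustworthy_motives": 3}
rater_b = {"voice": 5, "neutrality": 7, "dignity_respect": 5, "trustworthy_motives": 3}

rating = search_rating(rater_a, rater_b)
print(rating)
```

The same averaging could equally be applied dimension by dimension; the study does not state which order of operations was used, so the order here is an assumption.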
The second group comprised public coders: 14 individuals recruited through Community and Independent Advisory Groups, who rated only high-crime videos. There were six males and eight females, aged between 18 and 52 years. Eleven of the panel were white British, all were employed or in education, and only one had previously been stopped and searched or arrested. Volunteers received the same training as the police supervisors. Dorset Police also required that all volunteers be vetted and sign a confidentiality agreement. People with convictions suggesting they could misuse confidential information were excluded. The Force stipulated that viewings must be conducted at a police station under the supervision of a police officer, making the process time-intensive. Consequently, it was impossible to show the entire sample (all 140 videos) to the public. The high-crime videos were chosen because searches in these areas generated the most public complaints. Each member of the public independently scored 10 high-crime videos (a total of 140 ratings). Each search took approximately 10 minutes to score.
Findings
Comparing procedural justice behaviours in high-crime and low-crime areas
We start our analysis by comparing the levels of procedural justice behaviours displayed by officers in high-crime and low-crime areas. This comparison is based solely on ratings by police supervisors. Mann–Whitney U tests were run to compare the differences (see Table 3). First, officers in the low-crime areas displayed significantly higher levels of trustworthy motives (mean rank = 88.64) than those in the high-crime areas (mean rank = 52.36), U = 1180.50, p < .001, r = .49. They also scored higher for neutrality (mean rank = 83.69) than their counterparts in high-crime areas (mean rank = 57.31), U = 1879.00, p < .001, r = .271. Officers in low-crime areas were also more likely to show dignity and respect (mean rank = 78.66) than those in high-crime areas (mean rank = 62.34), U = 1879.00, p < .001, r = .271. However, we found no evidence of a difference in voice ratings: that is, the opportunities officers offered the public to tell their side of the story did not differ statistically between high-crime and low-crime areas. Finally, the overall procedural justice scores were higher for low-crime officers (mean rank = 86.11) than for high-crime officers (mean rank = 54.89), U = 1357.00, p < .001, r = .387.
Mann–Whitney U test comparing procedural justice in high- and low-crime areas.
**p < .001; n = 70 per group.
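For readers less familiar with the test, the following is a minimal pure-Python sketch of a Mann–Whitney U test with mean ranks and the effect size r = |z| / √N. It assumes a tie-corrected normal approximation without continuity correction and a two-sided p-value; the study does not report which software or options were used, so results from this sketch may differ slightly from those in Table 3. The function names and the small example scores are hypothetical.

```python
import math
from collections import Counter

def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def mann_whitney(x, y):
    """Return U, mean ranks, z, two-sided p, and effect size r for two groups."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u_x, n1 * n2 - u_x)  # report the smaller U
    # Tie correction for the variance of U (assumes the scores are not all identical).
    tie_term = sum(t**3 - t for t in Counter(pooled).values()) / (n * (n - 1))
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    z = (u - n1 * n2 / 2) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return {
        "U": u,
        "mean_rank_x": rank_sum_x / n1,
        "mean_rank_y": (n * (n + 1) / 2 - rank_sum_x) / n2,
        "z": z,
        "p": p,
        "r": abs(z) / math.sqrt(n),
    }

# Invented scores for two small groups, for illustration only.
low_crime_scores = [3, 4, 4, 2, 3]
high_crime_scores = [1, 2, 2, 3, 1]
result = mann_whitney(low_crime_scores, high_crime_scores)
print(result)
```

Because the procedural justice scores are bounded integers with many tied values, the tie correction can noticeably change z and therefore r.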
The discussion that follows provides detailed qualitative insights into how raters believed officers demonstrated each procedural justice component across low- and high-crime areas. All descriptive patterns reported here are based solely on police-provided ratings. We draw on illustrative examples selected by the raters to justify their scoring decisions, and the first author systematically reviewed the highest and lowest scoring encounters to deepen our understanding of the reasons driving these assessments.
Voice
Voice was scored on a scale from 0 to 7. The data showed that 70% of searches were rated 5 or higher, with a mean score of 5.73 (SD = 1.131). In the highest-scoring searches, officers were assessed to have provided voice by explaining events clearly, allowing individuals to give their accounts before searching them, and making requests rather than ordering or instructing people to do something. For example, in a high-crime area, officers responded to reports of drug dealing at a train station:
Hello, are you alright?
Yeah . . .
What’s brought you here [to Dorset]?
The male explained that he had come from London on the train to do some sightseeing.
We’ve had some information that you might be a County Lines drug dealer?
No! No, not me!
Well, I’m going to have to give you a quick search. Is that alright?
Yes, of course!
In another high-voice encounter, an individual had a knife. Initially, the officer was assertive and dominating, drawing his Taser, and ordering the male to lie down with his hands above his head. However, he immediately changed his tone once he had handcuffed him. He calmly asked what had happened and made polite requests such as “please stand up” and “please come over here so I can search you.” As a result, while still in control, the officer softened his commands, building procedural justice and quickly de-escalating the situation.
One search scored just two. The individual had a weapon, and consequently, the officers used force (justifiably) to restrain him. However, once they had gained control, unlike in the above example, the officers remained excessively dominating and transactional, shouting at the suspect to “shut up and stop whining!” and telling him, “That’s what happens if you carry knives!” As a result, the individual remained aggressively resistant.
Often individuals were anxious or upset and consequently unwilling to engage with the officers. The highest-scoring officers could adapt their communication, remove barriers, and encourage the person to open up. For example, officers stopped a youth who they thought had drugs. Initially, the individual was very anti-police, telling the officers that he had done nothing wrong and that they were picking on him. He was becoming increasingly upset, but the officer remained calm. He gave the youth time and space to speak freely. He asked many good-humoured questions about his lifestyle and circumstances and showed genuine interest, which helped build a rapport. The youth quickly calmed down, and the encounter ended positively.
Two clips from the low-crime area illustrate these contrary approaches. In one clip, an officer detained a male on the beach whom he suspected to have a knife. The male was unhappy and challenged the officer by asking multiple questions:
What have I done? I literally just got here. On my mum’s life!
The officer gave a detailed explanation of the grounds:
We have been called by someone who says you may be carrying a machete.
I’m being arrested then?
No, you’re just being detained for a search.
Why are you stopping me out of the hundreds of people here?
You match the description of the person.
These questions were a direct challenge to the officer and a clear message that the male did not accept his claim to authority. However, the officer replied with complete answers, and each time, he stopped and invited the male to ask questions: “Are you happy with the grounds?” This legitimised the encounter, and the male consented to the search without further complaint.
By contrast, during another search, a male asked the officer for additional information to explain why he had been stopped: “What have I done wrong?”; “What power have you got to question me?” This time the officer shut the individual down, saying he had already told him and if he was unhappy, he could complain. The officer’s unwillingness to answer questions may have reflected his desire to maintain control; however, his short and abrupt reply rapidly escalated the encounter into a physical confrontation. These findings support previous research that shows when officers implement procedural justice, the encounter is less likely to lead to the use of force (Wood et al., 2020).
Neutrality
Neutrality scores ranged from 0 to 9. The Force recently provided refresher training on neutrality, which may account for the high scores: more than half of searches (57.4%) scored 8, with a mean of 7.16 (SD = 1.657). The raters scored officers highly when they were transparent and carefully explained the grounds for searching an individual (i.e. the reason for the search) and what they were looking for (i.e. the object of the search). A positive example involved the search of a male suspected of trying to break into cars at 02:00 hours:
I’m just going to put some cuffs on you.
You ain’t cuffing me for shit! For what reason? Don’t you dare think you’re grabbing me (pushes the officer backwards).
The male continued to shout and swear at the officer:
Listen to me, please. There’s been an allegation a vehicle has been broken into with a pole, so I’m going to search you.
The male continued to say he was only walking home and that it was unfair to target him. The officer again explained the grounds for the stop, emphasising how he matched the description. By conveying objective and unbiased decision-making, the officer legitimised the encounter and the male complied with the search:
Go on then, search me . . . I’ll be quiet for you!
In another example, an officer successfully used neutrality and objective decision-making to calm a violent male. In the high-crime area, nightclub door staff detained a male believing that he had a knife in his pocket. When officers arrived, the male was angry and aggressive. He denied having a knife and stated the door staff were picking on him and mistreating him:
The witness said you threatened to stab them. So, to make sure everyone is safe, I am detaining you so I can search you for a knife.
I ain’t got no knife!
Look, I need to check.
The male resisted the officer’s initial attempts to claim authority, but the officer continued to negotiate, revising his claim by explaining that a witness had seen him with a knife. By doing this, the officer emphasised his neutrality, showing that he acted on evidence rather than his personal beliefs:
Look, I know it’s rubbish, but I just need to check.
OK. At least you’re treating me like a human. Not like those pricks (pointing at door staff)!
I’ll comply with you because you’re treating me like a human and not a c**t!
A small number of searches scored under 4 (n = 9, 6.4%). In these cases, the officers provided either vague or no grounds for the search and did not inform the person of its outcome. For example, officers stopped a male walking down the road late at night:
Why are you searching me?
I’ve seen you go that way, then you saw me and went that way. So, in my head you were trying to avoid me.
I wasn’t, you’ve got that wrong. You’re chatting rubbish!
And when you’re speaking your eyes are glazed and you’re slurring your words!
No I’m not!
Dignity and respect
The element of “dignity and respect” was scored from 0 to 5. The mean score in the sample was 4.74 (SD = 0.696), with over three-quarters (77.1%) of searches scoring the highest mark of 5. This indicates that officers displayed high levels of this dimension.
Positive examples included officers seeking the person’s cooperation (“Please, can you get out of the car?”) as opposed to compelling them to do something (“Take off your coat!”). By showing dignity and respect, the officers maintained control while appearing less threatening. There was also strong evidence of officers remaining polite and professional even when the individual was confrontational or agitated, for example, by using the person’s name or by persuading and negotiating rather than using physical force:
Just leave me alone! I haven’t done anything!
Come on Charlie. I’m just worried about you.
I told you . . . just leave me alone (aggressively shouting at the officer).
Look, I can’t leave you in this state [intoxicated]. Tell me where you live, and we can get you home safely.
However, there were examples of officers conveying disrespect to punish poor behaviour. For example, four searches scored under three, and in each case, the individual was either verbally or physically abusive, and the officer retaliated with negative or belittling comments. For instance, a male managed to escape during a search. The officer chased him, and when he eventually caught him, he was angry and used demeaning and hostile language, calling him a “coward” and a “stupid child.”
Trustworthy motives
The concept of trustworthy motives concerns whether officers are forthright as to why they are searching an individual and able to justify their actions in ways that demonstrate care and concern. Possible scores ranged from 0 to 4, and it was the lowest-scoring dimension, with a mean of 2.90 (SD = 0.659). In the highest-scoring examples, officers were patient and comforting, asking about the person’s well-being or offering practical advice, for example, about how to stay safe. A positive example came during the search of a male thought to have a knife. At the start, the male was agitated and aggressive towards the officer:
Citizen: You’re accusing me of things I haven’t done. I’ve done nothing wrong. Don’t do this to me again [stop-search]! Don’t surround me!!
The officer built a good rapport by using a calm voice and calling him by his first name. The officer justified her actions by carefully explaining the reason for the search and pointing out that it was her role to keep the public safe. She also showed genuine concern, asking about his welfare, which led to him disclosing his mental health problems. Finally, the officer arranged some immediate support, and the encounter ended positively:
Look, listen . . . Thanks. And look, I’m sorry about everything (shakes the officer’s hand).
Other positive indicators included officers justifying their actions at the start of the process. For example, an officer detained a male seen with a knife in a pub and used trustworthy motives to de-escalate the encounter:
Those people are saying you’ve got a knife, so for everyone’s safety, I need to check if you’ve got a knife.
The officer continued to negotiate with the male and minimise the search’s impact, even suggesting moving into a doorway out of sight of passers-by:
OK, I understand now. I’ll be nice.
Conversely, in a poor example, a group of teenagers were upset at being stop–searched and accused the officer of picking on them because of their age. The officer gave no credible reason for the search. He did not attempt to de-escalate the encounter, he made no effort to justify his actions, and his word choice did not convey any care or concern whatsoever:
Listen you . . ., you shut it now. You’re really starting to get on my nerves!
The officer’s tone and body language were authoritarian and belittling. When the youths again refused the officer’s directives, the encounter intensified, and any remaining respect for the officer quickly dissolved, forcing him to physically restrain one of the youths.
Comparing police and public ratings of procedural justice
The next goal was to compare police and public ratings of procedural justice. Members of the public and police supervisors rated the same 70 searches (all from the high-crime area). Mann–Whitney U tests were run to compare the differences (see Table 4). The results show that even when viewing the same set of interactions, the police and the public differ significantly in their ratings of procedural justice.
Mann–Whitney U test comparing public and police assessments of procedural justice.
p < .001; n = 70 per group.
First, the public rated overall procedural justice significantly higher (mean rank = 83.23) than the police (mean rank = 57.77), U = 1559.0, p < .001, r = .32. There were also significant differences in two of the four individual dimensions (voice, trustworthy motives). Voice had a small effect size, with ratings for the public being significantly higher (mean rank = 82.09) than for the police (mean rank = 58.91), U = 1639.0, p = .01, r = .29, indicating that the public perceived the officers to be better at listening and allowing individuals to express their concerns.
A high-crime video highlights these contrasting views. Officers detained a rowdy group suspected of shoplifting. The public assessors scored the voice dimension positively, with one commenting, “Individuals allowed to speak, and officers remained professional and respectful despite the chaotic situation.” The police assessors, however, took the opposite view, commenting, “Total lack of control of the subjects . . . subjects allowed to shout and swear in broad daylight with members of the public present.” Therefore, the public perceived the behaviour as positive (giving people space to express their views), but the police assessors saw it as negative (failing to control the group).
The public also judged that the officers displayed higher levels of trustworthy motives (large effect size) during the searches (public mean rank = 92.27, police mean rank = 48.73; U = 926.0, p < .001, r = .55). For example, in another high-crime video, officers detained a youth for a drug search. Again, the public scored trustworthy motives high, commenting, “the officers were supportive and offered the boy advice about his drug habit.” The police assessors, however, took a very different view, focusing on the officer’s safety and remarking, “the officer had his hands in his pocket, and the male should have been handcuffed.”
The scores for neutrality and respect did not differ significantly, indicating that the police and public similarly judged these dimensions.
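The reported mean ranks, U values, and effect sizes can be cross-checked against one another. The sketch below is illustrative only and is not the authors' analysis: it assumes the standard identities U = R − n(n + 1)/2 (with R the rank sum of the lower-ranked group) and r = |Z|/√N, and it uses the normal approximation without a tie correction. Because ratings on a 0 to 4 scale produce many ties, the r values it returns are expected to sit slightly below the reported ones. The function names are hypothetical.

```python
import math

def u_from_mean_rank(mean_rank, n):
    """Mann-Whitney U for a group of size n, recovered from its mean rank.

    Rank sum R = n * mean_rank, and U = R - n(n + 1) / 2.
    """
    return n * mean_rank - n * (n + 1) / 2

def effect_size_r(u, n1, n2):
    """Effect size r = |Z| / sqrt(N), normal approximation, NO tie correction (assumption)."""
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return abs(z) / math.sqrt(n1 + n2)

n = 70  # raters' scores per group, as reported in the table note

# (public mean rank, police mean rank, reported U, reported r), copied from the text
reported = {
    "overall": (83.23, 57.77, 1559.0, 0.32),
    "voice": (82.09, 58.91, 1639.0, 0.29),
    "trustworthy motives": (92.27, 48.73, 926.0, 0.55),
}

for name, (public_mr, police_mr, u_rep, r_rep) in reported.items():
    # With two groups of size n, the two mean ranks must sum to 2n + 1 = 141.
    rank_check = public_mr + police_mr
    u = u_from_mean_rank(police_mr, n)  # the smaller U belongs to the lower-ranked group
    r = effect_size_r(u, n, n)
    print(f"{name}: mean ranks sum to {rank_check:.2f}, "
          f"U = {u:.1f} (reported {u_rep}), r = {r:.2f} (reported {r_rep})")
```

Under these assumptions the recovered U values match the reported ones to within rounding, and the uncorrected effect sizes come out at roughly .31, .29, and .54, close to the reported .32, .29, and .55.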
Discussion
This study sought to examine how procedural justice is enacted in stop and search encounters and how these encounters are evaluated by different audiences. Three key findings emerged. First, officers were rated significantly lower on levels of procedural justice in high-crime areas compared with low-crime areas. One explanation lies in the institutional constraints associated with high-crime settings. These environments are often characterised by heightened risk, time pressure, and repeated exposure to non-compliant or intoxicated individuals. Under such conditions, officers may prioritise control, efficiency, and safety over the relational dimensions of encounters. This aligns with prior work suggesting that procedural justice may be more difficult to sustain in contexts where compliance is uncertain and threats are perceived to be higher (Worden and McLean, 2017).
However, contextual constraints alone are unlikely to fully explain the observed differences. The findings are also consistent with research on police culture and categorisation, which suggests that officers may develop differentiated expectations of individuals and places. For instance, Bowling et al. (2019) differentiate between higher-status criminals who are seen as worthy and deserving of respect (good-class villains) and those who waste police time or make nuisance calls (rubbish). High-crime areas may be associated with particular “working rules” or cognitive shortcuts that shape how officers interpret behaviour and allocate attention and respect. Consequently, lower levels of procedural justice may reflect not only situational pressures but also culturally embedded perceptions of who deserves fair treatment.
Importantly, the absence of a significant difference in the voice dimension complicates a purely constraint-based account. Opportunities for individuals to speak and be heard appeared relatively stable across contexts, even where other elements of procedural justice declined. This suggests that some components of procedural justice may be more resilient or easier to maintain under pressure than others. It also raises the possibility that voice is less resource-intensive or less in tension with officer safety than dimensions such as trustworthy motives, which require officers to actively communicate care, empathy, and justification. However, the difference between high-crime and low-crime areas on the trustworthy motives dimension could also be due to compassion fatigue. Compassion fatigue refers to an “acute onset of physical and emotional responses that culminate in a decrease in compassionate feelings towards others because of an individual’s occupation” (Sinclair et al., 2017: 10). Officers in the high-crime areas frequently work with traumatised victims or deal with violent confrontations. This may make it harder for them to empathise with others, inhibiting their ability to display trustworthy motives. Future research could explore this by interviewing these officers to understand their stress and trauma levels and then exploring ways to treat compassion fatigue. For example, while there has been little research on police officers, studies have shown that mindfulness, relaxation, and social support can reduce compassion fatigue in medical professionals, increasing their wellbeing and improving the service they provide to the public (Pfaff et al., 2017).
The second key finding concerns procedural justice as an interactional resource that officers deploy strategically in managing encounters. This does not imply cynicism or manipulation but recognition that procedural justice is a contingent interactional resource. The qualitative analysis demonstrates that officers use elements such as respect, explanation, and voice in dynamic and contingent ways to secure compliance, de-escalate conflict, and maintain control. In some cases, dignity and respect were extended as a form of reward for cooperative behaviour; for example: “You’re being decent with me, and if you stay like that, I won’t handcuff you,” or “If you keep calm, I’ll do this search as quickly as possible.” In others, they were withheld or withdrawn in response to resistance. For example, an individual refused to get out of his car. The officer told him they could search the “easy and nice way” if he got out, but if he refused, he would be handcuffed and taken to the station. These findings are similar to those of Savigar-Shaw et al. (2022). Studying arrestees in police custody, they found that officers also used dignity and respect as either a reward or a threat; for example, they allowed compliant arrestees to take personal belongings into their cells but threatened to remove privileges from others who misbehaved.
These patterns suggest that procedural justice is not merely a static attribute of encounters but an ongoing accomplishment that is shaped through moment-to-moment interaction. This finding has important implications for theory. Much of the procedural justice literature conceptualises fair treatment as a relatively fixed quality that can be measured and improved through training and policy reform (Weisburd et al., 2022). By contrast, the present study indicates that procedural justice may be deeply embedded in the interactional dynamics of police work. Officers do not simply “apply” procedural justice; they actively negotiate it in response to the behaviour of others and the demands of the situation. Recognising this dynamic character is essential for developing more realistic models of police–citizen encounters and for designing interventions that account for the complexities of frontline policing.
The third major finding concerns how procedural justice is evaluated. Contrary to expectations based on prior research (see, for example, Bates et al., 2015), police supervisors rated officers more critically than members of the public did when assessing the same encounters. This divergence was most pronounced for voice and trustworthy motives, while ratings of neutrality and respect were broadly aligned. This finding challenges the common assumption that members of the public are consistently more demanding or critical of police behaviour than officers themselves. Instead, the findings suggest that different groups may apply distinct evaluative frameworks when judging encounters. Police supervisors appeared to focus more on issues of control, authority, and officer safety, for example, criticising situations in which individuals were allowed to speak freely or were not physically restrained. By contrast, public assessors tended to interpret the same behaviours as indicators of fairness, patience, and professionalism.
There are several possible explanations, the first relating to the assessors’ expectations around stop and search. Only one public assessor had been stop–searched. Therefore, the rest of the panel may have based their opinions on what they had heard in the media. Consequently, they possibly held a negative view of stop–search before their assessment and expected to find serious misconduct. However, Dorset Police only receive around 14 complaints relating to searches per year, most concerning minor incivility (Independent Office for Police Conduct, 2022). As a result, having watched the searches, many of the group commented that they were “a lot less interesting” than they expected or that the officers were “a lot nicer” than they thought. As the officers exceeded their initial expectations, this may have inflated their scores compared with the police supervisors who observe searches daily, manage complaints, and are responsible for setting and maintaining standards.
The second explanation relates to the assessors’ lack of pre-existing knowledge about the role of the police. Four of the panel were surprised by the enormous range of safeguarding activities the police carry out. For instance, officers regularly deal with mental health crises, missing persons, and vulnerable children, many of which result in a stop–search. In one example, officers were concerned about a vulnerable male’s welfare and ended up searching him and finding a knife. Similarly, officers were looking for a missing child, and when they found her, she had illegal drugs. These incidents are complex; therefore, the panel is likely to have seen officers displaying more care and compassion than expected, which may have resulted in more generous scores. In particular, trustworthy motives relates to safeguarding and paying attention to people’s needs, which may explain why the public scored this dimension so highly.
A further explanation concerns the conditions under which members of the public conducted their assessments. The assessment was done inside a police station and under the observation of a serving officer. This could have discouraged participants from evaluating the encounters as critically as they would have in a more natural setting. By contrast, the police coders were supervisors, and it remains an open empirical question whether frontline officers, whose routines, expectations, and sensibilities differ markedly from those of supervisors, would evaluate the same encounters in a similar way. Further research should examine whether assessments vary systematically across policing roles, as such differences may illuminate how organisational positions shape understandings of procedural justice.
Nonetheless, the differences reported in this study suggest that even when police officers believe they are acting appropriately, they may be evaluated differently by members of the public, who prioritise interpersonal treatment. At the same time, the relatively positive ratings from public assessors in this study suggest that expectations of police behaviour may be shaped by prior beliefs and limited direct experience. Where expectations are low, routine encounters may be judged more favourably than anticipated. More broadly, the findings imply that legitimacy is not determined solely by what officers do but also by how their actions are interpreted. This underpins the importance of examining both behaviour and perception within the same analytical framework rather than treating them as separate domains.
Limitations and conclusions
The study is not without limitations. First, all the public coders were volunteers recruited from local community groups. The data sharing agreement required security vetting for each person. Unfortunately, many people were unwilling to undergo vetting or were excluded because of prior convictions. Consequently, while the panel included a mix of genders, ages, and ethnicities, it is unlikely to have been entirely representative of the public, reducing the generalisability of the findings. The second limitation relates to the ability of the study to establish causation. This was an observational study with limited control over extraneous variables. As a result, it was only possible to establish associations between predictor variables and procedural justice, rather than to establish cause and effect. For example, we cannot be definitive about whether working in a high-crime area itself makes officers less procedurally just.
Notwithstanding these limitations, the study makes substantive contributions to the literature on police–citizen encounters. It demonstrates the value of analysing police behaviour as it unfolds in real-life settings rather than relying solely on the public opinion surveys that dominate much of the procedural justice literature. By drawing on body-worn video (BWV), the study captures the dynamic moment-to-moment negotiations between officers and the people they search, which reveals how procedural justice is not merely a normative ideal but an interactional resource that officers strategically mobilise. The qualitative findings show, for instance, how dignity and respect may be offered as rewards for good behaviour (see also Savigar-Shaw et al., 2022), withheld as a threat, or used tactically to stabilise volatile situations.
These are patterns that survey-based studies are structurally unable to detect. More broadly, the study shows that procedural justice is best understood not as a static quality of police encounters but as an ongoing process shaped by context, officer decision-making, and the behaviour of the individuals involved. By highlighting this interplay, the study deepens theoretical understanding and provides a stronger empirical basis for training and reforms aimed at improving police legitimacy. Given concerns about generalisability, we cautiously recommend that police departments introduce a performance measure to track the quality of procedural justice in stop-and-search encounters. This study shows that systematic tracking is feasible and can generate valuable insights. Routine procedural justice data would allow forces to monitor trends and identify areas needing improvement and enable departments to recognise and reward officers whose behaviour enhances legitimacy (Tyler, 2011).
Footnotes
Acknowledgements
The authors thank Dorset Police for access to data.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
