Abstract
Performance thresholds and minimum standards in prison have preoccupied policy makers and practitioners alike for some time. These standards are based on widely accepted statements of principle, but benchmarks are rarely set or explored empirically. Nor has there been any attempt to describe or define higher-end thresholds: the point at which outcomes become positive, or stated principles are achieved. In this study, we provide an empirical demonstration of how quality of life thresholds may be determined using data from 518 Measuring the Quality of Prison Life (MQPL) surveys conducted in prisons in England and Wales (2009–2020) and examine their relationship to five violence outcomes: serious prisoner on prisoner assaults, serious assaults on staff, self-harm incidents requiring hospital treatment, self-inflicted deaths, and homicides. The results suggested that thresholds exist for most of the MQPL dimensions. A set of lower ‘unsafe’ and higher ‘minimally safe’ thresholds was produced. We found that the scores of prisons below the lower threshold had a very strong relationship with each of our five serious forms of violence in prison. Similarly, prisons that did not manage to cross the ‘minimally safe’ threshold also had strong relationships with incidents of violence in their prison but were at slightly lower risk of those incidents occurring. Striking differences in mean incident rates were found when comparing prisons below the lower threshold to prisons above the ‘minimally safe’ threshold. Our findings suggest that to operate a safe enough (and therefore legitimate) prison, a combination of harmony, security and professionalism dimensions above a certain threshold should be achieved.
Introduction
The focus 1 on ‘performance’ in the Prison Service, and elsewhere in the public sector, has increasingly sought to link what goes on in institutions such as prisons to desirable or undesirable outcomes. During the late 1980s and 1990s, policy-makers and senior managers followed trends set by new public management in most publicly-funded organisations, but the emphasis in practice has been primarily normative or relative – how does this institution perform in a league table? – rather than on capturing and understanding ‘what goes on’ in establishments and working out how to improve them. In the Prison Service in particular, although the organisation explicitly sought to define the values it was trying to attain (Butler and Drake, 2007) performance measurement was not grounded in any model of what a good enough prison might look like 2 or a research based approach to standard setting: on moral and empirical questions. Austerity has tended to place emphasis on (or bring about) a lowering of minimum standards (Hall et al., 2013; Ismail, 2019, 2020a,b). Our focus on setting thresholds empirically is new and might take us further in our understanding of prisons and their real, rather than officially aspired, effects as well as in the practice of standard setting. It might assist the prison service to learn from its own successes.
In his report into the prison disturbances in the Spring of 1990, Lord Justice Woolf argued that prisoners are legitimately entitled to expect a ‘basic threshold quality of life in prison’ (Home Office, 1991: 224–235). His analysis of the disturbances emphasised the three principles of security, control, and justice – each essential to the maintenance of order in prisons, or the securing of prisoners’ consent to the complex demands of prison regimes. Other jurisdictions have drawn similar conclusions (e.g. SPS, 1990). Several national and international bodies have attempted to operationalise, or expose breaches of, such minimum standards (see Sprott and Doob, 2020 on Structured Intervention Units in Canada, for example) in the interests of accountability and public safety. For example, European and International Prison Rules and Minimum Standards (e.g. the recently revised Mandela Rules) use terms like ‘inhuman and degrading’, focusing on lower end thresholds below which prison systems are found to breach international standards. 3 These standards refer to material conditions (size of cell, occupation, hours out of cell, and so on), prison regimes, health care, good order and staff–prisoner relationships, and preparing for release, and are based on widely accepted statements of principle. 4 Recently an attempt to operationalise the concept of ‘meaningful human contact’ has also been made (2 hours required: Sprott and Doob, 2020). Whilst penological experts contribute significantly to the setting of standards, they have not been set or explored empirically. Nor has there been any attempt to describe or define higher end thresholds: the point at which outcomes become positive, or stated principles are achieved. 
There is growing interest, however, in questions of how prisons should be monitored, how to determine what standards should apply (Rogan, 2021; van Zyl Smit and Slade, 2022) and, more generally, whether appropriate measures can be developed to hold penal and political systems more firmly to account (see, e.g. the Human Rights Measurement Initiative). 5
In this study, we provide an empirical demonstration of how quality of life thresholds may be determined using data from 518 prison surveys (2009–2020) in England and Wales and their relationship to several violence outcomes. Our motives in carrying out this work are to understand and evaluate the prison experience as accurately as possible, building on cumulative and well-grounded empirical research, and testing a new methodology. By conceptualising and measuring what matters most to prisoners (with the help of participants), exploring differences between prisons, and investigating links between moral quality measures and outcomes, we can gain important knowledge of the real (e.g. often harmful) rather than imagined (e.g. rehabilitative) effects of imprisonment. We also gain some understanding of what minimally safe, survivable and rehabilitative prison environments look like, and their rarity. This kind of understanding, whilst always partial and subject to error, could drive public policy decisions and operational practice in a more constructive direction. Prison climate research may have relevance to broader investigations about what causes or prevents violence more generally.
Measuring prison quality
Liebling assisted by Arnold (2004) attempted to identify ‘what matters most’ to prisoners in their experience of life in prison. What emerged transcended physical living conditions and material goods. Instead, ‘less easily quantifiable features of the prison experience’, and in particular, staff–prisoner relationships, fairness, safety, order, humanity, trust, and opportunities for personal development, arose as areas of prison life that differed, as well as mattered, most. Liebling assisted by Arnold (2004: 50) referred to these dimensions as constituting the ‘moral performance’ of a prison. The survey developed as a result of this research programme is called Measuring the Quality of Prison Life (MQPL).
Academic evaluations of prison quality have developed scales reflecting ‘correctional’, ‘custodial’ and ‘humanitarian’ goals (e.g. support, control and clarity in the CIES; lawful, safe, industrious, and hopeful in Dinitz (1981: 11–16); humanity, respect and fairness in the MQPL). Official statements about aims in England and Wales have used the language of ‘humanity’ (in the Statement of Purpose) and ‘decency’ (usually incorporating the terms safe and legal). Liebling and colleagues (2011) proposed the use of ‘harmony’, ‘security’ and ‘professionalism’ clusters of dimensions as part of a broader ‘moral performance’ framework. Other conceptual frameworks with empirical support in the literature include ‘legitimacy’ and ‘good governance/failed state’ models of prison quality (Liebling, 2015).
The aim of this analysis is to test and revise existing frameworks, and to develop an improved empirically and theoretically derived conceptual model of prison quality, showing where ‘good enough’ and ‘unacceptably low’ limits can be found and what combinations of dimensions constitute a ‘legitimate’ prison (which should arguably be, at least, survivable, safe and constructive, and should encourage prisoners to lead law abiding lives on release). Legitimate, or morally intelligible, prisons should aim to uphold social order, creating ‘better neighbours’, improving civic character, and leading to better lives. Understanding the empirical differences between chaotic/damaging versus ‘legitimate’ prisons, which tend to have more positive outcomes, should help to inform public policy and practice (see Sparks, 1994; Auty and Liebling, 2020).
This study will explore ‘minimally safe’ and other thresholds of prison quality 6 , reanalysing already collected empirical data on the moral quality of prison life and linking these analyses to relevant in-prison outcomes (such as homicide, self-inflicted death, self-harm requiring hospitalisation, and serious incidents of violence). We aim to determine the empirical points at which prisons shift from ‘poor’/risky, to minimally safe, and better, and to develop a theoretical model to explain these findings.
The significance of thresholds for policy making
Determining where a threshold is to be set can often be a controversial matter; it places a barrier that distinguishes between pass/fail, treatment/no treatment, acceptable/unacceptable. Yet in the context of limited resources, thresholds are a necessary reality. A criticism often levelled at quality standards is that after a minimum threshold becomes established it can have a negative effect on practice, as it becomes a ‘ceiling’ rather than the ‘bottom line’ it was intended to be. This is an important point, and we share the concerns of many scholars and practitioners about whether absolute standards (e.g. no violence) should be held on to in critical and comparative analyses of prison life and quality (see Crewe et al., 2022). This does not detract from the importance of current oversight in a prison context ‘ … as a means of achieving the twin objectives of transparency of public institutions and accountability for the operation of safe and humane prisons and jails’ (Deitch, 2010: 1438). Furthermore, as Judge Edwin Cameron has pointed out, ‘the culture of prison changes when outsiders shine a light on it’. Armed with empirically derived knowledge about the relationships between standards and outcomes, it becomes harder to defend lower standards.
Previous research
Two main bodies of work exist in this area, although neither has looked directly at thresholds: the first conducted by the authors of this study and colleagues at the Cambridge Prisons Research Centre, and the second consisting of a growing number of international studies aimed at measuring prison social climate and linking these measures to outcomes.
The prison quality or ‘moral performance’ survey developed by members of the Cambridge University Prisons Research Centre (MQPL) attempts to provide a conceptual and methodological foundation for understanding prison life. It is important to be cautious about how well social scientific variables indicate the complex abstract categories they are designed to measure, and this developmental exercise is no exception. Neither the concepts nor the items in the MQPL questionnaire are intended to be definitive. The projects underlying the development and use of the survey represent a series of attempts to reflect with some precision the social, relational and moral climate of a prison, as experienced by prisoners and staff. This places us in a better position to critically analyse the nature, quality, management and effects of prison policies, but the process can always be refined.
The MQPL survey has been developed inductively from extensive, grounded explorations with staff and prisoners about what matters in prison. It has an underlying conceptual framework incorporating notions of legitimacy and ‘right relationships’. The concepts of ‘staff professionalism’ and ‘use of authority’ emerged as key components in this framework, confirming the centrality of the complex work of prison officers to the quality of life in prison. All attempts to measure prison quality tend to include at least the three broad dimensions critical to prison life of ‘relationships’, ‘personal development’ and ‘order and organisation’; these dimensions are broadly related to humanitarian, rehabilitative, and custodial goals respectively.
This social-scientific and conceptual commitment underlying MQPL’s development is one of its most significant properties and may explain its perceived usefulness to senior practitioners. The Prison Service has consistently used the MQPL survey as a key part of its routine audit and evaluation activity across the Prison Service, since first adopting it in 2004. It is often the case that exploratory, innovative, and curiosity-driven research is, in the end, of most value to policy and practice, precisely because it avoids the narrow limits set by ‘working assumptions’. It follows leads originating in ‘the real world’ (this has also been true of other prison research projects conducted outside the policy agenda). The commitment of this kind of research is to experience, or ‘the phenomena and their nature’. Its in-depth qualitative origins explain its face validity (staff and prisoners recognise the results); and its reasonable performance at an explanatory level (the results help to explain variations in suicide rates, levels of well-being, reconviction outcomes, and the risk of disorder). The other significant property of the survey is that it is based on the use of Appreciative Inquiry (AI): that is, the identification of best experiences and core values (see further Liebling, 2015).
The moral and social quality of prison life has been linked to important outcomes for prisoners in many studies across different jurisdictions and using different measures (Barquín et al., 2019; Hassan et al., 2019; Sanhueza and Perez, 2019; Skar et al., 2019). The literature has established that prison social climate has an influence on prisoner wellbeing and behaviour (Wortley, 2002). A systematic review by Gadon et al. (2006) showed that prison social climate was correlated with incidents of violence and disorder in prison. Research has also shown that a more positive social climate is associated with lower behavioural disturbance, higher levels of motivation, engagement with treatment and therapeutic alliance (Long et al., 2011), greater service user satisfaction, more positive therapeutic relationships with staff (Bressington et al., 2011), lower rates of violence (Friis and Helldin, 1994), and more positive treatment outcomes (Long et al., 2011). Furthermore, prisoners who rate the climate of their institution negatively have more disruptive infractions and self-report more stress-related illnesses (Wright, 1993). Prisoners from a closed climate characterised by inexperienced workers and strict rules were more likely to report problems with staff integrity, differential treatment and low levels of trust, and these factors could cause considerable stress (van der Helm et al., 2009). By contrast, an ‘open group climate’ – one that maintains a careful balance between flexibility and control through constant monitoring – has the potential to contribute to rehabilitation (van der Helm et al., 2011). A literature review of the relationship between perceptions of social climate in secure forensic settings and aggression found that more open institutional climates, characterised by higher levels of patient cohesion, perceptions of safety, and a more positive atmosphere, were related to lower levels of aggression (Robinson et al., 2016).
However, as far as we are aware, none of these studies have investigated possible thresholds, or the quality point at which prisons generate positive rather than negative outcomes. Results are treated as normative: variations from a mean are reported rather than attempts to establish minimum or desirable thresholds. This is typical of performance measurement in organisations but constitutes a major omission.
There are minimum standards in mental health care, for example the standards prisons have to achieve to receive the Enabling Environments (EE) accreditation from the Royal College of Psychiatrists, but these are not empirically derived (Paget and Woodward, 2018) and, like audits, tend to be input or process-based rather than directly linked to outcomes. There are HM Inspectorate Expectations, and internationally formulated Prison Rules, but likewise, these are informed by, rather than based on, empirical research. Whilst addressing this question is methodologically complex, we have sufficient relevant data to address it.
The overall aim of this study is to work towards a more refined and systematic development of our understanding of (i) prisons and variations in their moral cultures and (ii) their effects, including relationships with distinct in-prison outcomes such as homicide, self-inflicted death, incidents of self-harm requiring hospitalisation, serious incidents of prisoner-on-prisoner violence, and prisoner on staff assaults. We have already established a significant relationship between moral climates and recidivism outcomes (Auty and Liebling, 2020).
This study consists of a detailed analysis of data from the MQPL which has been conducted in prisons by HMPPS between 6 April 2009 and 9 March 2020 and made available to us in order to carry out this work. As the developers of the survey, we continue to refine and use it in ongoing research. The MQPL survey items comprise 21 conceptual dimensions. 7 Their relationships with in-prison violent outcomes will be examined.
Our expectation is that, on the one hand, outstanding prisons with exceptional or explicitly ‘enabling environments’ (not necessarily formally accredited) generate increased safety and the potential for growth and change among prisoners (see Liebling et al., 2019; Liebling, Auty, Gardom, and Lieber, 2022). Others, on the other hand, create high levels of fear, distress and despair (Crewe, Liebling, and Hulley, 2015). Order has declined significantly in many prisons since 2013 (e.g. an unprecedented eight homicides and 122 suicides took place in prisons in England and Wales in 2015–16). Levels of disorder and suicide returned to, or exceeded, levels found in the 1980s. Some recovery is underway, and this study hopes to contribute to this recovery, with improved knowledge and understanding.
In this study we explore the question of whether there exist key ‘thresholds’: are there ‘dangerously low’ MQPL dimension levels? What is a ‘minimally safe’ threshold? When do prisons become ‘good enough’ in their cultures, practices, and outcomes to ‘minimise the potential for harm’?
One characteristic of a legitimate prison, research suggests, is ‘order’ (where this incorporates right uses of authority, right relationships, and the helpful deployment of professional skills). Order is more than the absence of conflict and has to be ‘worked at’ by prison staff, often in unseen ways. Conflict is detected, ‘channelled, averted, or handled’ (Liebling assisted by Arnold, 2004: 284) continually. In a well-ordered prison, as Woolf proposed, and as prison scholars have repeatedly found, conflict is assimilated, a certain voluntariness prevails; good will is built and rebuilt, boundaries are understood and maintained (Sparks et al., 1996; Liebling, Price, and Shefer, 2010; Liebling, 2022). The particular blend of limit-setting and support in the most ordered prisons is rarely articulated. More direct feedback to prison staff on the outcomes of such work, at its best, would be professionally valuable. The sociology of prison life has visited these themes many times (e.g. Crewe, Liebling, and Hulley, 2014; Gilbert, 1997; Shapira and Navon, 1985; Sparks et al., 1996; Sykes, 1958), but few studies have been able to diagnose different forms of order and link these to the measurement of moral climates or outcomes. MQPL data allows these kinds of explorations, which is especially important when different visions of order are deeply contested. Several studies have noted changes in the model of order pursued over time (Jacobs, 1977; Liebling, 2002, 2021) but only two studies have drawn empirical links between such changes and outcomes, both using MQPL (Crewe et al., 2015; Liebling, 2022). Since the MQPL survey includes value dimensions, and values are often in tension, the following analysis allows us to explore whether certain blends or combinations of values are necessary to keep prisoners safe.
Other background studies relevant to this thresholds analysis carried out by the authors showed that one closed women’s prison improved significantly during a two-year evaluation of suicide prevention practices during which major investments were made in its management and infrastructure. This prison acted as a pilot site for a safer prisons project, during which its culture, safety, and levels of care for women were improved. Its MQPL scores started at the lower end of the range described here but moved upwards to the ‘safer’ end of the range by the end of the two-year study, following significant efforts made by a new senior management team. The high rates of suicide and distress that brought the prison into the pilot study were reversed suggesting, first, that improvement is possible and, secondly, that a plausible causal relationship exists between improved moral climates and outcomes. These results, and the processes of improvement, are reported more fully elsewhere (Liebling, in progress).
Finally, in a mixed method study carried out within the high security estate, two prisons differed significantly in their MQPL scores. These differences were linked to substantial divergences in faith practices as well as violent outcomes, with the higher scoring prison exhibiting fewer power struggles between prisoner groups and less violence (Williams and Liebling, 2022). The underlying difference between the lower and higher scoring establishments was the existence of an I-Thou (tragic and exploratory) rather than I-It (cynical and narrowly security-oriented) culture among staff (Liebling, in progress). Prisoners in the better prison were regarded as persons with the ‘capacity to develop themselves’. The staff were thoughtful and well trained. These and other studies suggest that whilst the highest MQPL scores are found in open and Category C prisons in the analysis reported here, better moral climates can be found elsewhere in the penal system. Continuing efforts should be made to compare moral climates, and a range of outcomes, within security categories.
Methodology: Identifying thresholds
The analysis proceeded in two stages. In the first, we examined prisons we know well where incidents had occurred (e.g. suicide, disorder, an escape, or a hostage-taking) that could be made sense of in light of prison quality data collected before the incidents took place, or where before–after evaluations had been carried out (e.g. on suicide prevention effectiveness). These case studies included qualitative as well as MQPL data. They make up the first part of the study and are both retrospective and exploratory. The results from these studies have been published elsewhere (Liebling assisted by Arnold, 2004; Williams and Liebling, 2022). The lower and safer thresholds in these particular cases are similar to those reported in the large-scale analysis. The second (main) part of the study (reported here) analyses a much larger quantity of secondary data, using the well-established prison moral quality survey we developed. HMPPS colleagues administered the survey 518 times over an 11-year period (6 April 2009–9 March 2020). 8 A total of 55,665 individual surveys were administered in 144 different prisons. The number of surveys conducted during this period in each prison ranged from one to six with an average of four. Each survey within the 11-year period is of an individual prison, administered over a few days, with prisoners being asked to respond in terms of their experience over the past few months. In these analyses we identify the MQPL thresholds below which indicators such as violence and suicide rates increase. We also identify the highest quality prisons where rates of violence and suicide are lowest.
Procedure
First, a suitable database including all relevant variables was assembled. The mean MQPL dimension scores for the 518 prison surveys were matched to variables containing data on serious forms of violence published by the Ministry of Justice. The five forms of violence that we looked at in this analysis were serious prisoner on prisoner assaults, serious assaults on staff, self-harm incidents requiring hospital treatment, self-inflicted deaths, and homicides. For each of these forms of violence we computed incident rates (rate per 1000 prisoners). Official data on self-inflicted and homicide deaths in prison are reasonably reliable in England and Wales, since a verdict of ‘suicide’ at a coroner's inquest has not been required for a death to be recorded as self-inflicted since 1988 (the current definition is ‘any death of a person who has apparently taken his or her own life irrespective of intent’: MoJ, 2012: 11; see also Dooley, 1990). There can still be categorisation errors and delays to classification. Data on serious assaults and self-harm are less reliable, but improvements have been made to recording and operational definitions, and entries are checked (MoJ, 2012). Prisons and Probation Ombudsman (PPO) inquiries provide another layer of scrutiny on the number of cases (see, e.g. PPO, 2023). The official data was matched to the MQPL survey data on a year-by-year basis, so that violence outcome data was matched to the MQPL survey data for the same year.
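The rate computation and same-year matching described above can be sketched as follows. This is a minimal illustration in Python, not the authors' code: the record layout, field names, prison name and counts are all hypothetical.

```python
def rate_per_1000(incidents, population):
    """Return incidents per 1000 prisoners for one prison-year."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000.0 * incidents / population


def match_same_year(surveys, outcomes):
    """Attach same-year incident rates to each survey record.

    surveys  : list of dicts with 'prison', 'year', 'score' (hypothetical layout)
    outcomes : dict keyed by (prison, year) with 'population' and 'incidents'
    """
    matched = []
    for survey in surveys:
        record = outcomes.get((survey["prison"], survey["year"]))
        if record is None:
            continue  # no official data for that prison-year
        rates = {
            kind: rate_per_1000(count, record["population"])
            for kind, count in record["incidents"].items()
        }
        matched.append({**survey, "rates": rates})
    return matched


# Invented example values, for illustration only.
surveys = [{"prison": "Prison A", "year": 2015, "score": 2.9}]
outcomes = {
    ("Prison A", 2015): {
        "population": 800,
        "incidents": {"serious_assaults_on_staff": 12, "self_inflicted_deaths": 2},
    }
}
matched = match_same_year(surveys, outcomes)
```

Expressing counts per 1000 prisoners puts establishments of very different sizes on a common scale before they are compared.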
Analytical approach
First, the data were explored descriptively. The minimum, maximum, and mean scores for all MQPL dimensions were examined. There is good evidence that the types of violence analysed in our study often occur together and share common risk and protective factors (Decker et al., 2018); we therefore sought to create a variable that combined them. A factor analysis was conducted on the five variables that captured the rate of violence (per 1000 prisoners) for serious prisoner on prisoner assaults, serious assaults on staff, self-harm incidents requiring hospital treatment, self-inflicted deaths, and homicides. The eigenvalues indicated that one factor was sufficient, and we then produced factor scores for this latent ‘prison violence’ variable for use in the next stage of analysis.
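As a rough illustration of what an eigenvalue check involves, the sketch below finds the largest eigenvalue of a correlation matrix by power iteration and reports the share of total variance it accounts for. It is a principal-component style approximation chosen for brevity, not the factor analysis procedure run in the authors' statistical software, and the two input columns are invented.

```python
import math


def standardise(column):
    """Convert a column to z-scores (fails if the column is constant)."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]


def correlation_matrix(columns):
    """Pearson correlations between every pair of columns."""
    z = [standardise(c) for c in columns]
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def leading_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its unit eigenvector, by power iteration."""
    k = len(matrix)
    vec = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    value = sum(
        vec[i] * sum(matrix[i][j] * vec[j] for j in range(k)) for i in range(k)
    )
    return value, vec


# Two invented rate columns; the study used five violence rates.
rates_a = [1.0, 2.0, 3.0, 4.0]
rates_b = [1.0, 3.0, 2.0, 4.0]
corr = correlation_matrix([rates_a, rates_b])
eigenvalue, weights = leading_eigenpair(corr)
share_of_variance = eigenvalue / len(corr)
```

A common rule of thumb retains only components whose eigenvalue exceeds 1; the text does not state which criterion was applied here.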
To gain an initial indication of any potential thresholds, histograms were produced for each of the MQPL dimensions and examined. We looked for distinct peaks or modes in the distribution. As the peaks represent regions where the data is concentrated, a clear separation between peaks may suggest the presence of a threshold. If there is a notable gap or dip between two peaks, this indicates a potential threshold point. However, it is important to note that the visual inspection of a histogram alone may not provide sufficient evidence of a threshold. It can serve as a preliminary step to identify potential patterns and guide further analysis.
An analysis of modes was conducted and the findings suggested that multiple modes existed for all but four of the dimensions (Staff professionalism, Prisoner adaptation, Conditions, and Personal development). The distributions for each of the MQPL dimension scores were dissected and higher and lower threshold scores were produced from examining each tail end of each distribution. Each threshold was rounded to the nearest 0.5.
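The idea of looking for peaks and the dip between them can be sketched in a few lines. This is only a simplified stand-in for visual inspection, under assumed settings: the bin width, score range and scores below are invented, and the thresholds reported in this study were not produced by this code.

```python
def histogram(values, low, high, width):
    """Count values in equal-width bins covering [low, high)."""
    n_bins = int(round((high - low) / width))
    counts = [0] * n_bins
    for v in values:
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    return counts


def local_peaks(counts):
    """Indices of bins strictly higher than both neighbours (plateaus are ignored)."""
    peaks = []
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else -1
        right = counts[i + 1] if i < len(counts) - 1 else -1
        if c > left and c > right:
            peaks.append(i)
    return peaks


def dip_between(counts, first_peak, second_peak):
    """Index of the lowest bin lying between two peaks."""
    return min(range(first_peak, second_peak + 1), key=lambda i: counts[i])


# Invented dimension scores on the 1-5 scale, for illustration only.
scores = [2.3, 2.6, 2.6, 2.7, 2.6, 2.8, 3.1, 3.1, 3.2,
          3.3, 3.3, 3.4, 3.3, 3.4, 3.6, 3.6]
LOW, HIGH, WIDTH = 2.0, 4.0, 0.25  # assumed plotting range and bin width

counts = histogram(scores, LOW, HIGH, WIDTH)
peaks = local_peaks(counts)
dip = dip_between(counts, peaks[0], peaks[1])
candidate_threshold = LOW + (dip + 0.5) * WIDTH  # centre of the dip bin
```

Results of this kind are sensitive to the choice of bin width, which is one reason a histogram can only suggest a candidate threshold.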
Piecewise regression
To gain more confidence in the existence of our proposed thresholds, we combined the visual inspection with piecewise regression (Wainer, 1971) to provide a rigorous statistical assessment of whether a threshold truly exists and help estimate its location with greater precision. Piecewise regression involves fitting two or more linear regression models to different parts of the data. The breakpoints where the regression lines change represent potential thresholds.
The first stage of this analysis involved running two separate regressions: one for below the lower threshold and one for above the lower threshold. This was done for every MQPL dimension. The results of these two models were then compared. At the second stage we centred the MQPL dimension score variable on its threshold value to make the interpretation of the coefficients in the subsequent regression models clearer. The third stage involved producing a single model that combined the two models described above. Finally, we performed two tests: first to see if the difference between the intercepts of the two models was significantly different from 0, and second to see if the difference between the slopes of the two models was significantly different from 0. This procedure was then repeated for the safer (higher) threshold.
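The logic of fitting one line on each side of a threshold, with scores centred on that threshold, can be sketched as follows. This is a minimal illustration rather than the authors' procedure: it uses ordinary least squares on each segment, omits the significance tests, assigns scores equal to the threshold to the upper segment (an assumption), and runs on invented data.

```python
def ols(xs, ys):
    """Simple least-squares line; returns (intercept, slope)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points per segment")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def piecewise_fit(scores, outcome, threshold):
    """Fit separate lines below and at/above a threshold.

    Scores are centred on the threshold, so each intercept is the
    predicted outcome at the threshold itself.
    """
    centred = [s - threshold for s in scores]
    below = [(x, y) for x, y in zip(centred, outcome) if x < 0]
    above = [(x, y) for x, y in zip(centred, outcome) if x >= 0]
    intercept_1, slope_1 = ols(*zip(*below))
    intercept_2, slope_2 = ols(*zip(*above))
    return {
        "intercept_1": intercept_1,
        "slope_1": slope_1,
        "intercept_2": intercept_2,
        "slope_2": slope_2,
        "intercept_difference": intercept_2 - intercept_1,
        "slope_difference": slope_2 - slope_1,
    }


# Invented dimension scores and violence factor scores.
scores = [2.0, 2.5, 2.75, 3.0, 3.5, 4.0]
violence = [3.0, 2.0, 1.5, 0.5, 0.25, 0.0]
fit = piecewise_fit(scores, violence, threshold=3.0)
```

In this invented example the outcome falls steeply below the threshold, drops at the threshold, and then declines more gently, which is the kind of pattern a threshold analysis looks for.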
Logistic regression
Logistic regression models were then produced to explore the relationship between: (i) prisons that do not reach the lower threshold and the likelihood of them having one of the five serious violence incidents occur, and (ii) prisons that do not reach the ‘safer’ threshold and the likelihood of them having one of the five serious violence incidents occur. Binary variables were created for this purpose: one to indicate (i) the prisons that had not reached the lower threshold, and one to indicate (ii) the prisons that had not reached the upper (from here on referred to as ‘safer’) threshold (the independent variables). Five binary variables (one for each form of violence) were created to indicate that a prison had experienced one or more incidents of that form of violence that year (the same year as the survey). These were the dependent variables. Relationships were modelled using the multilevel random-effects logistic regression LOGIT command in STATA. The cluster(PrisonName) syntax was used to take into account the non-independence of prisons that were surveyed more than once during the 11-year period. Finally, the profile of a poorly-performing prison that did not meet the minimum safety/low-violence threshold was examined. We produced mean incidence rates (per 1000 prisoners) for each of our five forms of violence for prisons not reaching this threshold. This was repeated for a well-performing prison that met the higher threshold; we again examined mean incidence rates of each of our forms of violence in these prisons. All analyses were conducted in SPSS version 28 statistical software and STATA version 16.1 statistical software for Windows (StataCorp LP, College Station, TX, USA).
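To show how an odds ratio relates a binary threshold indicator to a binary outcome, the sketch below computes it directly from the two-by-two table. With a single binary predictor and no random effects, this cross-product ratio equals the exponentiated logistic regression coefficient. The confidence interval shown ignores repeated surveys of the same prison, so it is not equivalent to the clustered models described above, and the data are invented.

```python
import math


def odds_ratio(below_threshold, had_incident):
    """Odds ratio and approximate 95% interval from parallel 0/1 lists.

    Returns None when any cell of the 2x2 table is empty, because the
    ratio is then undefined.
    """
    a = b = c = d = 0
    for below, incident in zip(below_threshold, had_incident):
        if below and incident:
            a += 1  # below threshold, incident occurred
        elif below:
            b += 1  # below threshold, no incident
        elif incident:
            c += 1  # threshold met, incident occurred
        else:
            d += 1  # threshold met, no incident
    if 0 in (a, b, c, d):
        return None
    ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio, assuming independent rows.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ratio) - 1.96 * se)
    upper = math.exp(math.log(ratio) + 1.96 * se)
    return ratio, lower, upper


# Invented data: 10 surveys below the threshold, 10 that met it.
below = [1] * 10 + [0] * 10
incident = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 6
result = odds_ratio(below, incident)
```

In this invented example the odds of an incident are six times higher in surveys below the threshold, although the interval is wide because the sample is small.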
Results
Minimum, maximum, and mean values for each of the MQPL dimensions are shown in Figure 1. The graph shows that there is considerable variation in MQPL scores. As the items in the MQPL survey are scored on a 5-point Likert scale ranging from 1 = Strongly Disagree to 5 = Strongly Agree, previously a neutral threshold of 3.00 has been used to distinguish between an overall positive view on that dimension (for scores greater than 3.00) or an overall negative view for that dimension (for scores below 3.00). In Figure 1 we can see that the mean line is very close to this neutral threshold for the harmony dimensions, but for later dimensions there is more variation.

Figure 1. Variation (minimum, maximum, & mean) in Measuring the Quality of Prison Life (MQPL) scores for all prisons in sample (n = 518).
Table 1 gives our estimates for the lower threshold for every MQPL dimension. Particularly low scores were produced for the dimensions Decency, Bureaucratic legitimacy, and Well-being. Table 1 also gives the results for a piecewise regression that produces two models: one for the data below the lower threshold and one for the data above the lower threshold. These two models are produced for every MQPL dimension and can be compared. For example, the lower threshold for the Entry into custody dimension was set at 2.55. The coefficient for Slope 1 is therefore the slope when the score for Entry into custody is less than 2.55: a one-unit increase in the Entry into custody score results in a 1.05 increase in the combined prison violence factor score, although this finding was not statistically significant. The coefficient for Slope 2 is the slope when the score for Entry into custody is greater than 2.55. The Intercept 1 coefficient is the predicted mean prison violence factor score for a prison that is just below the lower threshold and the Intercept 2 coefficient is the predicted mean prison violence factor score for a prison that has just crossed the lower threshold. The final two columns report two tests. The first tests whether the difference in the intercepts of the two models is 0. So, as a prison's Entry into custody score reaches 2.55, we see a decrease in the prison violence factor score of 0.19, but this is not statistically significant. The second tests whether the slopes of the two models are different. For Entry into custody, the slope of prisons that have reached 2.55 is −2.13 and this difference approaches statistical significance. The same analysis is reported for the higher or ‘safer’ threshold in Table 2.
From these two tables we can observe that the MQPL dimension scores above and below both thresholds are most appropriately modelled using two separate regression models; almost all relationships are statistically significant. There was more limited evidence that the differences between the models in terms of their slopes and intercepts were statistically significant.
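To make the structure of Tables 1 and 2 concrete, the following is a minimal sketch of a piecewise regression with a single, fixed breakpoint. It is illustrative only and is not the code used in this study: the data are synthetic, the function names (`fit_line`, `piecewise_fit`) are hypothetical, and it assumes that each intercept is expressed as the fitted value at the threshold, so that the two models can be compared at the point where they meet. Only the threshold value of 2.55 is taken from the Entry into custody example; the significance tests reported in the tables are not reproduced here.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def piecewise_fit(scores, outcomes, threshold):
    """Fit one line below the threshold and one at or above it.

    Assumption: each intercept is reported as the fitted value at the
    threshold itself, so the two segments are compared where they meet.
    """
    pairs = list(zip(scores, outcomes))
    below = [(x, y) for x, y in pairs if x < threshold]
    above = [(x, y) for x, y in pairs if x >= threshold]
    a1, b1 = fit_line(*zip(*below))
    a2, b2 = fit_line(*zip(*above))
    intercept_1 = a1 + b1 * threshold
    intercept_2 = a2 + b2 * threshold
    return {
        "slope_1": b1,
        "slope_2": b2,
        "intercept_1": intercept_1,
        "intercept_2": intercept_2,
        "intercept_change": intercept_2 - intercept_1,
        "slope_change": b2 - b1,
    }


# Synthetic dimension scores and violence factor scores (not study data).
example_scores = [2.0, 2.2, 2.4, 2.6, 3.0, 3.4]
example_outcomes = [1.0, 1.1, 1.2, 1.4, 1.0, 0.6]
example_fit = piecewise_fit(example_scores, example_outcomes, threshold=2.55)
```

In this synthetic example the slope changes sign at the threshold: the fitted line rises below 2.55 and falls above it. The `intercept_change` and `slope_change` entries correspond to the quantities whose differences are tested in the final two columns of the tables.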
Piecewise regression model for lower threshold.
p < .10, * p < .05, ** p < .01, *** p < .001.
Piecewise regression model for ‘safer’ threshold.
p < .10, * p < .05, ** p < .01, *** p < .001.
The results of our regression models that relate to prisons that do not achieve the lower threshold to our five forms of serious violence are shown in Table 3. Many of the MQPL threshold scores were significantly related to one or more of the violent outcomes. For example, our models produced odds ratios 9 that suggest that prisons that do not reach a threshold of 3.05 for their staff–prisoner relationships scores are over 23 times more likely to have a serious prisoner on prisoner assault, 7.46 times more likely to have a serious assault on staff, 7.88 times more likely to have a serious self-harm incident, and 2.26 times more likely to have a self-inflicted death when compared to prisons that have met this threshold. The models produced many large odds ratios, suggesting strong relationships: that is, that this lower threshold is an excellent predictor of violent outcomes. It is interesting to note that no one dimension, or one set of dimensions (e.g. the security dimensions), acts as the most important predictor of serious violence in prison. All the dimensions appear to be important predictors of violence here. It was not possible to produce models for every relationship due to limitations with the data, as indicated by ‘–’.
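As an aid to reading these odds ratios, the sketch below shows how an odds ratio arises from a simple two-by-two table. It assumes a logistic regression with a single binary predictor (below the threshold or not), a binary outcome (at least one incident or none) and no covariates, in which case the model's odds ratio equals the cross-product ratio of the table. The counts are hypothetical and the function name `odds_ratio` is illustrative; the models reported in the tables may differ from this simplified case.

```python
def odds_ratio(below_with, below_without, above_with, above_without):
    """Cross-product odds ratio for a 2x2 table.

    below_* : prisons below the threshold, with / without an incident
    above_* : prisons at or above the threshold, with / without an incident
    Returns None when any cell is empty, because the ratio is then
    undefined (the outcome is perfectly predicted in one group).
    """
    cells = (below_with, below_without, above_with, above_without)
    if any(count == 0 for count in cells):
        return None
    odds_below = below_with / below_without
    odds_above = above_with / above_without
    return odds_below / odds_above


# Hypothetical counts of prisons (not study data).
example_or = odds_ratio(below_with=40, below_without=10,
                        above_with=20, above_without=40)

# An empty cell: every prison below the threshold had an incident.
no_estimate = odds_ratio(below_with=12, below_without=0,
                         above_with=20, above_without=40)
```

With these hypothetical counts, the odds of an incident are 4 to 1 below the threshold and 1 to 2 above it, giving an odds ratio of 8. An empty cell is one common reason why a model cannot be estimated for a particular relationship.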
Lower thresholds and their relationship to serious forms of violence in prison.
p < .10, * p < .05, ** p < .01, *** p < .001.
Table 4 reproduces this analysis for prisons that do not pass the higher or ‘safer’ threshold. The logistic regression analysis produced odds ratios that suggested that violent incidents were also likely to occur in prisons that did not cross the safer threshold in their MQPL scores. The first significant result tells us that prisons that do not achieve an Entry into custody score of 3.15 or higher are 9.74 times more likely to have a serious prisoner on prisoner assault, 10.24 times more likely to have a serious assault on staff, 12.60 times more likely to have a self-harm incident requiring hospitalisation, and 6.65 times more likely to have a self-inflicted death, when compared to prisons that have met the safer threshold. Again, there were many significant relationships and all safer thresholds appeared to be important predictors of serious violent incidents. We can conclude here that our models suggest that prisons not achieving the safer threshold are at fairly significant risk of a serious violent incident occurring.
Prisons that do not pass the ‘safer’ threshold and their relationship to serious forms of violence in prison.
p < .10, * p < .05, ** p < .01, *** p < .001.
Note: ‘–’ = predicts score perfectly.
Table 5 gives mean incidence rates (per 1000 prisoners) of serious forms of violence in poorly performing prisons that do not achieve the lower threshold. The rates are mostly the same size for each form of violence; for example, the mean incidence rates for serious prisoner on prisoner assaults generally range between 30.00 and 40.00, and for homicides they are generally around 0.10. This suggests that the threshold is in the correct place for each MQPL dimension. Table 6 gives the same figures for the well-performing prisons that do meet the safer threshold. Mean incidence rates of violence are much lower, as expected, and for homicides they are almost 0 for every MQPL dimension. That these figures are not 0 reminds us that we are working with a concept of ‘safe enough’ that remains open to discussion.
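The arithmetic behind a mean incidence rate per 1000 prisoners can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used for Tables 5 and 6: it assumes that a rate is first computed for each prison and that these rates are then averaged across prisons in the group, and the incident counts and populations are hypothetical.

```python
def rate_per_1000(incidents, population):
    """Incidents per 1000 prisoners for a single prison."""
    return 1000 * incidents / population


def mean_rate_per_1000(prisons):
    """Mean of the per-prison rates.

    Assumption: each prison contributes equally, regardless of its size.
    `prisons` is a list of (incidents, population) pairs.
    """
    rates = [rate_per_1000(incidents, population)
             for incidents, population in prisons]
    return sum(rates) / len(rates)


# Hypothetical (incidents, population) pairs for three prisons.
example_group = [(30, 1000), (14, 400), (24, 600)]
example_mean = mean_rate_per_1000(example_group)
```

Here the three hypothetical prisons have rates of 30, 35 and 40 per 1000 prisoners, so the group mean is 35. Pooling all incidents and all prisoners before dividing would weight larger prisons more heavily and could give a different figure.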
Rates of serious forms of violence in prisons not achieving the lower threshold.
Rates of serious forms of violence in prisons achieving the safer threshold.
Figure 2 shows that some prisons (the number varies for each dimension) are performing at or above the ‘safer’ threshold, suggesting that these kinds of scores are achievable. These are mainly open prisons. A preliminary further analysis – to be pursued – suggests that thresholds may be slightly different for different types of prisons (open/closed) holding different populations (women, the young) but that the basic model holds. 10 It is interesting that the same set of dimensions is relevant to a range of violent outcomes (at different thresholds), suggesting, as we already know, that risks of violence are clustered and that poor prisons tend to be more violent in many respects, whereas better prisons tend to deliver better outcomes on a whole range of variables. Moral deprivation – or being in places where we don’t matter – creates distress, anger, and risk. Conversely, presence, ‘deep regard’, fairness and safety help us to live and develop (Liebling, in progress; Beedon, 2022).

Variation (minimum, maximum, & mean) MQPL scores for all prisons in sample compared to Safe and Lower Thresholds (n = 518).
The explanation for these findings is likely to be related to the kinds of beings we are, the needs we have, and the difficulties that penal systems have in treating us accordingly, because of their essential function (punishment) and their organisation (bureaucratic and under-resourced) as well as the fact that they contain vulnerable populations who are sensitive to injustice for increasingly long periods. One of the things that matters about human beings generally is that we are ‘sentient beings, capable of flourishing and suffering, and particularly vulnerable to how others treat us’ (Sayer, 2011: 3). We vibrate, or respond to others, and they to us, in ways that can help or hinder our well-being and development (Liebling, 2020). This explains both the content of the dimensions to emerge in the founding studies and their negative impact when they are in short supply. Human beings do not survive well in unjust, unsafe, depriving, and indifferent environments. Moral philosophy has, since its inception, described the essential role of values in human well-being (see, e.g. Kraut, 1989). As Sayer argues: ‘[T]he quality of people's lives depends hugely on the quality of the social relations in which they live, and on how people treat one another’ (2011: 7).
We also know that there is no master value, that humanity and care are two among many, and that we also need clarity, justice and safety in order to survive and flourish.
Limitations
This was an exceptionally difficult and time-consuming study to carry out, partly because we had to create a single database from many different and hard to access sources. There were considerable delays in accessing data and many obstacles to overcome. We are aware that all administrative data have errors in them. We had to find ways of attributing data from a prison ‘cluster’ to individual sites in some cases and where this was not possible, the data were treated as ‘missing’. The database stops at the beginning of the Covid pandemic as restrictions on the conduct of MQPL surveys and on regime delivery made this period exceptional. Other methods have been required to explore the quality of prison life for prisoners during this period (see Maruna et al., 2022). Nevertheless, we remain confident that this now constitutes a careful and uniquely comprehensive data source on prison quality and outcomes, that this is the kind of analysis that prison services might wish to pursue in order to help fulfil their public mandate (see Sprott and Doob, 2020), and that the methodology devised could be replicated in future.
Some further considerations regarding methodology are that the MQPL survey data were linked to data on serious violence for the same year. This makes no assumption as to the relationship between prison culture (i.e. MQPL scores) and prison violence. It could be the case that the moral and social culture of a prison deteriorates for a period of time before serious violent incidents start occurring, but it is equally plausible that the occurrence of serious violent incidents leads to prisoners rating the culture of their prison more poorly. This question could be the focus of further analyses, although several case studies show that strategic and focused improvements to MQPL scores led by senior managers have reduced suicides, for example. Furthermore, as the unit of analysis in this study was the prison, we were unable to investigate the extent to which individual-level characteristics might be related to serious violence in prison. This has been the subject of other studies (for a review of the literature see Schenk and Fremouw, 2012). We see our focus on the detailed understanding of prison environments to be one of the strengths of this analysis.
Discussion
The aim of this study was to develop an empirically and theoretically derived conceptual model of prison quality, showing where higher (‘safer’) and lower (‘very unsafe’) thresholds could be found. We set out to discover the empirical differences between violent versus ‘safe and decent’ prisons. We found considerable variation in MQPL scores for the 518 prisons in our sample, suggesting that some prisons performed highly in terms of prisoner quality of life, and some performed very poorly. The initial modal analysis indicated that the distributions for the majority of the dimensions contained more than one mode. This suggested that thresholds could exist at each end of the distribution for most of the MQPL dimensions. We were then able to produce a set of lower thresholds and higher ‘safer’ thresholds. We found that the scores of prisons below the lower threshold had a very strong relationship with each of our five serious forms of violence in prison. Similarly, in prisons that had managed to cross the safe threshold, MQPL scores also had strong relationships with incidents of violence in their prison, but these prisons were at considerably lower risk of those incidents occurring. We produced mean incident rates for each of the two groups of prisons: (i) those below the lower threshold, and (ii) those above the safer threshold. The difference in rates of violence between the two was striking.
The difference between a violent prison and a ‘minimally safe’ prison (in so far as we can use this kind of terminology – indicating low rather than no risk of violence) is, taking examples, scores of 3.05 for staff–prisoner relationships, 2.80 for humanity, and 3.00 for policing and security at the low end versus scores of 3.55 for staff–prisoner relationships, 3.35 for humanity and 3.45 for policing and security at the ‘minimally safe’ end. These are substantial differences, found throughout the survey, and reflecting the fact that to operate a safe prison, a combination of harmony, security and professionalism dimensions must be achieved. The distance between the lower threshold and the ‘safer’ threshold was especially large in some instances (for Respect/courtesy, Decency, and Well-being). The narrowest distance between the two thresholds was for Staff professionalism (0.20). All of the safer threshold scores are above 3, and those in the harmony and security clusters are particularly high. Prisons with higher dimension scores are not just ‘a bit different’ from prisons with lower scores on a linear scale; they are qualitatively distinct: staff in these prisons demonstrate a fundamentally different approach to their work. They operate with due attention to security and order, but also with positive underlying assumptions about prisoners, punishment and rehabilitation, demonstrating very different dispositions and practices from staff in lower threshold prisons (e.g. they have a ‘tragic’ rather than a ‘cynical’ perspective; Liebling and Kant, 2018). Liebling has referred to these differences using Buber's terms I-It and I-Thou relations (where prisoners are treated as ‘experienced objects’ or ‘experiencing subjects’ respectively; Buber, 1970). Staff in I-Thou prisons tend to have much higher levels of experience, training and professional support (see Williams and Liebling, 2022). The scores reflect real and tangible differences in the moral environment.
We appreciate that these kinds of scores, and environments, are very difficult to manage into being, especially from a low baseline.
The prisons with the highest levels of violent outcomes have particularly low scores on entry into custody (2.55), respect (2.65), decency (2.40), bureaucratic legitimacy (2.25), fairness (2.65), and organisation and consistency (2.55). The very low well-being score (2.30) is linked to an almost six times greater risk of suicide in these prisons.
The thresholds we have identified in this study are suggestive and raise many questions about what a ‘good enough’ or legitimate prison climate might look like. External commentators on our work have proposed that any satisfactory threshold should be set at the point of no violence. Some prison practitioners, on the other hand, query the concept of ‘low violence’ as an aspiration. Bowling used the term ‘good enough policing’ to challenge practitioners and academics to find ways to describe ‘everyday good policing’ and to ensure an acceptable minimum level of practice (Bowling, 2007). The conversation in relation to prisons, with data to hand, is an important one, particularly at a time of low resourcing. Policing and prisons make (contested) claims to be ‘public goods’ (p. 18) but unlike ‘good enough parenting’, a term used by psychoanalyst Donald Winnicott about mothers, they rely on doing ‘bad things’ (intrusion, coercion, detention) to ‘those who present an actual or perceived security risk’ (p. 19). We have not, as a community, come to a stable conclusion about how much security or punishment we need. But asking what kind of imprisonment is ‘good enough’ might be one way of stimulating this kind of dialogue. Perfectionism (ruling out a lenient response to the imperfections of others) can be an obstacle to ‘good human relations’ and to social order (Bettelheim, 1988, p. ix; and see Liebling, 2022).
We have shown that it is possible to empirically determine what the dimension profile of a good or ‘safer’ prison looks like. We have also shown that it is possible to determine a lower threshold profile that could act as an early warning signal, either overall or for specific dimensions. The results suggest that high scores on a blend of security/policing and harmony/relational dimensions are key to achieving ‘good enough’ prison performance. This is consistent with our other work on reconviction outcomes (see Auty and Liebling, 2020). The challenging and perhaps unsurprising finding is that most prisons are below rather than at or above this threshold.
Footnotes
Acknowledgements
The authors would like to thank His Majesty’s Prison and Probation Service and The Leverhulme Trust for funding this research project. They would also like to thank the anonymous reviewers for their comments on this paper and the HMPPS Operational and System Assurance Group for collecting the data and granting us access to it.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partly funded by His Majesty's Prison and Probation Service. Research for this article was also supported financially by Alison Liebling's Major Research Fellowship from the Leverhulme Trust (MRF-2019-011).
