The behavioural house indicator: A faster and real time small-area deprivation measure for England

Abstract

Researchers have long been preoccupied with measuring and monitoring economic and social deprivation at small, neighbourhood-level scales in order to provide official government agencies and policy makers with more precise data insights. Whilst valuable methodologies have been developed, the associated data collection tends to be expensive and time consuming, and the resulting indexes are published infrequently, with significant time delays, and are subject to recurring changes in methodology. Here, we propose a novel method based on a straightforward methodology and data sources to generate a faster, real-time indicator of deprivation at different scales, from small to larger areas. Our results show that the method provides a consistent view of deprivation across the regions of England and Wales, which is in line with the other indexes, but also highlights specific flash points of deep rural and highly dense urban deprivation that are not well captured by existing indexes. Our method is intended to aid researchers and policy makers by complementing existing but infrequent indexes.

Keywords

Mutual information, house transactions, behavioural analysis

Introduction

The need to identify, measure and categorise distinct levels of economic and social inequalities is a common issue that preoccupies both academic researchers and government agencies (Green et al., 2018; Lloyd et al., 2023a; Noble et al., 2019a, 2019b). To address this, a number of indices (Green et al., 2018; Lloyd et al., 2023a; Noble et al., 2019a; The United Kingdom Census, 2021) have been developed to compare poverty and deprivation in distinct regions across the land at both micro and macro levels. Unsurprisingly, each of these indexes provides distinct results and highlights specific features that are not easily captured by the others, as there are distinct perspectives on exactly what is being measured as well as how to measure it.

In relation to what is being measured, it can be argued that the distinction between poverty and deprivation is broadly agreed, as colloquially explained by the ONS (ONS, 2019): ‘People may be considered to be living in poverty if they lack the financial resources to meet their needs, whereas people can be regarded as deprived if they lack any kind of resources, not just income’. The general understanding, therefore, effectively implies that poverty is fundamentally about money (wealth, income, etc.), and is a sub-component of the overall concept of deprivation. The definition of money, however, is much more vague as any kind of resource can be measured and interpreted in many ways, and this is where each of the indices starts to deviate conceptually, and so results vary significantly.

When it comes to how to measure, the traditional approach (Green et al., 2018; Lloyd et al., 2023a; Noble et al., 2019a; The United Kingdom Census, 2021) taken by researchers is to narrow down their specific conceptual definition of deprivation by compartmentalising deprivation into dimensions, or sub components, so that it becomes easier to define each lack of resource more specifically. Each domain is then weighted, either implicitly or explicitly, and an overall single index is produced. This is essentially the general approach adopted by all existing indexes described in this paper.

The above approach, however, requires a significant number of arbitrary judgements (Noble et al., 2019a, 2019b) both in assigning the hierarchy and importance of each domain in the weightings used, and in the methods to harmonise individual measures of a monetary (financial and economic) or non-monetary (social) nature. Inevitably, the need to address these issues tends to lead to extensive and complex methodologies that require continuous re-assessment every time new data is published, together with expensive and time-consuming data collection exercises. Equally important is the fact that the process becomes costly and the publication of results is infrequent, with long time lags. It is of little surprise then that authorities and bodies such as the ONS in the United Kingdom are looking at potential alternative methodologies to capture real-time indicators of social change (ONS, 2024). However, whereas the existing indexes cited in this paper are helpful to provide single quantifiable values at a given point in time, it can be argued that they are essentially mathematical exercises driven by the current computational capabilities and data availability. Further, current indices do not focus on the theoretical framework underpinning the contested concept of deprivation within social sciences, leading to continuous methodological changes.

It is within this context that we place the aims of and motivation for our research. Whilst we do not attempt to define deprivation precisely given its contentious nature, we place our work in the context of some early and pioneering studies within the UK where deprivation is an emerging property resulting from the relationship between social and spatial processes that are complex and multilateral (Davidson, 1976; Haris, 1973). Importantly, short-term changes in spatial structure are primarily generated by residential moves, in particular in the context of urban renewal, new housing and rehousing that leads to increasing spatial inequalities (i.e. the divergence hypothesis) (Parker, 1973). In essence, we developed a simple measure, the Behavioural House Indicator $\hat{W}$ based on patterns of residential movements, that can be used to measure deprivation quickly and consistently in short real-time intervals. Moreover, we intend to fill the existing large gaps between the timelines of publication of the other more established indexes by comparing and contrasting our results to these indexes.

In order to do so, we developed a framework and methodology that is computationally simple, which has no arbitrary parameters, and requires no data transformations. Moreover, we make use of a single data source, the ‘HM Land Registry’. The HM Land Registry is a government controlled entity that keeps all the relevant property records. In particular, each time a property changes ownership, a new transaction record is recorded at HM Land Registry. This data is (a) factual and objective (i.e. no assumptions in its construction), (b) publicly available (i.e. at no cost), (c) published promptly and at short time intervals (i.e. fast) and (d) detached from any economic and social data collection (i.e. independent). In addition, the data set is longstanding, extensive, and methodologically stable.

Our process, detailed below, is to use HM Land Registry data to obtain sequences of transactions for each individual property. From this we construct a conditional probability table for the day of the week of pairs of subsequent transactions. These are studied for each region (LSOA or MSOA): we calculate the corresponding mutual information (Shannon, 1948) for each region, determine the average mutual information of the neighbours for each region, and generate our Behavioural House Indicator $\hat{W}$ based on the relationship between the two quantities of mutual information.

Our proposed framework and method are underpinned by some important conceptual principles derived from complexity (Huang and Ulanowicz, 2014; West, 2017), information theory (Dehmer et al., 2011; Sanchirico and Fiorentino, 2008; Shannon, 1948), and mathematical finance (Rebonato, 2004). Fundamentally, we highlight the fact that we are dealing with complex economic and social systems whose features and elements are highly dependent on and intertwined with each other, and which therefore exhibit high levels of correlation and causality. This fact in isolation has important implications.

Firstly, it follows that any measure for deprivation must be mathematically non-additive so that the effects of correlations are not compounded (Chateauneuf and Cornet, 2022; Rebonato, 2004). This is fundamentally distinct from the traditional additive approach of aggregating different measures (where the weightings are effectively an attempt to correct for the lack of computation of correlations).

Secondly, in order to avoid poverty (monetary) measures trumping the social elements, we purposefully avoid any monetary quantities. This has the additional benefit of avoiding the noise in these monetary quantities, which arises from the temporal and relative nature of money as a store of value. We thereby avoid complex financial issues (Biggeri and Ferrari, 2010; Whitehouse, 2009) such as inflation, purchasing power parity and non-declared earnings, which have different effects in distinct economic regions and social classes.

Lastly, we rely on the general research findings that different forms and levels of deprivation lead to a range of distinct behaviours (Anand et al., 2021). As we are not preoccupied with specific definitions, we make use of mutual information as the preferred measure to compute changes to behaviours. Importantly, this means that our research is geared towards the relative measurement and detection of areas of high level deprivation. It does not, however, provide any explanation of the specific causes of deprivation.

We narrow our study to the economies of England and Wales. We exclude the two other regions of the United Kingdom, Scotland and Northern Ireland, for two reasons. Firstly, the same level of data is not publicly and freely available and, secondly, the laws and regulations concerning house transactions in Scotland are fundamentally distinct from those of England and Wales. Furthermore, the number of yearly property transactions in Scotland and Northern Ireland is much smaller, being around 10% and 3%, respectively, of all transactions within the United Kingdom. In some cases, when comparing our Behavioural House Indicator $\hat{W}$ to other measures and indices, we will further limit ourselves to data from just England to ensure we match like with like. Whilst all data related to employment, education and health for England is within the ONS remit, some of these elements related to the remaining regions of the United Kingdom (Scotland, Northern Ireland and Wales) are the responsibility of their respective devolved governments, including the all important ten-year census. For consistency, therefore, some of our analysis is for England only.

The final aspect of our analysis is the definition of the geographical areas used. We use data from the postal service to link properties listed in HM Land Registry transactions to postcodes. Postcodes identify small regions, typically containing around 15 and usually fewer than 100 neighbouring properties. The ONS often provides information on three larger scales. We discuss these definitions of geographical areas as needed in the ‘Geographical data’ section, with additional information given in Section 6 of the Supplementary Material.

It is important to note that some measures are tied to one predefined geographical structure. For example, the Index for Multiple Deprivation is only given in terms of one of these definitions, the smallest scale from the ONS known as the Lower layer Super Output Area. In this respect, our Behavioural House Indicator $\hat{W}$ has a clear advantage in that it is available at a fine grained level and is easily computed for any geographical structure required.

Methods

Data sources and collection

We make use of three distinct groups of data sets based on their usage. The first group is the property transaction data, sourced from the HM Land Registry, which is used to compute our fast, real-time, Behavioural House Indicator $\hat{W}$ under the methodology described later. The second group consists of the geographical data sourced from the ONS and the Ordnance Survey, which is used to map the property postcodes to the distinct area aggregations (such as the LSOA11, LSOA21 and MSOA21 discussed in the ‘Geographical data’ section), to pinpoint geographical locations, to compute the density and to determine the neighbours of the relevant area aggregations. The third group relates to social and economic data sourced from the Census 2021 or index data published by government agencies or academic research. This last set of data does not form part of any input to our methodology, which is totally independent from these sources. Instead, these data sources are solely used for results comparison and benchmarking and they are not transformed or modified in any way beyond simple arithmetic operations to support aggregation and the computing of basic statistical measures.

Property related data

The core data for our analysis is the Price Paid database sourced from the ‘HM Land Registry for England and Wales’. The data set records basic data (i.e. the transaction date and price paid) as well as the address for every single registered residential property sale in England and Wales from January 1995 to June 2023.

We make use of the transaction date and the postcodes as raw data fields. We compute and assign a unique ID h for the property by matching the address fields (flat and house numbers, street and postcode) within the data set.
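As a minimal sketch of the matching step described above, the following assigns one integer ID per distinct address. The field names (`flat`, `house_number`, `street`, `postcode`), the normalisation rule and the example addresses are assumptions made for illustration; they are not the column names of the Price Paid database nor necessarily the matching rule used in this work.

```python
from typing import Dict, Iterable, List, Tuple

def normalise(field: str) -> str:
    """Upper-case a field and collapse whitespace so trivial variants match."""
    return " ".join(field.upper().split())

def assign_property_ids(records: Iterable[Dict[str, str]]) -> List[int]:
    """Return one integer ID per record; identical address keys share an ID."""
    ids: Dict[Tuple[str, str, str, str], int] = {}
    out: List[int] = []
    for rec in records:
        # Hypothetical field names; missing fields are treated as empty strings.
        key = (
            normalise(rec.get("flat", "")),
            normalise(rec.get("house_number", "")),
            normalise(rec.get("street", "")),
            normalise(rec.get("postcode", "")),
        )
        # setdefault stores the next unused ID the first time a key is seen.
        out.append(ids.setdefault(key, len(ids)))
    return out

# Made-up addresses: the first two differ only in case and spacing.
records = [
    {"flat": "", "house_number": "12", "street": "High Street", "postcode": "AB1 2CD"},
    {"flat": "", "house_number": "12", "street": "high  street", "postcode": "ab1 2cd"},
    {"flat": "Flat 2", "house_number": "12", "street": "High Street", "postcode": "AB1 2CD"},
]
property_ids = assign_property_ids(records)
```

In this toy input the first two records receive the same ID and the flat receives a different one. Real address data would likely need more careful cleaning than this simple normalisation.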

Geographical data

Geographical data is sourced from the Office for National Statistics (‘ONS’) via the data.gov.uk website. These consist of the Postcode directory, data on the network of roads, and the shape files of geographical boundaries including countries, local authorities, Output Areas (OAs), Lower Layer Super Output Areas (LSOAs), and Middle Layer Super Output Areas (MSOAs). Some of these boundaries are redefined after a census (carried out every ten years) so we refer to ‘LSOA11’ and ‘OA21’ to show that these are the definitions for regions defined for the census 2011 and 2021, respectively. In addition, auxiliary mapping files (equivalence between LSOA11 to LSOA21, for instance) are also sourced. The LSOAs are the most granular level of analysis carried out by policy makers, and are the basis for our research.

Economic and social indices

We make use of two main data sources in relation to the economic and social indices.

The first data set we use is related to the English Index of Multiple Deprivation $M$ for 2019 (Noble et al., 2019a; Noble et al., 2019b; ONS, 2019) published by the Ministry of Housing Communities & Local Government and again sourced from the data.gov.uk website. The index has an embedded hierarchical aggregation with four layers, which, from top to bottom, are referred to as: the overall index, (seven) ‘domains’, ‘subdomains’, and ‘indicators’. To avoid misunderstanding, we note that the use of the word indicator within the framework of the Index of Multiple Deprivation $M$ is substantially different from our conceptual definition described in the introduction. Here, the word is simply applied to define an underlying data source that is used to calculate a subdomain.

The second data source is that of the 2021 Census (The United Kingdom Census, 2021), including population, households by deprivation dimensions, ethnic groups and others. This data comes from the ONS census and labour market statistics website nomisweb.co.uk. For more general references, analysis and consistency purposes, we also make use of the Access to Healthy Assets and Hazards for Great Britain (‘AHAH’) (Green et al., 2018), which is part of the data available through the Consumer Data Research Centre. The multidimensional index summarises health-related features of neighbourhoods, such as the retail environment, health services, the physical environment and the air quality, that have a high level of overlap with both $M$ and the Census households by deprivation dimensions. This data contains four ‘dimensions’ of deprivation that are used for computing the level of deprivation.

As detailed in the footnote, $M$ was published based on the LSOAs existing at the time, that is, LSOA11, and there were a very limited number of changes between 2011 and 2021. As the Census 2021 data is based on the latter date, we remapped $M$ to LSOA21: where a region was split, both new regions keep the value of the original region, and where regions were merged, the new region takes the simple average of the values of the previous regions.
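The remapping rule can be sketched as follows. The area codes, scores and the `(old_code, new_code)` lookup format are hypothetical stand-ins for the auxiliary LSOA11 to LSOA21 mapping file, so this is an illustration of the rule rather than the exact procedure used.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple

def remap_scores(scores_2011: Dict[str, float],
                 lookup: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Carry scores from 2011 areas to 2021 areas via (old_code, new_code) pairs."""
    parents = defaultdict(list)
    for old_code, new_code in lookup:
        parents[new_code].append(scores_2011[old_code])
    # One parent (unchanged or split area): the value is inherited as is.
    # Several parents (merger): the new area takes their simple average.
    return {new_code: mean(values) for new_code, values in parents.items()}

# Hypothetical codes: A is unchanged, B is split into B1 and B2, C and D merge.
scores_2011 = {"A": 10.0, "B": 20.0, "C": 30.0, "D": 50.0}
lookup = [("A", "A"), ("B", "B1"), ("B", "B2"), ("C", "CD"), ("D", "CD")]
scores_2021 = remap_scores(scores_2011, lookup)
```

Here both halves of the split area keep the score 20.0 and the merged area receives 40.0, the unweighted mean of 30.0 and 50.0.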

Data analysis and construction of the behavioural house indicator $\hat{W}$

Each property transaction $τ \in T$ is represented by a pair (h, t) where $h \in H$ is a unique property and t is the date of transaction τ (e.g. our first date is 1st January 1995). The set of all property transactions $T$ contains around 27.2 million transactions involving around 14.8 million distinct properties in $H$ over the period ranging from 1st January 1995 to 30th June 2023.

We are interested in the day of the week so we define a map from a date t to a day $d(t) \in W = \{\text{Monday}, \dots, \text{Sunday}\}$ .

Frequency of transactions and transaction pairing probabilities

The frequency of transactions ϕ(d) for a given day of the week $d \in W$ can be written as

$$ \phi(d) = \frac{1}{|T|} \sum_{(h, t) \in T} \delta(d, d(t)) \tag{1} $$

where δ(x, y) = 1 if x = y and is zero otherwise.
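Equation (1) amounts to counting transactions per weekday and dividing by the total. A minimal sketch is given below, assuming each transaction is a `(property_id, date)` tuple; the four transactions are invented toy data, not Land Registry records.

```python
from collections import Counter
from datetime import date
from typing import Dict, List, Tuple

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def weekday_frequency(transactions: List[Tuple[int, date]]) -> Dict[str, float]:
    """phi(d): share of all transactions whose date falls on weekday d."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    counts = Counter(WEEKDAYS[t.weekday()] for _, t in transactions)
    total = len(transactions)  # assumes a non-empty list
    return {d: counts[d] / total for d in WEEKDAYS}

# Toy (property_id, date) pairs, purely illustrative.
transactions = [
    (0, date(1995, 1, 6)),    # a Friday
    (0, date(2003, 5, 16)),   # a Friday
    (1, date(2010, 3, 1)),    # a Monday
    (2, date(2021, 6, 30)),   # a Wednesday
]
phi = weekday_frequency(transactions)
```

With this toy input half of the transactions fall on a Friday, so `phi["Friday"]` is 0.5, and the seven values sum to one.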

Weekday pairing probability

We now focus on days of the week, $W = \{\text{Monday}, \dots, \text{Sunday}\}$ . We are particularly interested in the potential conditional and causal relationships in the dates of transactions of the same property.

A simple way to study this is to look at consecutive transactions of one property. To do this we define a sales history s(h) for a given property h, which is the sequence $s(h) = [t_{1}, t_{2}, \dots, t_{n}]$ where $t_{i} < t_{i+1}$ and $\{(h, t_{i}) \mid t_{i} \in s(h)\}$ is the subset of all transactions involving the property h.

For every single transaction $(h, t_{i})$ from the sales history s(h) of each property $h \in H$ , we find the next transaction for that property, which will be the transaction $(h, t_{i+1})$. We can then count the number of consecutive transactions L(d₁, d₂) for the same property where the first transaction happens on day of the week $d_{1} = d(t_{i})$ while the subsequent transaction is on day of the week $d_{2} = d(t_{i+1})$, so

$$ L(d_{1}, d_{2}) = \sum_{(h, t_{i}) \in T} \delta(d_{1}, d(t_{i})) \, \delta(d_{2}, d(t_{i+1})) . \tag{2} $$

This gives us the joint probability P(d₁, d₂) of a pair of consecutive transactions as

$$ P(d_{1}, d_{2}) = \frac{1}{Z} L(d_{1}, d_{2}), \quad d_{1}, d_{2} \in W \tag{3} $$

$$ Z = \sum_{d_{1} \in W} \sum_{d_{2} \in W} L(d_{1}, d_{2}) . \tag{4} $$

Note that if a transaction $τ = (h, t_{i})$ has no subsequent transaction then we define $δ(d_{2}, d(t_{i+1})) = 0$. That is, the set of transactions which give non-zero contributions is $T^{*}$ , which only contains transactions of properties that are sold at least twice, so $T^{*} \subset T$ .

The marginal probability distributions P⁽¹⁾ and P⁽²⁾ give the probabilities that, for a consecutive pair whose first transaction is chosen uniformly at random from the reduced set of transactions $T^{*}$ (those which involve properties with at least two transactions in the full data $T$ ), the first transaction falls on day d₁ and the subsequent transaction falls on day d₂, respectively:

$$ P^{(1)}(d_{1}) = \sum_{d_{2} \in W} \frac{L(d_{1}, d_{2})}{Z} = \sum_{d_{2} \in W} P(d_{1}, d_{2}), \tag{5} $$

$$ P^{(2)}(d_{2}) = \sum_{d_{1} \in W} \frac{L(d_{1}, d_{2})}{Z} = \sum_{d_{1} \in W} P(d_{1}, d_{2}) . \tag{6} $$
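The pairing steps of equations (2) to (6) can be sketched as below. The transaction tuples are invented toy data, weekdays are indexed 0 (Monday) to 6 (Sunday), and the sketch assumes at least one property has been sold twice, so that Z is non-zero.

```python
from collections import defaultdict
from datetime import date

def pair_counts(transactions):
    """Count L[d1][d2] over consecutive sales of the same property (0 = Monday)."""
    histories = defaultdict(list)
    for h, t in transactions:
        histories[h].append(t)
    L = [[0] * 7 for _ in range(7)]
    for dates in histories.values():
        dates.sort()  # sales history s(h) in chronological order
        # Properties sold only once yield no pairs and contribute nothing.
        for prev, nxt in zip(dates, dates[1:]):
            L[prev.weekday()][nxt.weekday()] += 1
    return L

def joint_and_marginals(L):
    """Return P(d1, d2), P1(d1) and P2(d2) from the pair counts."""
    Z = sum(sum(row) for row in L)  # assumption: at least one pair exists
    P = [[count / Z for count in row] for row in L]
    P1 = [sum(row) for row in P]        # sum over d2, equation (5)
    P2 = [sum(col) for col in zip(*P)]  # sum over d1, equation (6)
    return P, P1, P2

# Toy data: property 0 is sold three times, property 1 once, property 2 twice.
transactions = [
    (0, date(1995, 1, 6)), (0, date(2003, 5, 16)), (0, date(2010, 3, 1)),
    (1, date(2021, 6, 30)),
    (2, date(2010, 3, 1)), (2, date(2003, 5, 16)),
]
L = pair_counts(transactions)
P, P1, P2 = joint_and_marginals(L)
```

In this toy input there are three pairs: one Friday-Friday and two Friday-Monday, so every first transaction falls on a Friday and two thirds of the subsequent transactions fall on a Monday.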

Mutual information

The pointwise contribution to the mutual information I(d₁, d₂) and the total mutual information W (Shannon, 1948) are therefore:

$$ I(d_{1}, d_{2}) = P(d_{1}, d_{2}) \log_{2} \left( \frac{P(d_{1}, d_{2})}{P^{(1)}(d_{1}) \, P^{(2)}(d_{2})} \right), \tag{7} $$

$$ W = \sum_{d_{1} \in W} \sum_{d_{2} \in W} I(d_{1}, d_{2}) \tag{8} $$
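A short sketch of equations (7) and (8) follows. It takes any joint probability table as nested lists and, as an assumed convention, treats cells with zero probability as contributing zero. The 2×2 tables are toy inputs chosen so the result can be checked by hand; the weekday case would use a 7×7 table.

```python
from math import log2

def mutual_information(P):
    """Total mutual information (in bits) of a joint probability table P."""
    rows = [sum(r) for r in P]        # marginal of the first variable
    cols = [sum(c) for c in zip(*P)]  # marginal of the second variable
    total = 0.0
    for i, row in enumerate(P):
        for j, p in enumerate(row):
            if p > 0:  # convention: a zero-probability cell contributes nothing
                total += p * log2(p / (rows[i] * cols[j]))
    return total

# Independent variables: the joint table equals the product of its marginals.
independent = [[0.25, 0.25], [0.25, 0.25]]
# Perfectly dependent variables: the second outcome always equals the first.
dependent = [[0.5, 0.0], [0.0, 0.5]]

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
```

The independent table gives zero bits and the perfectly dependent one gives one bit, the two extremes for a pair of binary variables.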

Computing neighbourhood and the average mutual information of neighbours

We will also examine our results for properties in specific areas. To do this we start with a partition of the total area into a set of non-overlapping smaller areas $A = \{a_{1}, a_{2}, \dots\}$ . A property can only be in one of these areas, so we denote the set of properties in area $a \in A$ as $H_{a} \subset H$ and the set of transactions of properties in area a as $T_{a}$ . We can then repeat the analysis above for transactions of properties in area a, starting from the count L of (2), which becomes L_a for transactions in area a, where now $L_{a} (d_{1}, d_{2}) = \sum_{(h, t_{i}) \in T_{a}} δ (d_{1}, d (t_{i})) δ (d_{2}, d (t_{i + 1}))$ . Replacing L by L_a leads to area specific versions of the quantities in (3), (5), (6), (7) and (8), denoted as P_a (d₁, d₂), $P_{a}^{(1)} (d_{1})$ , $P_{a}^{(2)} (d_{2})$ , I_a (d₁, d₂) and W_a, respectively (see Supplementary Material for detailed forms).

In many situations, the areas we use are very small leading to large fluctuations. If we assume many neighbouring areas have similar properties, it makes sense to smooth our measures over slightly larger regions. We adopt a simple local aggregation procedure where we average over the values from an area and its neighbours (see Sections 7 and 8 of the Supplementary Material for a detailed example).

We define two areas a and b to be neighbours if any road (links or nodes) recorded within the Ordnance Survey Open Road data set crosses or touches the boundaries of both areas. We emphasise here that sharing boundaries is not a sufficient condition for a and b to be neighbours. Instead, at least one road must cross (or at the extreme, must touch) the boundary between these two areas. We encode this information in an adjacency matrix E_ab which is 1 if areas a and b are neighbours, and zero otherwise (including when a = b).

This allows us to compute a separate mutual information of an area a in terms of an average of its neighbours as

$$ \bar{W}_{a} = \frac{1}{k_{a}} \sum_{b \in A} W_{b} E_{a b} \quad \text{where} \quad k_{a} = \sum_{b \in A} E_{a b} . \tag{9} $$
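Equation (9) is a row-wise average over the adjacency matrix, which can be sketched as follows. The values and the adjacency matrix are invented. Returning `None` for an area with no neighbours is an assumption of this sketch, since equation (9) is undefined when k_a = 0.

```python
from typing import List, Optional

def neighbour_average(W: List[float], E: List[List[int]]) -> List[Optional[float]]:
    """Average W over the neighbours of each area, given adjacency matrix E."""
    averages: List[Optional[float]] = []
    for row in E:
        k = sum(row)  # number of neighbours k_a
        if k == 0:
            # Assumption: an isolated area has no neighbour average.
            averages.append(None)
        else:
            averages.append(sum(w * e for w, e in zip(W, row)) / k)
    return averages

# Toy example: area 0 neighbours areas 1 and 2; area 3 is isolated.
W = [0.10, 0.20, 0.40, 0.30]
E = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
W_bar = neighbour_average(W, E)
```

Area 0 receives the mean of 0.20 and 0.40, while areas 1 and 2 each inherit the value of their only neighbour, area 0. The diagonal of E is zero, so an area's own value never enters its neighbour average.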

Generating the behavioural house indicator $\hat{W}$

In a scatter plot of the mutual information W_a against the average neighbour mutual information ${\bar{W}}_{a}$ for each area a ∈ A, we observe, as expected, an approximately linear relationship $\bar{W} = α W + β$ where α and β are independent of the area and can be found from a best fit procedure.

Once these parameters have been obtained, we can now use them to produce an aggregation of the mutual information of an area and its neighbours to produce our Behavioural House Indicator ${\hat{W}}_{a}$ for each area a. We do this by finding the point $({\hat{W}}_{a}, {\hat{Y}}_{a})$ on the best fit line $\bar{W} = α W + β$ which is the closest (orthogonal projection) to the point $(W_{a}, {\bar{W}}_{a})$ through the equations

$$ \check{\beta} = \frac{W_{a}}{\alpha} + \bar{W}_{a} \tag{10} $$

$$ \hat{W}_{a} = \frac{\check{\beta} - \beta}{\alpha + \frac{1}{\alpha}} . \tag{11} $$
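The projection in equations (10) and (11) can be sketched as below. The text only refers to "a best fit procedure", so the ordinary least squares fit used here is an assumption, and the numbers are toy values. The sketch also assumes α is non-zero.

```python
from typing import List, Tuple

def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares fit y = alpha * x + beta (assumed fit procedure)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    return alpha, mean_y - alpha * mean_x

def project(w: float, w_bar: float, alpha: float, beta: float) -> float:
    """x-coordinate of the orthogonal projection of (w, w_bar) onto the line."""
    # Intercept of the perpendicular line through (w, w_bar), equation (10).
    beta_check = w / alpha + w_bar
    # Intersection of the two lines, equation (11); requires alpha != 0.
    return (beta_check - beta) / (alpha + 1 / alpha)

# Toy check 1: the point (0, 2) projects onto the line y = x at (1, 1).
w_hat_diag = project(0.0, 2.0, alpha=1.0, beta=0.0)
# Toy check 2: a point already on the line y = 2x + 1 is left unchanged.
alpha, beta = fit_line([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
w_hat_on_line = project(1.0, 3.0, alpha, beta)
```

The second check shows a useful property: an area lying exactly on the fitted line keeps its own mutual information as its indicator value.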

Ranking areas and bins

With so many areas to study, it is often helpful when discussing and visualising trends to study groups of areas with similar values of some attribute. For instance, for some real valued rating Θ (some index or measure) we often split the areas into deciles, ten equal-size subsets $A (r, Θ)$ where r = 1, 2, …, 10. Any area a in $A (r, Θ)$ will be in the r-th decile when ranked by rating value Θ, with r = 1 representing the 10% of areas with the lowest values of Θ.
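One possible way to assign decile labels is sketched below. Breaking ties by input order, and the handling of counts that are not divisible by ten, are assumptions of this sketch; the text does not specify either.

```python
from typing import List

def decile_ranks(values: List[float]) -> List[int]:
    """Return the decile r in 1..10 of each value; r = 1 holds the lowest values."""
    n = len(values)
    # Indices sorted by value; the sort is stable, so ties keep input order.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for position, i in enumerate(order):
        # Integer arithmetic maps sorted positions 0..n-1 onto deciles 1..10.
        ranks[i] = position * 10 // n + 1
    return ranks

# Toy ratings for 20 areas, listed from highest (20.0) to lowest (1.0).
values = [float(v) for v in range(20, 0, -1)]
ranks = decile_ranks(values)
```

With 20 areas each decile holds exactly two of them. The first area listed has the highest rating and so falls in decile 10.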

Results

Data analysis: The index foundations

Our Behavioural House Indicator $\hat{W}$ is built from the temporal patterns of consecutive house sales and purchases so a key feature is that it does not involve monetary values or measures. Our method uses mutual information between prior and subsequent sales to capture the deviations in the expected patterns.

The relevance of the day of the week and associated measures is illustrated by the four panels in Figure 1, where the unique characteristics associated with Friday related house purchases can be identified. The first panel 1A is a basic frequency bar chart that shows that almost half of all transactions (49.8%) have occurred on a Friday over the last 28 years. A richer insight, however, is obtained when combining the pairings of days of the week between prior and subsequent dates of transactions (or purchase and sale of the property by the homeowner). Unsurprisingly, panel 1B shows the prevalence of the Friday-Friday pairing (26.6%), but most importantly, from the perspective of a single homeowner, the date of purchase (i.e. prior transaction) conditionally affects the date of sale (i.e. subsequent transaction).

Figure 1.

Frequency of transactions by day of the week, probabilities and contribution to the point-wise mutual information for days of the week pairing combinations and geographical interconnections. Panel (A) shows the frequency ϕ of single transactions by day of the week (d), with Friday accounting for 49.8% of all transactions. Panels (B) and (C) relate to the combinations arising from the pairing of consecutive transactions for the same property by their respective day of the week (d₁, d₂), where the vertical axes correspond to the prior transaction and the horizontal axes equate to the subsequent transaction. Panel (B) is the heat-map for the probability of pairings P(d₁, d₂), where the combination Friday-Friday accounts for 26.6% of all transactions. Panel (C) shows the contribution to the point-wise mutual information I(d₁, d₂) for each pairing. Panel (D) provides a geographical perspective of the interrelationship between the selected quantities based on the computation of mean coordinates (latitude and longitude). The circles represent the average coordinates of all pairings (d₁, d₂) for three subsets of mutual information, that is, both on Fridays {Fri}, Friday and another day in either order {Fri, Oth}, and both on other days {Oth}. The lines and numbers represent the best fit and corresponding mean geographical coordinates of the deciles for each of the quantities associated with the two selected deprivation domains, namely the ‘Health Deprivation and Disability’ and ‘Barriers to Housing and Services’ scores. All data relates to the range [1-Jan-1995, 30-Jun-2023].

The contribution to the point-wise mutual information I(d₁, d₂) is shown in Figure 1(C) and it indicates that the Friday-Friday combination occurs much more often than would be expected from random pairs of transactions (I(Fri, Fri) > 0). As a result, the combination of a transaction on Friday with a previous or subsequent transaction on another day occurs less often than we would expect for random independent house purchases, I(Fri, Oth) < 0 and I(Oth, Fri) < 0.

Panel 1D is an illustration of the important association and inter-dependencies concerning the geographical location of the transactions. The mean geographical coordinates (based on latitude and longitude of the postcodes) for transactions in one of three categories are shown as coloured circles. The three types of transaction pairs shown are: both Fridays (Fri), Fridays and another day in either order ({Fri, Oth}), and both on days other than Friday (Oth).

On the same panel 1D, we show the property transactions split into deciles by the scores of a selected deprivation domain. We do this by mapping postcodes to LSOA11 regions and then to the corresponding deciles (determined in accordance with the process described above). We show the results for two deprivation domains: the ‘Health Deprivation and Disability’ in blue and the ‘Barriers to Housing and Services’ in brown. In each case the data for a decile is shown by a number in the relevant colour with ‘1’ representing the most deprived and ‘10’ the least deprived by the associated domain. As shown by the best fit lines, the dotted lines in the appropriate colour, the points for these subsets of the point-wise mutual information all lie close to these linear fits and there is a clear geographical correlation with the measures shown. This illustrates the so called ‘North-South divide’ within England, with large conurbations in the Midlands and North West often linked to post-industrial activity while the South East of England is often seen as dominated by a service-driven economy.

Relationship between the behavioural house indicator $\hat{W}$ and other deprivation indices

In this section, we highlight the relationship between our Behavioural House Indicator $\hat{W}$ and two widely used deprivation related indexes and data sets, namely the English Index of Multiple Deprivation $M$ and its seven domains, and the Census 2021 ‘Households by deprivation dimensions’ data $C$ .

The English Index of Multiple Deprivation $M$

The results in Figure 2 show that there is a significant and meaningful relationship between $\hat{W}$ and the English Index of Multiple Deprivation $M$ for the year 2019 (the last available date), both at the overall level and at the level of its seven domains. All panels within Figure 2 imply high correlation levels and good linear fitting for the data (which is aggregated into deciles and LSOA11 regions in accordance with the method described in the ‘Ranking areas and bins’ section). The linear fitting is broken, however, at the early deciles for the domains ‘Barriers to Housing and Services’ and ‘Living Environment’. On investigating the subdomains and indicators related to each, we observe that the main cause of the breakdown for the former is the set of indicators within the ‘Wider Barriers to Housing and Services’ subdomain, whereas the latter is impacted by the ‘Indoors’ subdomain.

Figure 2.

Relationship between the Behavioural House Indicator $\hat{W}$ and the English Index of Multiple Deprivation $M$ . Each data-point corresponds to all LSOA11s within the first to the tenth deciles as a function of the ranking of $\hat{W}$ . The x-axes are the same for all panels, corresponding to the mean scores of $\hat{W}$ for each data-point. In a similar manner, the y-axes equate to mean scores for the overall $M$ (panel A) and its seven domains (panels B to H) as labelled. The dotted grey line represents the corresponding best fit between the associated quantities.

In order to substantiate the robustness of the relationship, in particular in relation to correlation and linear fitting, we also tested different combinations of bin sizes (not just the ten equal sized bins mentioned in the ‘Ranking areas and bins’ section). We found that correlation levels are consistently high (unsurprisingly better when bin sizes and minimum data points increase) and the slope of the linear fitting remains largely stable regardless of the way we aggregate the data in plots of the type shown in Figure 2.

In addition, it is important to emphasise that these results also indicate a high correlation level among the seven domains in $M$ , which raises the question of whether the overall methodology for computing $M$ may be simplified by relying on a much smaller set of indicators.

The census 2021 ‘households by deprivation’ dimensions $C$

As can be observed in Figure 3, and similarly to the English Index of Multiple Deprivation $M$ , our results show that the Census 2021 data $C$ on households by deprivation dimensions has a strong relationship with $\hat{W}$ , both in terms of correlation levels and linear fitting. Importantly, these relationships are maintained regardless of the number of dimensions (the data contains four ‘dimensions’ of deprivation) that are used for computing the level of deprivation, or the size of the region of aggregation (i.e. LSOA21 as well as MSOA21).

Figure 3.

Relationship between the Behavioural House Indicator $\hat{W}$ and the Census 2021 households by deprivation dimensions. Each data-point corresponds to all output areas (expressed either as LSOA21 or MSOA21) within the first to the tenth deciles as a function of the ranking of $\hat{W}$ . The x-axes are the same for all panels, corresponding to the mean scores of $\hat{W}$ for each data-point. In a similar manner, the y-axes equate to the mean of the ratio (i.e. the number of households deprived in none (A), one (B), two (C) or three (D) out of four dimensions over the total number of households) for the output areas within the decile. The dotted grey line represents the best fit between the associated quantities, excluding the last point.

It is also important to note that $\hat{W}$ is a relative measure that reduces as regions are aggregated from LSOA21 to MSOA21. Yet, the ability to rank from least to most deprived is still maintained. Also, the relative reduction in value is an expected feature since variances from expectations (which is essentially the nature of mutual information) reduce as the number of properties within each region increases.

Whilst the Census does not explicitly provide an index of deprivation, one can be easily derived from the ratio between the number of households with deprivation and the total number of households. For the purposes of this research, we compute $C$ based on this ratio, using households deprived in any dimension as the numerator.
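The ratio described above can be sketched in a few lines. This is an illustration only: the household counts are invented, and the function name `deprivation_ratio` and the `min_dimensions` parameter are hypothetical conveniences rather than part of the Census data or of the original analysis.

```python
def deprivation_ratio(counts, min_dimensions=1):
    """counts[k] is the number of households deprived in exactly k of the
    four dimensions (k = 0..4). Returns the share of households deprived in
    at least min_dimensions dimensions."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("area has no households")
    deprived = sum(n for k, n in counts.items() if k >= min_dimensions)
    return deprived / total

# Invented counts for two hypothetical areas.
areas = {
    "area_A": {0: 400, 1: 300, 2: 200, 3: 90, 4: 10},
    "area_B": {0: 700, 1: 200, 2: 80, 3: 20, 4: 0},
}

# C uses households deprived in any dimension as the numerator.
c_scores = {name: deprivation_ratio(counts) for name, counts in areas.items()}

# Rank areas from most to least deprived.
ranking = sorted(c_scores, key=c_scores.get, reverse=True)
print(c_scores, ranking)
```

Raising `min_dimensions` gives the stricter variants (deprived in two or more dimensions, and so on) that correspond to using fewer of the Census categories in the numerator.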

Analysis of the most deprived LSOAs

Figure 4(A) shows there is a very good overlap of the most deprived LSOAs as measured by the three deprivation measures $\hat{W}$, $C$ and $M$. This result in isolation is already remarkable given that, to be fully computed, $M$ is an index that relies on a host of detailed quantities (more than thirty variables across seven dimensions that are further subdivided into sub-dimensions), complex transformation methodologies and arbitrary weightings. In contrast, $\hat{W}$ is based on a single and objectively defined measure (i.e. the day of the week of a house purchase) and on a methodology where no weightings or parameters are required. In a similar vein to $M$, $C$ is also based on multiple data items relating to four separate domains (i.e. employment, education, health and housing), which are costly to obtain and are collected only once every ten years. In contrast to $M$, however, the methodology in $C$ is much simpler and there is no reliance on data transformations and weightings (beyond the implied assumption that each dimension carries equal weight).

Figure 4.

Relationship among most deprived deciles by the density categories of LSOAs. Panel A is a Venn diagram representation of the logical relation between the Behavioural House Indicator $\hat{W}$, the Census 2021 Deprivation $C$ and the English Index of Multiple Deprivation $M$ sets. Each circle represents the set of LSOAs within the most deprived decile for the corresponding deprivation measure. The number in each distinct region of the Venn diagram reflects the percentage of the LSOAs (from any of the three measures with a non-zero contribution to that region) which lie in that intersection. For instance, the dark green region represents the 9% of LSOAs in the lowest decile $\hat{W}$ set that are also in the lowest decile IMD $M$ set (the percentage is the same for both measures, as all deciles have the same number of LSOAs). By implication, none of the LSOAs in the dark green area are in the lowest decile of the Census $C$ measure. Panel B is a semi-log plot where LSOAs are ranked from highest to lowest density, with the x-axis corresponding to the natural logarithm of the density and the y-axis to the corresponding ranking. The thin black line corresponds to the actual data, whereas each of the coloured thick lines represents a section with its own fit function and parameters, chosen so that the best fit for the curve as a whole is optimised. The grey dotted line is the continuation of the green line fitting at the stage where divergences increase (i.e. the fat tail) (Cohen et al., 2020). The table within panel B provides a label for each optimised section, together with the best fitting function and the corresponding number of LSOAs within the section. Panel C is a cumulative histogram showing the ratio between the number of LSOAs within the most deprived decile and the total number of LSOAs within each density category. The colours correspond to the subsets in the Venn diagram of panel A.
Within each density category the histograms are ordered with $\hat{W}$ on the left, $C$ in the centre, and $M$ on the right. Panel D corresponds to separate heat-maps (blue-purple-red, lowest to highest) for three distinct statistical measures (for the rankings of the LSOAs within each density category), from top to bottom: the mean ranking (μ rank %) expressed as a percentage of the total number of LSOAs, the coefficient of variation (CoV) and the fat tail level (ftl), computed as the ratio between the 97.5 and 2.5 centiles of the ordered ranking values.

One can assert that the reason for the overlapping results to be so robust is the fact that there is a high level of correlation between the economic, social and environmental variables used by $M$ and $C$ , and that these variables are also highly correlated to the behaviour expressed by the process of house purchases.
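The overlap percentages of the kind shown in the Venn diagram can be reproduced from three sets of area codes. The sketch below is illustrative only: the scores are synthetic, the helper names `most_deprived_decile` and `venn_percentages` are hypothetical, and it assumes that a higher score means more deprived.

```python
def most_deprived_decile(scores):
    """Return the set of area codes in the top 10% of scores."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = max(1, len(ranked) // 10)
    return set(ranked[:k])

def venn_percentages(w_set, c_set, m_set):
    """Size of each Venn region as a percentage of one decile.
    All three deciles are assumed to contain the same number of areas."""
    size = len(w_set)
    regions = {
        "W only": w_set - c_set - m_set,
        "C only": c_set - w_set - m_set,
        "M only": m_set - w_set - c_set,
        "W&C only": (w_set & c_set) - m_set,
        "W&M only": (w_set & m_set) - c_set,
        "C&M only": (c_set & m_set) - w_set,
        "W&C&M": w_set & c_set & m_set,
    }
    return {name: 100.0 * len(r) / size for name, r in regions.items()}

# Synthetic scores for 100 hypothetical areas, shifted so the deciles overlap.
codes = [f"E{i:03d}" for i in range(100)]
w_scores = {c: i for i, c in enumerate(codes)}
c_scores = {c: (i + 3) % 100 for i, c in enumerate(codes)}
m_scores = {c: (i + 6) % 100 for i, c in enumerate(codes)}

pct = venn_percentages(
    most_deprived_decile(w_scores),
    most_deprived_decile(c_scores),
    most_deprived_decile(m_scores),
)
print(pct)
```

Because every decile has the same size, the regions touching any one circle sum to 100%, which is why a single percentage can be quoted for a region shared by two measures.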

Furthermore, additional insight and understanding can be obtained from a detailed analysis of the differences among the $\hat{W}$, $C$ and $M$ most deprived subsets. Firstly, we adopted a data driven approach to separate LSOAs into distinct categories (Gartner et al., 2011) based on the ranking distribution of their densities, as shown in Figure 4(B). We identified distinct regions within the curve of density rank against log density by optimising linear and exponential fittings through a very large number of sector combinations. We obtained five classes of density that we describe, from lowest to highest, as ‘remote’, ‘sparse’, ‘moderate’, ‘dense’ and ‘crowded’. Clearly, one can find a significant overlap between our method and the more traditional Census categorisation of built-up areas as well as rural and urban areas. However, we opted to use this bottom-up data driven approach in order to highlight the specific features of the extreme ‘remote’ and ‘crowded’ LSOAs without resorting to additional data.
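The idea of searching over sector combinations can be illustrated with a simplified sketch. It is not the procedure used in this paper: it fits straight lines only (no exponential sections), uses a synthetic curve, and the function names and the `min_size` parameter are hypothetical. It finds the breakpoints that minimise the total squared error of a piecewise linear fit by dynamic programming.

```python
import numpy as np

def segment_sse(x, y, i, j):
    """Squared error of a least-squares line through points i..j-1."""
    coeffs = np.polyfit(x[i:j], y[i:j], 1)
    resid = y[i:j] - np.polyval(coeffs, x[i:j])
    return float(resid @ resid)

def best_breakpoints(x, y, n_segments, min_size=5):
    """Indices at which to cut the curve into n_segments linear sections."""
    n = len(x)
    # Error of every admissible single section.
    cost = {(i, j): segment_sse(x, y, i, j)
            for i in range(n) for j in range(i + min_size, n + 1)}
    inf = float("inf")
    # best[k][j]: lowest total error covering the first j points with k sections.
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    prev = [[-1] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k * min_size, n + 1):
            for i in range((k - 1) * min_size, j - min_size + 1):
                if best[k - 1][i] == inf:
                    continue
                total = best[k - 1][i] + cost[(i, j)]
                if total < best[k][j]:
                    best[k][j] = total
                    prev[k][j] = i
    # Walk back through the stored choices to recover the cut positions.
    cuts, j = [], n
    for k in range(n_segments, 0, -1):
        j = prev[k][j]
        cuts.append(j)
    return sorted(cuts)[1:]  # drop the leading 0

# Synthetic 'rank against log density' curve with three linear sections.
log_density = np.linspace(0.0, 6.0, 60)
rank = np.concatenate([
    100.0 - 1.0 * log_density[:20],
    120.0 - 5.0 * log_density[20:40],
    60.0 - 2.0 * log_density[40:],
])
breaks = best_breakpoints(log_density, rank, n_segments=3)
print(breaks)
```

The search is quadratic in the number of points, so for a full set of LSOAs one would presumably work on a thinned or pre-aggregated curve; that choice is an assumption of this sketch rather than a description of the original method.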

Through this categorisation, we can make the following observations. Firstly, $M$ is significantly biased towards moderate and dense LSOAs, as observed in Figure 4(C). This effectively means that highly deprived rural LSOAs (Gartner et al., 2011; Kuttler and Moraglio, 2021) are almost non-existent in $M$. In addition, significant pockets of inner London that are traditionally viewed as highly deprived tend to be excluded (Cloke et al., 1995; Lloyd et al., 2023a). This is one of the primary reasons why recent research found that $M$ under-selects LSOAs with high proportions of ethnic minorities (Lloyd et al., 2023a), as a significant number of these minorities live in the highly dense, crowded regions. Moreover, Figure 4(D) shows that the average rank of $M$ (μ rank %) varies significantly as a function of geographical density, and that at both ends of the density spectrum $M$ tends to have a low coefficient of variation (CoV) and heavy tails (the fat tail level is the ratio between the 97.5 and 2.5 centiles of the values). In contrast, $C$ does select a significant number of crowded LSOAs, but it fundamentally tends to disregard extreme rural deprivation. Indeed, the mean ranking of $C$ in Figure 4(D) (μ rank %) behaves monotonically, which is a clear indication of dependency on density.

Lastly, $\hat{W}$ tends to be more balanced across the density spectrum, even though it shows some moderate level of bias towards the most crowded LSOAs.
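The three statistics reported in Figure 4(D) can be computed per density category as in the sketch below. The ranks are synthetic, the function name `category_stats` is hypothetical, and the use of the population standard deviation and of NumPy's default percentile interpolation are assumptions, since the exact conventions are not stated here.

```python
import numpy as np

def category_stats(ranks, total_areas):
    """Summary statistics for the deprivation ranks of one density category.

    Returns (mean rank as % of all areas, coefficient of variation,
    fat tail level = 97.5th centile / 2.5th centile)."""
    ranks = np.asarray(ranks, dtype=float)
    mean_rank_pct = 100.0 * ranks.mean() / total_areas
    cov = ranks.std() / ranks.mean()          # population standard deviation
    p_low, p_high = np.percentile(ranks, [2.5, 97.5])
    ftl = p_high / p_low
    return mean_rank_pct, cov, ftl

# Synthetic example: a category whose ranks are spread evenly over all areas.
total = 1000
even_ranks = np.arange(1, total + 1)
mean_pct, cov, ftl = category_stats(even_ranks, total)
print(f"mean rank %: {mean_pct:.2f}, CoV: {cov:.3f}, ftl: {ftl:.2f}")
```

A category whose mean rank sits near 50% with a CoV close to that of evenly spread ranks is one in which the measure shows little dependence on density; strong departures in either direction point to the kind of bias discussed above.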

Analysis of the remote and sparse areas

The remote and sparse areas account for 16% of the total number of LSOAs but 87% of the total area of England. Given these numbers, it is highly surprising that both $C$ and $M$ rarely register a highly deprived area within these categories. Indeed, we would argue that this analysis raises significant questions as to whether these indexes are fit for the purpose of capturing rural deprivation, as it is not unreasonable to speculate that the nature of rural deprivation is very distinct from that of urban deprivation (Gartner et al., 2011; Kuttler and Moraglio, 2021). As observed in the map in Figure 5, $\hat{W}$ captures highly deprived LSOAs in a widely distributed manner across the whole country, with some pockets in Wiltshire, Herefordshire and Shropshire. In contrast, $C$ and $M$ are limited to very few coastal locations and near-city areas.

Figure 5.

Geographical distribution of $\hat{W}$, $C$ and $M$ across England for the remote and sparse LSOA areas and analysis of selected deprivation indicators affecting these categories. The map at the top shows all remote and sparse areas within England. The light yellow shapes represent LSOAs that are not within the most deprived decile by any of the indexes. Dark blue, dark cyan and dark red represent the most deprived areas for $\hat{W}$, $C$ and $M$, respectively. The histogram at the bottom shows the centile position of the ranking for the median of the Income (domain), Geographical Barriers (subdomain) and Poor Housing (indicator) measures that can be found within the construction of $M$. The darker shades relate to both remote and sparse regions, and the lighter shades contain only the remote ones. Blue, cyan and red colour tones represent $\hat{W}$, $C$ and $M$, respectively.

The existing bias against remote and sparse areas can be explained by the fact that $M$ is primarily driven by the highly correlated measures of income and employment (with a combined weighting of 63%), and $C$ also has a 50% implied weighting to the same type of dimensions. Essentially, this means that areas that are significantly affected by more social factors, such as those with high geographical barriers and extremely poor housing, are not captured as highly deprived, since the social components are diluted by the primary economic and monetary dimensions. In contrast, the social component seems to affect house purchase behaviour, and is therefore indirectly captured by $\hat{W}$. The histograms in Figure 5 illustrate these observations. The very few LSOAs categorised as highly deprived by $M$ are those that have a very high rank of income deprivation, but not higher ranking levels for the Geographical Barriers subdomain and the Poor Housing indicator, even though both of them are part of the construction of the $C$ and the $M$ indexes. In contrast, LSOAs within remote and sparse areas that exhibit extremely high levels of geographical barriers and poor housing are captured and classified as highly deprived by $\hat{W}$ even if the average monetary and economic features are positive.

It is important to emphasise that we are not judging whether one index performs better than another, as this is relative and depends on usage. Here, we are solely emphasising the positive and negative biases of each index, and their impact on different areas of the social and economic structure.

Analysis of the crowded and dense categories

Both our Behavioural House Indicator $\hat{W}$ and the 2021 Census data $C$ show a tendency to capture higher numbers of highly deprived crowded areas, whereas the English Index of Multiple Deprivation $M$ tends to manifest the reverse. This feature is well illustrated by the maps in Figure 6 for the cities of London, Birmingham and Leicester. Whereas $\hat{W}$ and $C$ are well represented, and have high levels of overlap in all these cities, $M$ is only significantly present in Birmingham (where it significantly overlaps with the others).

Figure 6.

Geographical distribution of $\hat{W}$, $C$ and $M$ for the crowded category in selected urban areas and analysis of selected deprivation indicators affecting dense and crowded regions. Each of the three maps at the top shows the distribution of the crowded areas within London (top right), Leicester (top left) and Birmingham (middle left). The colour representations for each index $\hat{W}$, $C$ and $M$ and their combinations are shown in the scheme on the top left. The map at the bottom right is a zoomed-in view of the central London areas south of the river Thames, and it excludes the crowded LSOAs that are not regarded as highly deprived by any measure (light yellow in the other maps). The histogram at the bottom left shows the centile position of the ranking for the median of the Income (domain), Housing Overcrowding (indicator), Housing Affordability (indicator) and Air Quality (indicator) measures that can be found within the construction of $M$. The darker shades relate to both dense and crowded regions, and the lighter shades contain only the crowded ones. Blue, cyan and red colour tones represent $\hat{W}$, $C$ and $M$, respectively.

Similarly to the analysis for remote and sparse regions, the existing bias of $M$ against dense and crowded areas can be explained by the fact that the index is primarily driven by low income and unemployment. As shown by the histogram within Figure 6, areas with significantly high levels of Housing Overcrowding, Housing (Low) Affordability and (Bad) Air Quality are not captured as highly deprived by $M$, as these features are diluted by primary economic and monetary measures that do not take into account the relative perspective of money and quality of employment. In contrast, both $\hat{W}$ and $C$ capture these features very well within crowded regions.

Here, we highlight the fundamental issue that tends to arise from monetary and economic measures such as income and employment, which is best encapsulated by the expression ‘struggling to make ends meet’. The population may be employed and have proportionally higher levels of income, and yet these factors are not enough to provide for basic needs, as affordability (where housing plays a fundamental part) is extremely low. Therefore, we would argue that $M$ does not give enough weight to deprivation indicators in urban areas, where the quality of employment and the relative purchasing power of earnings have more complex facets.

Comparing and contrasting $\hat{W}$ and $C$ also provides interesting additional insights. Firstly, we have already highlighted the high levels of overlap in cities such as Birmingham and Leicester as well as London, where the inner boroughs of Tower Hamlets, Newham, Hackney and Haringey, traditionally associated with high levels of deprivation, are well captured by both methods. Significant divergence between $\hat{W}$ and $C$, however, can be found in the crowded LSOAs on the south bank of the river Thames (primarily the boroughs of Lambeth and Southwark), as observed in the zoomed map at the bottom right of Figure 6. On the south bank, $\hat{W}$ has a much wider LSOA representation, covering areas that are normally associated with high deprivation levels, more specifically areas that were identified as initial flash points during the 2011 London riots.

Discussion

The observations made in our Results section above indicate that our proposed Behavioural House Indicator $\hat{W}$ can reasonably be used as a faster, real-time, simple measure of deprivation in England and Wales. Importantly, it can feasibly be published on a monthly basis (although we would argue that annual or semi-annual publication is sufficient, as changes to deprivation are slow) with a very short time delay, as the computation of $\hat{W}$ is based solely on a single reliable, precise, factual and objective data source that is made available by the UK Government and updated at very frequent intervals. In contrast, the 2021 Census data $C$ and the English Index of Multiple Deprivation $M$ are indexes that are costly to produce and published very sporadically. The Census follows a ten-year cycle, whereas $M$ is dependent on specific government decisions, with the next release only planned for 2025 (the previous ones being 2019 and 2015), as it relies on specific data gathering processes and a number of re-calibrations and readjustments. Within this context, we would argue that $\hat{W}$ is an ideal candidate to be added as a social barometer to the ONS list of faster indicators of economic activity.

Importantly, our results do not advocate the replacement of any of the indexes above. Instead, we are able to show that each index performs well in different aspects of deprivation, a fact extensively researched in other recent papers (Lloyd et al., 2023b; Norman, 2015). We would argue that it is the combination of all these indexes that can provide a much richer picture of both the levels and the nature of deprivation. Indeed, $\hat{W}$ could feasibly be ‘calibrated’ to the outcomes of $M$, and be used as an indicator of the latter between periods of publication.

This is an important assertion given the existing speculation on scrapping the Census in 2031 and relying instead on a network of disparate public sector sources of data. Whereas our research shows that there is much to gain from making use of public sector sources of data as a proxy for the existing economic and social measures, the authors do not agree with any potential initiative to scrap the Census entirely.

That said, our comparative analysis, together with other important recent research, raises the question of whether $M$ should be simplified (as a result of the various levels of correlation), made less subjective (in its weightings and transformations) and re-calibrated to better capture the elements of rural, urban inequality and ethnic minority deprivation.

Our results also add some robust evidence to the findings of recent research on the issues surrounding the capture of deprivation in areas with large ethnic minorities (Lloyd et al., 2023a). Our proposed indicator improves the identification of these areas.

We also point out an important feature of $\hat{W}$ . The indicator can be easily computed and used at different levels of granularity and aggregation, be based on existing categories (MSOA, electoral wards, etc.) or any new form of clustering. Whereas this feature is also present for $C$ (albeit at output area and not as granular as postcode level) it is not the case for $M$ which is rigidly structured at LSOA level. This flexibility may be of great help to research geared towards features emerging from different scaling and geographical levels.

We also believe that the fact that the data is sourced independently from any economic and social data collection exercise is of significant benefit for the comparative analysis of performance and results, and potentially a source of quality control checks and balances. From a practical point of view, the ability to coordinate these potential checks and balances may be even easier now given that, as recently as June 2023, the sponsorship of HM Land Registry and its associated bodies moved from the Department for Business and Trade to the Department for Levelling Up, Housing and Communities (DLUHC). This essentially means that data from HM Land Registry and those of $M$ reside within a single overarching governance.

Lastly, we would emphasise two specific limitations to our work. Firstly, we made use of the word Indicator instead of Index to emphasise that this is a statistical and data driven method that is subject to some level of individual mis-classifications and errors due to either local level specific dynamics or insufficient statistical data (whether $M$ should also be an indicator as opposed to an index is a moot point). Indeed, the neighbourhood method described in the Methods section is used precisely to reduce data deviations. However, it is possible to construct more advanced and detailed methods that may reduce uncertainty even further. Secondly, we believe that it is possible, desirable and useful to enhance the current methodology to calculate $\hat{W}$ by taking into account additional property information (freehold and leasehold for instance) as well as geographically specific information (location of religious sites, or council housing for instance). In this way, our research to date may be regarded as an initial point from which to expand the potential of the embedded non-monetary information that is found within public house transactions and housing data sets as well as other public sector sources of data.

Supplemental Material

Supplemental Material for The behavioural house indicator: A faster and real time small-area deprivation measure for England by Eduardo Viegas and Tim S. Evans

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Eduardo Viegas

Tim S. Evans

Data Availability Statement

All data used within this research is publicly available and can be sourced from the UK Government open data website, , within the terms indicated below.

Supplemental Material

Supplemental material for this article is available online.

Eduardo Viegas is an Associate Professor at the Department of Computer Science within the Tokyo Institute of Technology and a researcher at the Centre for Complexity Science at Imperial College London. His research is centred on the social, financial and economic aspects of complex systems with emphasis on methodologies derived from network theory and information theory. His present focus is on population mobility and the socio-economic structural features of societies in the UK and Japan. Before moving to academia, he worked for over 30 years within the global financial services industry.

Tim S. Evans is a Senior Lecturer at Imperial College London in the Theoretical Physics Group and a member of the Social and Cultural Analytics Lab within the Data Science Institute. He also worked as a researcher at the University of Alberta in Edmonton, Canada. He has a long-standing collaboration in archaeology leading to the development of geographical models for both ancient and modern contexts. His present focus is on the analysis of complex networks in the presence of constraints such as geographical space or the arrow of time.

References

Anand, Jones, Donoghue, et al. (2021) Non-monetary poverty and deprivation: a capability approach. Journal of European Social Policy 31(1): 78–91.

Biggeri, Ferrari (2010) Price indexes in time and space: methods and practice. Contributions to Statistics, 1st edition. Heidelberg: Physica-Verlag HD.

Chateauneuf, Cornet (2022) The risk-neutral non-additive probability with market frictions. Economic Theory Bulletin 10(1): 13–25.

Cloke, Goodwin, Milbourne, et al. (1995) Deprivation, poverty and marginalization in rural lifestyles in England and Wales. Journal of Rural Studies 11(4): 351–365.

Cohen, Davis, Samorodnitsky (2020) Heavy-tailed distributions, correlations, kurtosis and Taylor’s law of fluctuation scaling. Proceedings of the Royal Society A: Mathematical, Physical & Engineering Sciences 476(2244): 20200610.

Davidson (1976) Social deprivation: an analysis of intercensal change. Transactions of the Institute of British Geographers 1(1): 108–117.

Dehmer, Emmert-Streib, Mehler (2011) The central role of information theory in ecology. In: Towards an Information Theory of Complex Networks: Statistical Methods and Applications. 1st edition. Boston: Birkhäuser Boston, 153–167.

Gartner, Farewell, Roach, et al. (2011) Rural/urban mortality differences in England and Wales and the effect of deprivation adjustment. Social Science & Medicine 72(10): 1685–1694.

Green, Daras, Davies, et al. (2018) Developing an openly accessible multi-dimensional small area index of ‘access to healthy assets and hazards’ for Great Britain, 2016. Health & Place 54: 11–19. DOI: 10.1016/j.healthplace.2018.08.019.

Haris (1973) Some aspects of social polarisation. In: London: Urban Patterns, Problems and Policies. London: Heinemann, 156–189.

Huang, Ulanowicz (2014) Ecological network analysis for economic systems: growth and development and implications for sustainable development. PLoS One 9(6): e100923–e100928. DOI: 10.1371/journal.pone.0100923.

Kuttler, Moraglio (2021) Re-thinking mobility poverty: understanding users’ geographies, backgrounds and aptitudes. In: Transport and Society. Abingdon, Oxon: Routledge, Taylor & Francis Group.

Lloyd, Catney, Wright, et al. (2023a) An ethnic group specific deprivation index for measuring neighbourhood inequalities in England and Wales. The Geographical Journal 190: 1–22. DOI: 10.1111/geoj.12563.

Lloyd, Norman, McLennan (2023b) Deprivation in England, 1971–2020. Applied Spatial Analysis and Policy 16(1): 461–484.

Noble, McLennan, et al. (2019a) Research Report, the English Indices of Deprivation 2019. Ministry of Housing, Communities & Local Government. Retrieved 1 Jan 2024. URL: https://www.gov.uk/government/publications/english-indices-of-deprivation-2019-research-report.

Noble, McLennan, et al. (2019b) Technical Report, the English Indices of Deprivation 2019. Ministry of Housing, Communities & Local Government. Retrieved 1 Jan 2024. URL: https://www.gov.uk/government/publications/english-indices-of-deprivation-2019-technical-report.

Norman (2015) The changing geography of deprivation in Britain: exploiting small area census data 1971 to 2011. In: Champion, Falkingham (eds) Population Change in the United Kingdom. London: Rowman & Littlefield.

ONS (2019) Statistical Release, the English Indices of Deprivation 2019. Ministry of Housing, Communities & Local Government. Retrieved 1 Jan 2024. URL: https://assets.publishing.service.gov.uk/media/5d8e26f6ed915d5570c6cc55/IoD2019_Statistical_Release.pdf.

ONS (2024) Statistical bulletin, economic activity and social change in the UK, real-time indicators: 18 January 2024. URL: https://assets.publishing.service.gov.uk/media/5d8e26f6ed915d5570c6cc55/IoD2019_Statistical_Release.pdf.

Parker (1973) Some sociological implications of slum clearance programmes. In: London: Urban Patterns, Problems and Policies. London: Heinemann, 248–273.

Rebonato (2004) Volatility and Correlation: The Perfect Hedger and the Fox. 2nd edition. Chichester: Wiley.

Sanchirico, Fiorentino (2008) Scale-free networks as entropy competition. Physical Review E - Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics 78: 046114.

Shannon (1948) A mathematical theory of communication. Bell System Technical Journal 27: 623–656.

The United Kingdom Census (2021) TS011, Households by Deprivation Dimensions. ONS. Retrieved 1 Jan 2024. URL: https://www.nomisweb.co.uk/sources/census_2021_bulk.

West (2017) Scale: The Universal Laws of Growth, Innovation, Sustainability, and the Pace of Life in Organisms, Cities, Economies, and Companies. New York, NY, USA: Penguin Press.

Whitehouse (2009) Pensions, purchasing-power risk, inflation and indexation. OECD Social, Employment and Migration Working Papers, no. 77. Paris: OECD Publishing.
