What’s Happened to Burglary, and Does Attending Help?
policing
crime
data-science
Published
December 29, 2022
It’s nearly 2023, and in the true spirit of Christmas, I’ve used the cheese/wine-filled nowhere time between the holidays to pick up an analytical side-project that’s been irking me for a while… what’s happened to burglary, and why are we suddenly so excited about mandatory attendance? It’s a thorny question, so I figured I’d document my analysis here… All(ish) of the code is available, so this will hopefully be a useful tutorial for others. As usual, keep in mind this is a blog post, not an academic article - this analysis is fuelled by post-Christmas cheese, and peer reviewed by me after a glass of sherry, so if you want to rely on any of it, replicate it yourself first and make sure it’s right!
This took somewhat longer than I expected (hooray for Christmas breaks), so this will be a series in three parts:
1. Exploring the data and Linear Models (you’re here!)
2. Time Series (when I get the time)
3. Synthetic Controls (one day in the distant future)
So come with me on an analytical adventure through time, as we cast our minds back to the halcyon days of 2015. The pound is worth $1.50 again, Boris Johnson is Mayor of London, and we’re all still fondly thinking back to how great the London Olympics were.
Meanwhile, crime is changing: traditional crimes like burglary and violence continue to fall, leading the then Home Secretary (and soon-to-be Prime Minister) Theresa May to tell the Police Federation to “stop crying wolf” about cuts while demand shrinks - but policing certainly isn’t feeling it, as an increase in “high-harm”, complex offences more than made up for any shortfall.
The shift towards complex offences went, to my mind, largely unnoticed by the general public, but caused a real shift in core policing demand: traditional theft and violent offences had become ever rarer, while reporting of child exploitation, sexual assault and fraud skyrocketed… but the latter group of offences requires a far lengthier investigation than the former! The result? More demand, as best illustrated by this wonderful graph by Matt Ashby.
But has performance really declined that quickly, and is attendance to blame? I’ll use this post to explore open police data from the last decade, and use data from pilot forces to see just how dramatic the effect is: just what’s happened to burglary, and will mandatory attendance help?
What’s Happened to Burglary?
We’ll start by exploring how burglary has changed over time - for the purposes of this analysis, we’re focusing on domestic burglary (i.e., not including commercial properties), and we’ll use both police recorded crime data (i.e., crimes reported to police) and the Crime Survey for England and Wales (a national survey to measure total crime) to compare trends.
On Data Quality
While the UK benefits from crime data that is comparatively clean (I do not envy our American cousins, let alone most of our colleagues in the rest of the world), this is less true for both historical data (i.e., going back more than a few years) and the Crime Survey data, which isn’t readily available to the general public: the data is spread across various archive files, websites and reports, and there really isn’t a “single source of truth” for crime counts over time.
The data for this analysis is a few hacked-together files:
- Crime outcomes and crime counts by quarter, manually concatenated from the Police Open Data Tables
- Crime Survey burglary counts per year, from the most recent ONS crime and justice quarterly
I’m afraid that does mean this analysis isn’t quite as reproducible as I’d like - I had written some code to scrape the Home Office data page and aggregate all the outcomes data, but it’s such a mishmash of ODS, XLSX and other formats that everything fell over and I did it manually. That said, hopefully it’s not too much of a pain to reproduce if you feel the need to.
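For what it’s worth, the abandoned concatenation looked something like this - a minimal sketch, assuming the quarterly tables have already been downloaded into a local folder (the paths here are illustrative):

```python
from pathlib import Path
import pandas as pd

# Illustrative: concatenate every downloaded quarterly outcomes table,
# whatever format the Home Office happened to publish that quarter in.
frames = []
for path in Path("../data/raw/outcomes").glob("*"):
    if path.suffix == ".ods":
        frames.append(pd.read_excel(path, engine="odf"))  # needs the odfpy package
    elif path.suffix in (".xlsx", ".xls"):
        frames.append(pd.read_excel(path))

combined = pd.concat(frames, ignore_index=True)
combined.to_csv("../data/external/police_recorded_crime_combined.csv", index=False)
```

In practice the column names and sheet layouts shift between quarters, which is exactly where this fell over for me.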
A few caveats are worth considering:
The Crime Survey can’t always be easily compared to reported crime: police crime data is legal accounting, while the crime survey is asking the general public what happened. Domestic Burglary should be comparable, but this is all a little experimental.
Reproducibility and data cleaning: good analysis should be reproducible - you should be able to run it the same way, with new data. This isn’t true here because of my ugly cleaning.
Outcomes aren’t instant: crimes take time to solve, so the most recent outcomes data will be revised upwards as investigations conclude. I’ve used outcomes per quarter, which should mitigate this, but treat the most recent quarters with caution.
Code
```python
import pandas as pd
import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import patsy
import statsmodels.api as sm
import plotly.io as pio
from linearmodels import PanelOLS

pio.templates.default = "plotly_dark"

# Police recorded burglaries, summed to one figure per financial year
police_crime_df = pd.read_csv("../data/external/police_recorded_crime_combined.csv")
yearly_police_df = police_crime_df[['Financial Year','Number of Offences']].groupby('Financial Year').sum().reset_index()
yearly_police_df['year'] = range(2002, 2023, 1)
yearly_police_df = yearly_police_df.rename(columns={"Number of Offences": 'police'})

# Crime Survey estimates are published in thousands, so scale up to match
csew_count = pd.read_csv("../data/external/csew_burglary_trends.csv").rename(columns={'Domestic burglary': 'csew'})
csew_count['year'] = range(2002, 2021, 1)
csew_count['csew'] = csew_count['csew'] * 1000

comparison_df = csew_count.merge(yearly_police_df, how='left', on='year').rename(columns={'csew': 'Crime Survey', 'police': 'Police Reported'})
comparison_df[['year', 'Crime Survey', 'Police Reported']].set_index('year')
```
| year | Crime Survey | Police Reported |
|------|--------------|-----------------|
| 2002 | 1423000 | 437583 |
| 2003 | 1357000 | 402345 |
| 2004 | 1310000 | 321507 |
| 2005 | 1059000 | 300517 |
| 2006 | 1025000 | 292260 |
| 2007 | 1006000 | 280696 |
| 2008 | 960000 | 284431 |
| 2009 | 992000 | 268606 |
| 2010 | 917000 | 258165 |
| 2011 | 1033000 | 245312 |
| 2012 | 922000 | 227276 |
| 2013 | 887000 | 211988 |
| 2014 | 781000 | 196554 |
| 2015 | 784000 | 194700 |
| 2016 | 697000 | 206051 |
| 2017 | 650000 | 309867 |
| 2018 | 691000 | 295556 |
| 2019 | 699000 | 268715 |
| 2020 | 582000 | 196214 |
The Crime Survey data is the best measure of long term crime trends, as it isn’t affected by reporting or policing practice. Taken in isolation, the most obvious takeaway here is that burglary is lower than it’s ever been. If you accept the Peelian view that “the test of police efficiency is the absence of crime and disorder”, policing is doing an excellent job. So why is police reported crime comparatively high, and why are perceptions of police performance seemingly so low?
Code
```python
tidy_yearly = comparison_df[['year','Crime Survey','Police Reported']].melt(id_vars='year')

fig = px.line(tidy_yearly, x='year', y='value', color='variable',
              color_discrete_sequence=['red','blue'])
fig.update_layout(
    title_text="Domestic Burglaries, Police Reported and Crime Survey",
    legend=dict(orientation="h", yanchor="bottom", y=-0.2),
)
# Set x-axis title
fig.update_xaxes(title_text="Year")
# Set y-axis title
fig.update_yaxes(title_text="Burglaries")
fig.show()
```
Comparing Crime Survey to Police Recorded crime counts is generally considered bad form: while a police officer might always know the difference between a robbery, a burglary, and an affray to enable a theft, that’s not something most of the public worries about.
Thankfully, that shouldn’t be a problem for domestic burglary - most people know when they’ve been burgled! That means we can produce a ratio of crime survey burglaries to police burglaries, and obtain a rough estimate of what proportion of burglaries are reported to police over time.
Code
```python
comparison_df['Proportion of Burglaries Reported'] = comparison_df['Police Reported'] / comparison_df['Crime Survey'] * 100
tidy_yearly = comparison_df[['year','Crime Survey','Police Reported','Proportion of Burglaries Reported']].melt(id_vars='year')

# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

csew_burglary = tidy_yearly[tidy_yearly['variable'] == 'Crime Survey']
police_burglary = tidy_yearly[tidy_yearly['variable'] == 'Police Reported']
reported_ratio = tidy_yearly[tidy_yearly['variable'] == 'Proportion of Burglaries Reported']

fig.add_trace(
    go.Scatter(x=csew_burglary['year'], y=csew_burglary['value'],
               name="Crime Survey", line=dict(color='red')),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=police_burglary['year'], y=police_burglary['value'],
               name="Police Reported", line=dict(color='blue')),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=reported_ratio['year'], y=reported_ratio['value'],
               name="% Reported", line=dict(color='grey')),
    secondary_y=True,
)

# Add figure title
fig.update_layout(
    title_text="Crime Survey Burglaries per Year and Proportion Reported",
    legend=dict(orientation="h", yanchor="bottom", y=-0.2),
)
# Set x-axis title
fig.update_xaxes(title_text="Year")
# Set y-axes titles
fig.update_yaxes(title_text="Burglaries", secondary_y=False, color='black')
fig.update_yaxes(title_text="Proportion of Burglaries Reported", secondary_y=True, showgrid=False, color='grey')
fig.show()
```
I’d once thought burglary reporting would be consistent - it’s an emotive crime, with a well established insurance industry - and yet we definitely see variation over time.
What does this tell us about burglary in England & Wales over the last decade though? Two things:
Domestic Burglaries are probably rarer than ever (and that’s not down to COVID)
That’s not consistently reflected in police recorded crime - the likelihood of reporting seems to change over time
How Many Burglars are we Catching?
So there aren’t many burglaries… so have we at least gotten better at catching the burglars behind those that remain?
Well… we’re certainly not catching any more than we used to. Whether you count charges (i.e., burglars being sent to court) or “detections” (i.e., including cautions, fines and similar), there are far fewer, though it looks like that trend started long before 2015.
Code
```python
detections = pd.read_csv('../data/interim/burglary_detections.csv', index_col=0)

fig = px.line(detections, x='year', y='detections')
fig.update_xaxes(title_text="Year")
# Set y-axis title
fig.update_yaxes(title_text="Police Detections", range=[0, 60000])
fig.update_layout(title_text="Police Detections by Year")
fig.show()
```
That’s not hugely surprising: there are fewer burglars around to be caught. What’s perhaps more important is how many burglars we’re catching as a proportion of those remaining burglaries we know about.
I’ve visualised that below: on the left axis, all police reported burglaries (in blue) and how many led to a positive outcome (in red); on the right axis, the ratio of the two - an estimate of the proportion of police reported burglaries that are “solved”.
Code
```python
yearly_df = detections.merge(comparison_df, how='left', on='year')
yearly_df['detected_csew_ratio'] = yearly_df['detections'] / yearly_df['Crime Survey'] * 100
yearly_df['detected_police_ratio'] = yearly_df['detections'] / yearly_df['Police Reported'] * 100

# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

fig.add_trace(
    go.Scatter(x=yearly_df['year'], y=yearly_df['Police Reported'],
               name="Police Reported", line=dict(color='blue')),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=yearly_df['year'], y=yearly_df['detections'],
               name="Police Detections", line=dict(color='red')),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=yearly_df['year'], y=yearly_df['detected_police_ratio'],
               name="Proportion of Police Reported Burglaries Detected",
               line=dict(color='green'), visible=True),
    secondary_y=True,
)

# Add figure title
fig.update_layout(
    title_text="Police Reported Burglaries, Detections, and Proportion Solved",
    legend=dict(orientation="h", yanchor="bottom", y=-0.2),
)
# Set x-axis title
fig.update_xaxes(title_text="Year")
# Set y-axes titles
fig.update_yaxes(title_text="Burglaries", secondary_y=False)
fig.update_yaxes(title_text="Proportion of Burglaries Solved", range=[0, 20], secondary_y=True, showgrid=False, color='green')
fig.show()
```
It doesn’t look good: performance drops sharply around the time we stopped attending all burglaries in 2015.
Of course, plenty of other things happened in 2015: for instance, we know there was a radical shift in recording standards, whereby police recorded and investigated thousands of burglaries which they would previously not have been aware of. So how do we measure how police performance changed irrespective of those recording changes? We can replicate that chart, but instead of using police recorded burglaries as the baseline, we use all burglaries, as reported by the Crime Survey.
The below chart does exactly that - click on each of the buttons to see how the baseline figure affects perceived investigatory performance.
Code
```python
# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

fig.add_trace(
    go.Scatter(x=yearly_df['year'], y=yearly_df['Police Reported'],
               name="Police Reported", line=dict(color='blue')),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=yearly_df['year'], y=yearly_df['detections'],
               name="Police Detections", line=dict(color='red')),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=yearly_df['year'], y=yearly_df['detected_police_ratio'],
               name="Proportion of Police Reported Burglaries Detected",
               line=dict(color='green'), visible=False),
    secondary_y=True,
)
fig.add_trace(
    go.Scatter(x=yearly_df['year'], y=yearly_df['detected_csew_ratio'],
               name="Proportion of Crime Survey Burglaries Detected",
               line=dict(color='black'), visible=False),
    secondary_y=True,
)

# Add figure title
fig.update_layout(title_text="Burglary Detections as a Proportion of Police Reported vs All Burglaries")
# Set x-axis title
fig.update_xaxes(title_text="Year")
# Set y-axes titles
fig.update_yaxes(title_text="Burglaries", secondary_y=False)
fig.update_yaxes(title_text="Proportion of Burglaries Solved", range=[0, 20], secondary_y=True, showgrid=False, color='green')

# Buttons to flip between the two denominators
fig.update_layout(
    legend=dict(orientation="h", yanchor="bottom", y=-0.2),
    updatemenus=[
        dict(
            type="buttons",
            direction="right",
            active=0,
            x=1,
            y=1.2,
            bgcolor='grey',
            bordercolor='white',
            font=dict(color='white'),
            showactive=False,
            buttons=list([
                dict(label="As Proportion of Police Reported",
                     method="update",
                     args=[{"visible": [True, True, True, False]}]),
                dict(label="As Proportion of all Burglaries",
                     method="update",
                     args=[{"visible": [True, True, False, True]}]),
            ]),
        )
    ],
)
fig.show()
```
Suddenly that sharp drop in 2015 is a lot less convincing. Looking at all burglaries - rather than just those reported to police - the proportion being solved has been going down, but that started happening long before 2015. Around 5% of all burglaries in the crime survey were “solved” in 2010, but by 2015, that had already dropped to around 2.5%, compared to just over 2% in 2020.
What’s the lesson here? Perceptions of a recent, short term drop in police investigative performance are somewhat overstated. Yes, policing is catching fewer burglars than ever, but burglaries are rarer than ever - those that remain are less likely to be solved than they used to be, but that’s a trend that started long before 2015 (it might even have started as early as 2008).
What explains that trend? We really don’t know. It’s possible that those few remaining burglars are the most persistent and sophisticated: maybe the opportunistic drug users desperate for a fix who were willing to smash your front window in a decade ago have been replaced by professional teams who will do a whole set of flats in minutes. We can’t really tell with the data we’ve got here, but it’s probably not as simple as policing just not trying hard enough.
That’s different to mandatory attendance, though: under that policy, even if the responding officer thinks there isn’t any point (for example, if the burglary was many months ago, or very unlikely to be solved following a phone report), they’ll attend anyway. So how have some reporters calculated that the policy could cut the number of offences by half?
Unhelpfully, the researchers for the Mail and the Telegraph haven’t been very open with sharing their methodology, but it looks like they’ve looked at those forces who already have a mandatory attendance policy, and tried to predict what would happen if that had been expanded nationally.
Three forces in particular are repeatedly mentioned as having introduced mandatory attendance prior to the NPCC mandate:
- Bedfordshire Police, under the codename “Operation Maze”, though again, it seems paired with dedicated burglary teams, and may in fact be attendance by forensics staff rather than police officers. The earliest records are in early 2020, but it’s not at all clear this involves any mandatory attendance, rather than just increased resourcing.
- Greater Manchester Police, from around July 2021 (the date I’ll use below).
- Northamptonshire Police, from around April 2019.
The key question, then: did the introduction of mandatory attendance in these three forces make any difference to burglary investigations?
To do that, we’ll calculate:
- What was the detection rate for burglary in all forces?
- How did the detection rate change after the introduction of mandatory attendance?
- How did other, similar forces fare during the same period?
There are real limitations to doing this with public data: we are limited to quarter by quarter analysis, and we don’t actually know exactly when these policies started (or, for that matter, exactly what they include) or who else might have been doing something similar, but we can still try and identify what performance boost (if any) mandatory attendance made for these forces.
For each force, we then have a quarterly count of burglaries, and the proportion of them that were solved (i.e., burglary investigations where an offender is identified and dealt with by police).
Code
```python
total_q_offences = burglary_df[['date','Force','Force outcomes for offences recorded in quarter']].groupby(['date','Force']).sum().reset_index()
total_q_offences['total_offences'] = pd.to_numeric(total_q_offences['Force outcomes for offences recorded in quarter'], errors='coerce')

positive_outcomes = [
    'Taken into consideration',
    'Charged/Summonsed',
    'Community Resolution',
    'Cannabis/Khat Warning',
    'Penalty Notices for Disorder',
    'Caution – adults',
    'Diversionary, educational or intervention activity, resulting from the crime report, has been undertaken and it is not in the public interest to take any further action.',
    'Caution – youths',
]
burglary_df['is_detected'] = burglary_df['Outcome Description'].isin(positive_outcomes)
positive_burglary_df = burglary_df[burglary_df['is_detected']]

total_detected_offences = positive_burglary_df[['date','Force','Force outcomes for offences recorded in quarter']].groupby(['date','Force']).sum().reset_index()
total_detected_offences['total_detected'] = pd.to_numeric(total_detected_offences['Force outcomes for offences recorded in quarter'], errors='coerce')

force_comparison = total_q_offences[['date','Force','total_offences']].merge(total_detected_offences, how='left', on=['date','Force']).drop(columns=['Force outcomes for offences recorded in quarter'])
force_comparison['detected_rate'] = force_comparison['total_detected'] / force_comparison['total_offences']
force_comparison['detected_rate'] = force_comparison['detected_rate'].fillna(0)
force_comparison
```
| | date | Force | total_offences | total_detected | detected_rate |
|---|------|-------|----------------|----------------|---------------|
| 0 | 2017-01-01 | Avon and Somerset | 2151.0 | 112.0 | 0.052069 |
| 1 | 2017-01-01 | Bedfordshire | 1142.0 | 120.0 | 0.105079 |
| 2 | 2017-01-01 | British Transport Police | 0.0 | 0.0 | 0.000000 |
| 3 | 2017-01-01 | Cambridgeshire | 1109.0 | 99.0 | 0.089270 |
| 4 | 2017-01-01 | Cheshire | 790.0 | 80.0 | 0.101266 |
| … | … | … | … | … | … |
| 875 | 2021-10-01 | Warwickshire | 376.0 | 8.0 | 0.021277 |
| 876 | 2021-10-01 | West Mercia | 857.0 | 8.0 | 0.009335 |
| 877 | 2021-10-01 | West Midlands | 4061.0 | 93.0 | 0.022901 |
| 878 | 2021-10-01 | West Yorkshire | 2406.0 | 81.0 | 0.033666 |
| 879 | 2021-10-01 | Wiltshire | 265.0 | 8.0 | 0.030189 |

880 rows × 5 columns
Now, we identify our “treated” forces, and the periods in which the policy was in place. This isn’t exactly clean, but based on the newspaper articles and press releases, I’ve picked out:
- Bedfordshire from 2020 onwards
- GMP from July 2021 onwards
- Northamptonshire from April 2019 onwards
- and of course, nationwide from October 2022
With those dates, we’ll then build our completed dataset of quarter by quarter burglary detection rates, for each force, and whether or not a mandatory attendance policy was in place in that force/quarter.
One limitation of working by quarters is that we’re quite limited in numbers! Good inference in statistical modelling relies on detecting “variance” - i.e., having enough data to identify both our effect and general background levels, and to separate out the two. Given GMP’s policy has only been running for six quarters, we’ll focus on the other two forces for this analysis.
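The step that actually flags the treated force/quarters happens in my cleaning code; a minimal sketch of it, assuming the force_comparison dataframe from above and the start dates just listed:

```python
# Flag each force/quarter where a mandatory attendance policy was in place,
# using the approximate start dates discussed above.
policy_starts = {
    'Bedfordshire': '2020-01-01',
    'Greater Manchester': '2021-07-01',
    'Northamptonshire': '2019-04-01',
}

force_comparison['date'] = pd.to_datetime(force_comparison['date'])
force_comparison['mandatory_attendance'] = False
for force, start in policy_starts.items():
    in_force = force_comparison['Force'] == force
    after_start = force_comparison['date'] >= pd.Timestamp(start)
    force_comparison.loc[in_force & after_start, 'mandatory_attendance'] = True
```

Summing the resulting boolean column per force then gives the count of “treated” quarters shown below.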
Code
```python
(force_comparison[['Force', 'mandatory_attendance']]
    .groupby('Force')
    .sum()
    .sort_values(by='mandatory_attendance', ascending=False)
    .rename(columns={'mandatory_attendance': 'Quarters of Mandatory Attendance'}))
```
| Force | Quarters of Mandatory Attendance |
|-------|----------------------------------|
| Northamptonshire | 11 |
| Bedfordshire | 8 |
| Greater Manchester | 6 |
| South Wales | 0 |
| Merseyside | 0 |
| Metropolitan Police | 0 |
| Norfolk | 0 |
| North Wales | 0 |
| North Yorkshire | 0 |
| Northumbria | 0 |
| Nottinghamshire | 0 |
| Avon and Somerset | 0 |
| London, City of | 0 |
| Staffordshire | 0 |
| Suffolk | 0 |
| Surrey | 0 |
| Sussex | 0 |
| Thames Valley | 0 |
| Warwickshire | 0 |
| West Mercia | 0 |
| West Midlands | 0 |
| West Yorkshire | 0 |
| South Yorkshire | 0 |
| Lincolnshire | 0 |
| Leicestershire | 0 |
| Lancashire | 0 |
| British Transport Police | 0 |
| Cambridgeshire | 0 |
| Cheshire | 0 |
| Cleveland | 0 |
| Cumbria | 0 |
| Derbyshire | 0 |
| Devon and Cornwall | 0 |
| Dorset | 0 |
| Durham | 0 |
| Dyfed-Powys | 0 |
| Essex | 0 |
| Gloucestershire | 0 |
| Gwent | 0 |
| Hampshire | 0 |
| Hertfordshire | 0 |
| Humberside | 0 |
| Kent | 0 |
| Wiltshire | 0 |
Now we can start combining our data, and seeing if we can observe any meaningful difference after the intervention takes place.
For meaningful comparisons, we’ll seek to compare forces to other similar forces - there’s not much point in comparing GMP to Durham. We could do that with our existing data - e.g., looking at forces with similar numbers of burglaries - but helpfully, the policing inspectorate HMICFRS already produces “Most Similar Forces” groups, which should let us match up forces based on size, demographics and other factors relevant to performance. We’ll separate out our group A (Bedfordshire and its group) from group B (Northamptonshire’s), though notice Kent is in both - unhelpfully, HMICFRS’s groups aren’t exclusive, though that’s not too much of a problem.
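The group assignments happen off-screen in my cleaning code; here’s a sketch of the shape of it. The membership lists below are illustrative placeholders (take the real ones from the HMICFRS Most Similar Forces tables), with Kent deliberately appearing in both:

```python
# Illustrative placeholders only - the real membership comes from the
# HMICFRS "Most Similar Forces" tables. Kent genuinely sits in both groups.
group_a = ['Bedfordshire', 'Kent']       # Bedfordshire and its comparators
group_b = ['Northamptonshire', 'Kent']   # Northamptonshire and its comparators

# Label each force with its comparison group. Because the groups overlap,
# a force in both (Kent) ends up with the later label here - a fuller
# version would duplicate such rows into both groups.
force_comparison['group'] = 'none'
force_comparison.loc[force_comparison['Force'].isin(group_a), 'group'] = 'A'
force_comparison.loc[force_comparison['Force'].isin(group_b), 'group'] = 'B'
```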
Let’s start with a simple visual check: the quarterly detection rate for each force in each comparison group, with the start of mandatory attendance marked by a dashed vertical line.
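The plotting code didn’t survive my tidy-up, but it was along these lines - a sketch assuming the group column above, with a dashed line at each treated force’s approximate policy start:

```python
# Quarterly detection rate per force, faceted by comparison group, with a
# dashed vertical line at the approximate start of mandatory attendance.
plot_df = force_comparison[force_comparison['group'] != 'none']
fig = px.line(plot_df, x='date', y='detected_rate', color='Force', facet_col='group')
fig.add_vline(x='2020-01-01', line_dash='dash', col=1)  # Bedfordshire (group A)
fig.add_vline(x='2019-04-01', line_dash='dash', col=2)  # Northamptonshire (group B)
fig.update_yaxes(title_text="Detected Rate")
fig.show()
```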
It looks, in a word, messy. There might be some sort of effect, but it certainly isn’t pronounced enough to visually observe. Instead, let’s use some statistical modelling to try and measure the average effect.
Statistical Modelling
Linear Models with Time Effects
With all our data, we can now build a statistical model, predicting the detected outcome rate as a function of force, comparison group, total offences, and the period in time.
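The model-fitting code isn’t shown in the rendered post; a minimal sketch of the specification described, using the patsy and statsmodels imports from earlier (and assuming the mandatory_attendance and group columns sketched above):

```python
# Detection rate as a function of the attendance policy, force, comparison
# group, offence volume, and quarter (time entering as a categorical term).
formula = ("detected_rate ~ mandatory_attendance + C(Force) + C(group) "
           "+ total_offences + C(date)")
y, X = patsy.dmatrices(formula, data=force_comparison, return_type='dataframe')
print(sm.OLS(y, X).fit().summary())
```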
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The smallest eigenvalue is 8.11e-32. This might indicate that there are strong multicollinearity problems or that the design matrix is singular.
Our model suggests that mandatory attendance may have had, on average, a small positive effect on performance, but this was not significant - i.e., it may have been down to sheer luck.
Let’s go a bit further, and add an interaction effect between time and our force variable - i.e., accounting for the fact that performance may vary differently over time, force by force.
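Again a sketch, under the same assumptions - the only change is the force-by-time interaction:

```python
# Interacting force with quarter lets every force follow its own time path;
# the attendance effect is then estimated against those force-specific trends.
formula = ("detected_rate ~ mandatory_attendance + total_offences "
           "+ C(Force) * C(date)")
y, X = patsy.dmatrices(formula, data=force_comparison, return_type='dataframe')
print(sm.OLS(y, X).fit().summary())
```

With 40-odd forces and 20 quarters, that interaction produces an enormous, near-singular design matrix - hence the eigenvalue warning below.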
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The smallest eigenvalue is 2.52e-29. This might indicate that there are strong multicollinearity problems or that the design matrix is singular.
No matter how we model it, we’re again not finding meaningful effects - mandatory attendance does seem to be associated with slightly higher detected rates, on average, but this isn’t statistically significant. Put simply, examining the “statistical noise” in the rest of our model, we can’t confidently say any increases aren’t just due to random luck.
As one final approach, we’ll fit a panel model using the linearmodels library - far from the best way to examine how effects change over time, but not a bad starter for ten.
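Roughly, under the same assumptions as above - PanelOLS wants an (entity, time) MultiIndex, and the EntityEffects/TimeEffects terms match the “Included effects: Entity, Time” in the output below:

```python
# Panel regression with force (entity) and quarter (time) fixed effects.
panel_df = force_comparison.copy()
panel_df['mandatory_attendance'] = panel_df['mandatory_attendance'].astype(int)
panel_df = panel_df.set_index(['Force', 'date'])

mod = PanelOLS.from_formula(
    'detected_rate ~ mandatory_attendance + total_offences + EntityEffects + TimeEffects',
    data=panel_df,
)
print(mod.fit())
```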
F-test for Poolability: 2.3035
P-value: 0.0000
Distribution: F(62,816)

Included effects: Entity, Time
A similar pattern emerges - we see a small increase associated with mandatory attendance, but it’s non-significant: the increases are too small, or too noisy, for us to say they’re due to attendance rather than just random variation in the data.
That said, there are plenty of issues with our approach! We’ve only tried straightforward linear approaches, and given we’re relying on two forces and quarterly data, we’d struggle to detect small effects… hopefully we’ll notice something more stark when I come back to this in a few months for part 2, time series approaches!