What’s Happened to Burglary, and Does Attending Help?
policing
crime
data-science
Published
December 29, 2022
It’s nearly 2023, and in the true spirit of Christmas, I’ve used the cheese/wine-filled nowhere time between the holidays to pick up an analytical side-project that’s been irking me for a while… what’s happened to burglary, and why are we suddenly so excited about mandatory attendance? It’s a thorny question, so I figured I’d document my analysis here… All(ish) of the code is available, so this will hopefully be a useful tutorial for others. As usual, keep in mind this is a blog post, not an academic article - this analysis is fuelled by post-Christmas cheese, and peer reviewed by me after a glass of sherry, so if you want to rely on any of it, replicate it yourself first and make sure it’s right!
This took somewhat longer than I expected (hooray for Christmas breaks), so this will be a series in three parts:
1. Exploring the data and Linear Models (you’re here!)
2. Time Series (when I get the time)
3. Synthetic Controls (one day in the distant future)
So come with me on an analytical adventure through time, as we cast our minds back to the halcyon days of 2015. The pound is worth $1.50 again, Boris Johnson is Mayor of London, and we’re all still fondly thinking back to how great the London Olympics were.
Meanwhile, crime is changing: traditional crimes like burglary and violence continue to fall, leading the then Home Secretary (and soon-to-be Prime Minister) Theresa May to tell the Police Federation to “stop crying wolf” about cuts while demand shrinks - but policing certainly isn’t feeling it, as an increase in “high-harm”, complex offences more than made up for any shortfall.
The shift towards complex offences went, to my mind, largely unnoticed by the general public, but caused a real shift in core policing demand: traditional theft and violent offences had become ever rarer, while reporting of child exploitation, sexual assault and fraud skyrocketed… but the latter group of offences requires a far lengthier investigation than the former! The result? More demand, as best illustrated by this wonderful graph by Matt Ashby.
But has performance really declined that quickly, and is attendance to blame? I’ll use this post to explore open police data from the last decade, and use data from pilot forces to see just how dramatic the effect is: just what’s happened to burglary, and will mandatory attendance help?
What’s Happened to Burglary?
We’ll start by exploring how burglary has changed over time - for the purposes of this analysis, we’re focusing on domestic burglary (i.e., not including commercial properties), and we’ll use both police recorded crime data (i.e., crimes reported to police) and the Crime Survey for England and Wales (a national survey to measure total crime) to compare trends.
On Data Quality
While the UK benefits from crime data that is comparatively clean (I do not envy our American cousins, let alone most of our colleagues in the rest of the world), this is less true for both historical data (i.e., going back more than a few years) and the Crime Survey data, which isn’t readily available to the general public: the data is spread across various archive files, websites and reports, and there really isn’t a “single source of truth” for crime counts over time.
The data for this analysis is a few hacked-together files:
- Crime outcomes and crime counts by quarter, manually concatenated from the Police Open Data Tables
- Crime Survey burglary counts per year, from the most recent ONS crime and justice quarterly
I’m afraid that does mean this analysis isn’t quite as reproducible as I’d like - I had written some code to scrape the Home Office data page and aggregate all the outcomes data, but it’s such a mishmash of ODS, XLSX and other formats that everything fell over and I did it manually. That said, hopefully it’s not too much of a pain to reproduce if you feel the need to.
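For what it’s worth, the abandoned concatenation looked something like this - a minimal sketch, assuming the quarterly tables have already been downloaded into a local folder (the paths here are illustrative):

```python
from pathlib import Path
import pandas as pd

# Illustrative: concatenate every downloaded quarterly outcomes table,
# whatever format the Home Office happened to publish that quarter in.
frames = []
for path in Path("../data/raw/outcomes").glob("*"):
    if path.suffix == ".ods":
        frames.append(pd.read_excel(path, engine="odf"))  # needs the odfpy package
    elif path.suffix in (".xlsx", ".xls"):
        frames.append(pd.read_excel(path))

combined = pd.concat(frames, ignore_index=True)
combined.to_csv("../data/external/police_recorded_crime_combined.csv", index=False)
```

In practice the column names and sheet layouts shift between quarters, which is exactly where this fell over for me.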
A few caveats are worth considering:
The Crime Survey can’t always be easily compared to reported crime: police crime data is legal accounting, while the crime survey is asking the general public what happened. Domestic Burglary should be comparable, but this is all a little experimental.
Reproducibility and data cleaning: good analysis should be reproducible - you should be able to run it the same way, with new data. This isn’t true here because of my ugly cleaning.
Outcomes aren’t instant: crimes take time to solve, so the most recent outcomes data will be revised upwards as investigations conclude. I’ve used outcomes per quarter, which should mitigate this, but treat the most recent quarters with caution.
Code
```python
import pandas as pd
import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import patsy
import statsmodels.api as sm
import plotly.io as pio
from linearmodels import PanelOLS

pio.templates.default = "plotly_dark"

# Police recorded burglaries, summed to one figure per financial year
police_crime_df = pd.read_csv("../data/external/police_recorded_crime_combined.csv")
yearly_police_df = police_crime_df[['Financial Year','Number of Offences']].groupby('Financial Year').sum().reset_index()
yearly_police_df['year'] = range(2002, 2023, 1)
yearly_police_df = yearly_police_df.rename(columns={"Number of Offences": 'police'})

# Crime Survey estimates are published in thousands, so scale up to match
csew_count = pd.read_csv("../data/external/csew_burglary_trends.csv").rename(columns={'Domestic burglary': 'csew'})
csew_count['year'] = range(2002, 2021, 1)
csew_count['csew'] = csew_count['csew'] * 1000

comparison_df = csew_count.merge(yearly_police_df, how='left', on='year').rename(columns={'csew': 'Crime Survey', 'police': 'Police Reported'})
comparison_df[['year', 'Crime Survey', 'Police Reported']].set_index('year')
```
| year | Crime Survey | Police Reported |
|------|--------------|-----------------|
| 2002 | 1423000 | 437583 |
| 2003 | 1357000 | 402345 |
| 2004 | 1310000 | 321507 |
| 2005 | 1059000 | 300517 |
| 2006 | 1025000 | 292260 |
| 2007 | 1006000 | 280696 |
| 2008 | 960000 | 284431 |
| 2009 | 992000 | 268606 |
| 2010 | 917000 | 258165 |
| 2011 | 1033000 | 245312 |
| 2012 | 922000 | 227276 |
| 2013 | 887000 | 211988 |
| 2014 | 781000 | 196554 |
| 2015 | 784000 | 194700 |
| 2016 | 697000 | 206051 |
| 2017 | 650000 | 309867 |
| 2018 | 691000 | 295556 |
| 2019 | 699000 | 268715 |
| 2020 | 582000 | 196214 |
The Crime Survey data is the best measure of long term crime trends, as it isn’t affected by reporting or policing practice. Taken in isolation, the most obvious takeaway here is that burglary is lower than it’s ever been. If you accept the Peelian view that “the test of police efficiency is the absence of crime and disorder”, policing is doing an excellent job. So why is police reported crime comparatively high, and why are perceptions of police performance seemingly so low?
Code
```python
tidy_yearly = comparison_df[['year','Crime Survey','Police Reported']].melt(id_vars='year')

fig = px.line(tidy_yearly, x='year', y='value', color='variable',
              color_discrete_sequence=['red','blue'])
fig.update_layout(
    title_text="Domestic Burglaries, Police Reported and Crime Survey",
    legend=dict(orientation="h", yanchor="bottom", y=-0.2),
)
# Set x-axis title
fig.update_xaxes(title_text="Year")
# Set y-axis title
fig.update_yaxes(title_text="Burglaries")
fig.show()
```
Comparing Crime Survey to Police Recorded crime counts is generally considered bad form: while a police officer might always know the difference between a robbery, a burglary, and an affray to enable a theft, that’s not something most of the public worries about.
Thankfully, that shouldn’t be a problem for domestic burglary - most people know when they’ve been burgled! That means we can produce a ratio of crime survey burglaries to police burglaries, and obtain a rough estimate of what proportion of burglaries are reported to police over time.
Code
```python
comparison_df['Proportion of Burglaries Reported'] = comparison_df['Police Reported'] / comparison_df['Crime Survey'] * 100
tidy_yearly = comparison_df[['year','Crime Survey','Police Reported','Proportion of Burglaries Reported']].melt(id_vars='year')

# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

csew_burglary = tidy_yearly[tidy_yearly['variable'] == 'Crime Survey']
police_burglary = tidy_yearly[tidy_yearly['variable'] == 'Police Reported']
reported_ratio = tidy_yearly[tidy_yearly['variable'] == 'Proportion of Burglaries Reported']

fig.add_trace(
    go.Scatter(x=csew_burglary['year'], y=csew_burglary['value'],
               name="Crime Survey", line=dict(color='red')),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=police_burglary['year'], y=police_burglary['value'],
               name="Police Reported", line=dict(color='blue')),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=reported_ratio['year'], y=reported_ratio['value'],
               name="% Reported", line=dict(color='grey')),
    secondary_y=True,
)

# Add figure title
fig.update_layout(
    title_text="Crime Survey Burglaries per Year and Proportion Reported",
    legend=dict(orientation="h", yanchor="bottom", y=-0.2),
)
# Set x-axis title
fig.update_xaxes(title_text="Year")
# Set y-axes titles
fig.update_yaxes(title_text="Burglaries", secondary_y=False, color='black')
fig.update_yaxes(title_text="Proportion of Burglaries Reported", secondary_y=True, showgrid=False, color='grey')
fig.show()
```
I’d once thought burglary reporting would be consistent - it’s an emotive crime, with a well established insurance industry - and yet we definitely see variation over time.
What does this tell us about burglary in England & Wales over the last decade though? Two things:
Domestic Burglaries are probably rarer than ever (and that’s not down to COVID)
That’s not consistently reflected in police recorded crime - the likelihood of reporting seems to change over time
How Many Burglars are we Catching?
So there aren’t many burglaries… so have we at least gotten better at catching the burglars behind those that remain?
Well… we’re certainly not catching any more than we used to. Whether you count charges (i.e., burglars being sent to court) or “detections” (i.e., including cautions, fines and similar), there are far fewer, though it looks like that trend started long before 2015.
Code
```python
detections = pd.read_csv('../data/interim/burglary_detections.csv', index_col=0)

fig = px.line(detections, x='year', y='detections')
fig.update_xaxes(title_text="Year")
# Set y-axis title
fig.update_yaxes(title_text="Police Detections", range=[0, 60000])
fig.update_layout(title_text="Police Detections by Year")
fig.show()
```
That’s not hugely surprising: there are fewer burglars around to be caught. What’s perhaps more important is how many burglars we’re catching as a proportion of those remaining burglaries we know about.
I’ve visualised that below: on the left axis, all police reported burglaries (in blue) and how many led to a positive outcome (in red); on the right axis, the ratio of the two - an estimate of the proportion of police reported burglaries that are “solved”.
Code
```python
yearly_df = detections.merge(comparison_df, how='left', on='year')
yearly_df['detected_csew_ratio'] = yearly_df['detections'] / yearly_df['Crime Survey'] * 100
yearly_df['detected_police_ratio'] = yearly_df['detections'] / yearly_df['Police Reported'] * 100

# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

fig.add_trace(
    go.Scatter(x=yearly_df['year'], y=yearly_df['Police Reported'],
               name="Police Reported", line=dict(color='blue')),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=yearly_df['year'], y=yearly_df['detections'],
               name="Police Detections", line=dict(color='red')),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=yearly_df['year'], y=yearly_df['detected_police_ratio'],
               name="Proportion of Police Reported Burglaries Detected",
               line=dict(color='green'), visible=True),
    secondary_y=True,
)

# Add figure title
fig.update_layout(
    title_text="Police Reported Burglaries, Detections, and Proportion Solved",
    legend=dict(orientation="h", yanchor="bottom", y=-0.2),
)
# Set x-axis title
fig.update_xaxes(title_text="Year")
# Set y-axes titles
fig.update_yaxes(title_text="Burglaries", secondary_y=False)
fig.update_yaxes(title_text="Proportion of Burglaries Solved", range=[0, 20], secondary_y=True, showgrid=False, color='green')
fig.show()
```
It doesn’t look good: performance drops sharply around the time we stopped attending all burglaries in 2015.
Of course, plenty of other things happened in 2015: for instance, we know there was a radical shift in recording standards, whereby police recorded and investigated thousands of burglaries which they would previously not have been aware of. So how do we measure how police performance changed irrespective of those recording changes? We can replicate that chart, but instead of using police recorded burglaries as the baseline, we use all burglaries, as reported by the Crime Survey.
The below chart does exactly that - click on each of the buttons to see how the baseline figure affects perceived investigatory performance.
Code
```python
# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

fig.add_trace(
    go.Scatter(x=yearly_df['year'], y=yearly_df['Police Reported'],
               name="Police Reported", line=dict(color='blue')),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=yearly_df['year'], y=yearly_df['detections'],
               name="Police Detections", line=dict(color='red')),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=yearly_df['year'], y=yearly_df['detected_police_ratio'],
               name="Proportion of Police Reported Burglaries Detected",
               line=dict(color='green'), visible=False),
    secondary_y=True,
)
fig.add_trace(
    go.Scatter(x=yearly_df['year'], y=yearly_df['detected_csew_ratio'],
               name="Proportion of Crime Survey Burglaries Detected",
               line=dict(color='black'), visible=False),
    secondary_y=True,
)

# Add figure title
fig.update_layout(title_text="Burglary Detections as a Proportion of Police Reported vs All Burglaries")
# Set x-axis title
fig.update_xaxes(title_text="Year")
# Set y-axes titles
fig.update_yaxes(title_text="Burglaries", secondary_y=False)
fig.update_yaxes(title_text="Proportion of Burglaries Solved", range=[0, 20], secondary_y=True, showgrid=False, color='green')

# Buttons to flip between the two denominators
fig.update_layout(
    legend=dict(orientation="h", yanchor="bottom", y=-0.2),
    updatemenus=[
        dict(
            type="buttons",
            direction="right",
            active=0,
            x=1,
            y=1.2,
            bgcolor='grey',
            bordercolor='white',
            font=dict(color='white'),
            showactive=False,
            buttons=list([
                dict(label="As Proportion of Police Reported",
                     method="update",
                     args=[{"visible": [True, True, True, False]}]),
                dict(label="As Proportion of all Burglaries",
                     method="update",
                     args=[{"visible": [True, True, False, True]}]),
            ]),
        )
    ],
)
fig.show()
```
Suddenly that sharp drop in 2015 is a lot less convincing. Looking at all burglaries - rather than just those reported to police - the proportion being solved has been going down, but that started happening long before 2015. Around 5% of all burglaries in the crime survey were “solved” in 2010, but by 2015, that had already dropped to around 2.5%, compared to just over 2% in 2020.
What’s the lesson here? Perceptions of a recent, short term drop in police investigative performance are somewhat overstated. Yes, policing is catching fewer burglars than ever, but burglaries are rarer than ever - those that remain are less likely to be solved than they used to be, but that’s a trend that started long before 2015 (it might even have started as early as 2008).
What explains that trend? We really don’t know. It’s possible that those few remaining burglars are the most persistent and sophisticated: maybe the opportunistic drug users desperate for a fix who were willing to smash your front window in a decade ago have been replaced by professional teams who will do a whole set of flats in minutes. We can’t really tell with the data we’ve got here, but it’s probably not as simple as policing just not trying hard enough.
That’s different to mandatory attendance, though: under that policy, even if the responding officer thinks there isn’t any point (for example, if the burglary was many months ago, or very unlikely to be solved following a phone report), they’ll attend anyway. So how have some reporters calculated that the policy could cut the number of offences by half?
Unhelpfully, the researchers for the Mail and the Telegraph haven’t been very open with sharing their methodology, but it looks like they’ve looked at those forces who already have a mandatory attendance policy, and tried to predict what would happen if that had been expanded nationally.
Three forces in particular are repeatedly mentioned as having introduced mandatory attendance prior to the NPCC mandate:
- Bedfordshire Police, under the codename “Operation Maze”, though again, it seems paired with dedicated burglary teams, and may in fact be attendance by forensics staff rather than police officers. The earliest records are in early 2020, but it’s not at all clear this involves any mandatory attendance, rather than just increased resourcing.
- Greater Manchester Police, from around July 2021 (the date I’ll use below).
- Northamptonshire Police, from around April 2019.
The key question, then: did the introduction of mandatory attendance in these three forces make any difference to burglary investigations?
To do that, we’ll calculate:
- What was the detection rate for burglary in all forces?
- How did the detection rate change after the introduction of mandatory attendance?
- How did other, similar forces fare during the same period?
There are real limitations to doing this with public data: we are limited to quarter by quarter analysis, and we don’t actually know exactly when these policies started (or, for that matter, exactly what they include) or who else might have been doing something similar, but we can still try and identify what performance boost (if any) mandatory attendance made for these forces.
For each force, we then have a quarterly count of burglaries, and the proportion of them that were solved (i.e., burglary investigations where an offender is identified and dealt with by police).
Code
```python
total_q_offences = burglary_df[['date','Force','Force outcomes for offences recorded in quarter']].groupby(['date','Force']).sum().reset_index()
total_q_offences['total_offences'] = pd.to_numeric(total_q_offences['Force outcomes for offences recorded in quarter'], errors='coerce')

positive_outcomes = [
    'Taken into consideration',
    'Charged/Summonsed',
    'Community Resolution',
    'Cannabis/Khat Warning',
    'Penalty Notices for Disorder',
    'Caution – adults',
    'Diversionary, educational or intervention activity, resulting from the crime report, has been undertaken and it is not in the public interest to take any further action.',
    'Caution – youths',
]
burglary_df['is_detected'] = burglary_df['Outcome Description'].isin(positive_outcomes)
positive_burglary_df = burglary_df[burglary_df['is_detected']]

total_detected_offences = positive_burglary_df[['date','Force','Force outcomes for offences recorded in quarter']].groupby(['date','Force']).sum().reset_index()
total_detected_offences['total_detected'] = pd.to_numeric(total_detected_offences['Force outcomes for offences recorded in quarter'], errors='coerce')

force_comparison = total_q_offences[['date','Force','total_offences']].merge(total_detected_offences, how='left', on=['date','Force']).drop(columns=['Force outcomes for offences recorded in quarter'])
force_comparison['detected_rate'] = force_comparison['total_detected'] / force_comparison['total_offences']
force_comparison['detected_rate'] = force_comparison['detected_rate'].fillna(0)
force_comparison
```
| | date | Force | total_offences | total_detected | detected_rate |
|---|------|-------|----------------|----------------|---------------|
| 0 | 2017-01-01 | Avon and Somerset | 2151.0 | 112.0 | 0.052069 |
| 1 | 2017-01-01 | Bedfordshire | 1142.0 | 120.0 | 0.105079 |
| 2 | 2017-01-01 | British Transport Police | 0.0 | 0.0 | 0.000000 |
| 3 | 2017-01-01 | Cambridgeshire | 1109.0 | 99.0 | 0.089270 |
| 4 | 2017-01-01 | Cheshire | 790.0 | 80.0 | 0.101266 |
| … | … | … | … | … | … |
| 875 | 2021-10-01 | Warwickshire | 376.0 | 8.0 | 0.021277 |
| 876 | 2021-10-01 | West Mercia | 857.0 | 8.0 | 0.009335 |
| 877 | 2021-10-01 | West Midlands | 4061.0 | 93.0 | 0.022901 |
| 878 | 2021-10-01 | West Yorkshire | 2406.0 | 81.0 | 0.033666 |
| 879 | 2021-10-01 | Wiltshire | 265.0 | 8.0 | 0.030189 |

880 rows × 5 columns
Now, we identify our “treated” forces, and the periods in which the policy was in place. This isn’t exactly clean, but based on the newspaper articles and press releases, I’ve picked out:
- Bedfordshire from 2020 onwards
- GMP from July 2021 onwards
- Northamptonshire from April 2019 onwards
- and of course, nationwide from October 2022
With those dates, we’ll then build our completed dataset of quarter by quarter burglary detection rates, for each force, and whether or not a mandatory attendance policy was in place in that force/quarter.
One limitation of working by quarters is that we’re quite limited in numbers! Good inference in statistical modelling relies on detecting “variance” - i.e., having enough data to identify both our effect and general background levels, and to separate out the two. Given GMP’s policy has only been running for six quarters, we’ll focus on the other two forces for this analysis.
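The step that actually flags the treated force/quarters happens in my cleaning code; a minimal sketch of it, assuming the force_comparison dataframe from above and the start dates just listed:

```python
# Flag each force/quarter where a mandatory attendance policy was in place,
# using the approximate start dates discussed above.
policy_starts = {
    'Bedfordshire': '2020-01-01',
    'Greater Manchester': '2021-07-01',
    'Northamptonshire': '2019-04-01',
}

force_comparison['date'] = pd.to_datetime(force_comparison['date'])
force_comparison['mandatory_attendance'] = False
for force, start in policy_starts.items():
    in_force = force_comparison['Force'] == force
    after_start = force_comparison['date'] >= pd.Timestamp(start)
    force_comparison.loc[in_force & after_start, 'mandatory_attendance'] = True
```

Summing the resulting boolean column per force then gives the count of “treated” quarters shown below.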
Code
```python
(force_comparison[['Force', 'mandatory_attendance']]
    .groupby('Force')
    .sum()
    .sort_values(by='mandatory_attendance', ascending=False)
    .rename(columns={'mandatory_attendance': 'Quarters of Mandatory Attendance'}))
```
| Force | Quarters of Mandatory Attendance |
|-------|----------------------------------|
| Northamptonshire | 11 |
| Bedfordshire | 8 |
| Greater Manchester | 6 |
| South Wales | 0 |
| Merseyside | 0 |
| Metropolitan Police | 0 |
| Norfolk | 0 |
| North Wales | 0 |
| North Yorkshire | 0 |
| Northumbria | 0 |
| Nottinghamshire | 0 |
| Avon and Somerset | 0 |
| London, City of | 0 |
| Staffordshire | 0 |
| Suffolk | 0 |
| Surrey | 0 |
| Sussex | 0 |
| Thames Valley | 0 |
| Warwickshire | 0 |
| West Mercia | 0 |
| West Midlands | 0 |
| West Yorkshire | 0 |
| South Yorkshire | 0 |
| Lincolnshire | 0 |
| Leicestershire | 0 |
| Lancashire | 0 |
| British Transport Police | 0 |
| Cambridgeshire | 0 |
| Cheshire | 0 |
| Cleveland | 0 |
| Cumbria | 0 |
| Derbyshire | 0 |
| Devon and Cornwall | 0 |
| Dorset | 0 |
| Durham | 0 |
| Dyfed-Powys | 0 |
| Essex | 0 |
| Gloucestershire | 0 |
| Gwent | 0 |
| Hampshire | 0 |
| Hertfordshire | 0 |
| Humberside | 0 |
| Kent | 0 |
| Wiltshire | 0 |
Now we can start combining our data, and seeing if we can observe any meaningful difference after the intervention takes place.
For meaningful comparisons, we’ll seek to compare forces to other similar forces - there’s not much point in comparing GMP to Durham. We could do that with our existing data - e.g., looking at forces with similar numbers of burglaries - but helpfully, the policing inspectorate HMICFRS already produces “Most Similar Forces” groups, which should let us match up forces based on size, demographics and other factors relevant to performance. We’ll separate out our group A (Bedfordshire and its group) from group B (Northamptonshire’s), though notice Kent is in both - unhelpfully, HMICFRS’s groups aren’t exclusive, though that’s not too much of a problem.
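The group assignments happen off-screen in my cleaning code; here’s a sketch of the shape of it. The membership lists below are illustrative placeholders (take the real ones from the HMICFRS Most Similar Forces tables), with Kent deliberately appearing in both:

```python
# Illustrative placeholders only - the real membership comes from the
# HMICFRS "Most Similar Forces" tables. Kent genuinely sits in both groups.
group_a = ['Bedfordshire', 'Kent']       # Bedfordshire and its comparators
group_b = ['Northamptonshire', 'Kent']   # Northamptonshire and its comparators

# Label each force with its comparison group. Because the groups overlap,
# a force in both (Kent) ends up with the later label here - a fuller
# version would duplicate such rows into both groups.
force_comparison['group'] = 'none'
force_comparison.loc[force_comparison['Force'].isin(group_a), 'group'] = 'A'
force_comparison.loc[force_comparison['Force'].isin(group_b), 'group'] = 'B'
```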
Let’s start with a simple visual check: the quarterly detection rate for each force in each comparison group, with the start of mandatory attendance marked by a dashed vertical line.
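The plotting code didn’t survive my tidy-up, but it was along these lines - a sketch assuming the group column above, with a dashed line at each treated force’s approximate policy start:

```python
# Quarterly detection rate per force, faceted by comparison group, with a
# dashed vertical line at the approximate start of mandatory attendance.
plot_df = force_comparison[force_comparison['group'] != 'none']
fig = px.line(plot_df, x='date', y='detected_rate', color='Force', facet_col='group')
fig.add_vline(x='2020-01-01', line_dash='dash', col=1)  # Bedfordshire (group A)
fig.add_vline(x='2019-04-01', line_dash='dash', col=2)  # Northamptonshire (group B)
fig.update_yaxes(title_text="Detected Rate")
fig.show()
```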
It looks, in a word, messy. There might be some sort of effect, but it certainly isn’t pronounced enough to visually observe. Instead, let’s use some statistical modelling to try and measure the average effect.
Statistical Modelling
Linear Models with Time Effects
With all our data, we can now build a statistical model, predicting the detected outcome rate as a function of force, comparison group, total offences, and the period in time.
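The model-fitting code isn’t shown in the rendered post; a minimal sketch of the specification described, using the patsy and statsmodels imports from earlier (and assuming the mandatory_attendance and group columns sketched above):

```python
# Detection rate as a function of the attendance policy, force, comparison
# group, offence volume, and quarter (time entering as a categorical term).
formula = ("detected_rate ~ mandatory_attendance + C(Force) + C(group) "
           "+ total_offences + C(date)")
y, X = patsy.dmatrices(formula, data=force_comparison, return_type='dataframe')
print(sm.OLS(y, X).fit().summary())
```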
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The smallest eigenvalue is 8.11e-32. This might indicate that there are strong multicollinearity problems or that the design matrix is singular.
Our model suggests that mandatory attendance may have had, on average, a small positive effect on performance, but this was not significant - i.e., it may have been down to sheer luck.
Let’s go a bit further, and add an interaction effect between time and our force variable - i.e., accounting for the fact that performance may vary differently over time, force by force.
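Again a sketch, under the same assumptions - the only change is the force-by-time interaction:

```python
# Interacting force with quarter lets every force follow its own time path;
# the attendance effect is then estimated against those force-specific trends.
formula = ("detected_rate ~ mandatory_attendance + total_offences "
           "+ C(Force) * C(date)")
y, X = patsy.dmatrices(formula, data=force_comparison, return_type='dataframe')
print(sm.OLS(y, X).fit().summary())
```

With 40-odd forces and 20 quarters, that interaction produces an enormous, near-singular design matrix - hence the eigenvalue warning below.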
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The smallest eigenvalue is 2.52e-29. This might indicate that there are strong multicollinearity problems or that the design matrix is singular.
No matter how we model it, we’re again not finding meaningful effects - mandatory attendance does seem to be associated with slightly higher detected rates, on average, but this isn’t statistically significant. Put simply, examining the “statistical noise” in the rest of our model, we can’t confidently say any increases aren’t just due to random luck.
As one final approach, we’ll fit a panel model using the linearmodels library - far from the best way to examine how effects change over time, but not a bad starter for ten.
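Roughly, under the same assumptions as above - PanelOLS wants an (entity, time) MultiIndex, and the EntityEffects/TimeEffects terms match the “Included effects: Entity, Time” in the output below:

```python
# Panel regression with force (entity) and quarter (time) fixed effects.
panel_df = force_comparison.copy()
panel_df['mandatory_attendance'] = panel_df['mandatory_attendance'].astype(int)
panel_df = panel_df.set_index(['Force', 'date'])

mod = PanelOLS.from_formula(
    'detected_rate ~ mandatory_attendance + total_offences + EntityEffects + TimeEffects',
    data=panel_df,
)
print(mod.fit())
```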
F-test for Poolability: 2.3035
P-value: 0.0000
Distribution: F(62,816)

Included effects: Entity, Time
A similar pattern emerges - we see a small increase associated with mandatory attendance, but it’s non-significant: the increases are too small, or too noisy, for us to say they’re due to attendance rather than just random variation in the data.
That said, there are plenty of issues with our approach! We’ve only tried straightforward linear approaches, and given we’re relying on two forces and quarterly data, we’d struggle to detect small effects… hopefully we’ll notice something more stark when I come back to this in a few months for part 2, time series approaches!