Predicting Police Call Demand for Fun (and Prophet)

LAPD Call Prediction for Fun (and Prophet)

Time series prediction turned out to be more complicated than I anticipated when I tackled the subject during my thesis. You can treat the events as independent, but, as with geographic data, everything is related: what happened yesterday affects what happens today, and a Friday in July is not the same as a Monday in October.

There are various weird and wonderful algorithms to cope with these complexities, but Facebook's open source Prophet does a fantastic job of providing a "fire and forget" solution that just works.

This is the code extracted from my Medium blog post.

In [2]:
import pandas as pd
# Prophet 1.0+ is published as `prophet`; this notebook uses the older `fbprophet` package name
from fbprophet import Prophet
from fbprophet.plot import plot_plotly
import plotly.offline as py
from fbprophet.diagnostics import cross_validation
from fbprophet.diagnostics import performance_metrics
from fbprophet.plot import plot_cross_validation_metric
#import chart_studio

Data Cleaning and Aggregating

We'll be using five years of LAPD call data (1 January 2015 to 31 December 2019), aggregated to hourly intervals. Prophet copes with various intervals quite well, so don't worry too much about how you aggregate yours: just try to keep the intervals regular, without too many missing bits.
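If your data starts out as one row per call rather than one row per hour, the aggregation can be sketched roughly as follows. This is not the code I used to build the CSV: the column name `dispatch_time` and the six timestamps are made up for illustration.

```python
import pandas as pd

# Hypothetical raw log: one row per call, stamped with its dispatch time.
raw = pd.DataFrame({
    "dispatch_time": pd.to_datetime([
        "2015-01-01 00:05", "2015-01-01 00:40",   # two calls in the 00:00 hour
        "2015-01-01 01:15",                        # one call in the 01:00 hour
        "2015-01-01 03:30", "2015-01-01 03:45",   # none at 02:00, three at 03:00
        "2015-01-01 03:59",
    ])
})

# One "1" per call, indexed by time, so summing per hour gives the call count.
calls = pd.Series(1, index=pd.DatetimeIndex(raw["dispatch_time"], name="ds"), name="y")

# resample emits a row for every hour in the range, so quiet hours show up
# as 0 instead of silently disappearing.
hourly = calls.resample(pd.Timedelta(hours=1)).sum().reset_index()
print(hourly)
```

Using `resample` rather than a plain group-by is a deliberate choice here: it keeps the intervals regular even when an hour has no calls at all.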

In [3]:
df_by_hour = pd.read_csv("/content/total_calls_per_hour.csv")
In [4]:
df_by_hour["ds"] = pd.date_range(min(df_by_hour["ds"]), max(df_by_hour["ds"]), freq='H')
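The assignment above only works when the file already holds exactly one row per hour; if any hours are missing, the lengths won't match and pandas will refuse. A quick way to check first, sketched on a toy frame with one hour deliberately left out (the frame is illustrative, not the real file):

```python
import pandas as pd

# Toy hourly series with the 02:00 row deliberately missing.
df = pd.DataFrame({
    "ds": pd.to_datetime(["2015-01-01 00:00", "2015-01-01 01:00", "2015-01-01 03:00"]),
    "y": [286, 265, 152],
})

# Every hour between the first and last timestamp.
full_range = pd.date_range(df["ds"].min(), df["ds"].max(), freq=pd.Timedelta(hours=1))

# Timestamps that should exist but do not.
missing = full_range.difference(pd.DatetimeIndex(df["ds"]))
print(len(missing), "missing hour(s):", list(missing))

# Reindex onto the full range so each gap becomes an explicit row with y = NaN.
regular = (
    df.set_index("ds")
      .reindex(full_range)
      .rename_axis("ds")
      .reset_index()
)
print(regular)
```

As far as I know, Prophet tolerates missing values in y, so leaving the gaps as NaN is usually safer than inventing numbers for them.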

Your final cleaned data-set must contain the two columns below: ds (the timestamp) and y (the value to forecast, here the hourly call volume). Everything else, Prophet will deal with.
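As a minimal sketch of getting to those two columns, assuming the source columns are named the way they appear in the output below (the two sample rows are copied from it):

```python
import pandas as pd

# Two sample rows, using the column names shown in the notebook output.
raw = pd.DataFrame({
    "Dispatch Date_Dispatch Time": ["2015-01-01 00:00:00", "2015-01-01 01:00:00"],
    "call volume": [286, 265],
})

# ds must be parseable as a timestamp and y must be numeric.
prophet_df = pd.DataFrame({
    "ds": pd.to_datetime(raw["Dispatch Date_Dispatch Time"]),
    "y": pd.to_numeric(raw["call volume"]),
})
print(prophet_df.dtypes)
```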

In [5]:
df_by_hour.head()
Out[5]:
   Dispatch Date_Dispatch Time  call volume                   ds    y
0          2015-01-01 00:00:00          286  2015-01-01 00:00:00  286
1          2015-01-01 01:00:00          265  2015-01-01 01:00:00  265
2          2015-01-01 02:00:00          179  2015-01-01 02:00:00  179
3          2015-01-01 03:00:00          152  2015-01-01 03:00:00  152
4          2015-01-01 04:00:00          127  2015-01-01 04:00:00  127
In [6]:
df_by_hour.tail()
Out[6]:
       Dispatch Date_Dispatch Time  call volume                   ds    y
43819          2019-12-31 19:00:00          310  2019-12-31 19:00:00  310
43820          2019-12-31 20:00:00          320  2019-12-31 20:00:00  320
43821          2019-12-31 21:00:00          373  2019-12-31 21:00:00  373
43822          2019-12-31 22:00:00          354  2019-12-31 22:00:00  354
43823          2019-12-31 23:00:00          281  2019-12-31 23:00:00  281

Fitting and Deploying

Prophet works much like most scikit-learn-style Python implementations: just fit the data and you're off.

Helpfully, it will also build a data-frame of future dates for you to predict on, and it provides a breakdown of the trend and seasonal components.

In [7]:
m = Prophet()
m.fit(df_by_hour)
INFO:numexpr.utils:NumExpr defaulting to 2 threads.
Out[7]:
<fbprophet.forecaster.Prophet at 0x7fdbb146bdd8>
In [8]:
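# Default frequency is daily, so this appends 365 daily rows; pass freq='H' for hourly ones
future = m.make_future_dataframe(periods=365)
# The forecast holds yhat plus yhat_lower / yhat_upper bounds for every ds
forecast = m.predict(future)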
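Before plotting, it can be worth lining the forecast up against the history to see how far off it is. The sketch below uses a hand-made stand-in for the forecast frame so it runs without Prophet: the three y values come from the head of the data above, while the yhat, yhat_lower and yhat_upper numbers are invented purely for illustration.

```python
import pandas as pd

stamps = pd.to_datetime(["2015-01-01 00:00", "2015-01-01 01:00", "2015-01-01 02:00"])

# Observed call volumes (first three rows of the data).
actual = pd.DataFrame({"ds": stamps, "y": [286, 265, 179]})

# Stand-in for the frame returned by m.predict(); the values are made up.
forecast_demo = pd.DataFrame({
    "ds": stamps,
    "yhat": [280.0, 250.0, 230.0],
    "yhat_lower": [240.0, 210.0, 200.0],
    "yhat_upper": [320.0, 290.0, 260.0],
})

# Join on the timestamp so each observation sits next to its prediction.
merged = actual.merge(forecast_demo, on="ds")

# Mean absolute error of the point forecast.
mae = (merged["y"] - merged["yhat"]).abs().mean()

# Share of observations that fall inside the uncertainty interval.
coverage = merged["y"].between(merged["yhat_lower"], merged["yhat_upper"]).mean()

print(f"MAE: {mae:.1f} calls, interval coverage: {coverage:.0%}")
```

With the real model you would swap `forecast_demo` for `forecast` and `actual` for `df_by_hour[["ds", "y"]]`; the merge only keeps the rows where history exists, so the future rows drop out on their own.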
In [9]:
fig1 = m.plot(forecast)
In [10]:
fig2 = m.plot_components(forecast)
In [13]:
fig = plot_plotly(m, forecast)  # This returns a plotly Figure
fig.show()