Forecasting the weather in Berlin with ARIMA models

Project completed in week 5 (26.10.-30.10.20) of the Data Science Bootcamp at Spiced Academy in Berlin.

I found this week’s project quite challenging, because I hadn’t worked with time series or forecasting before. But this meant I had a lot of new things to learn!

Map it out nicely

I started this week’s challenge with the easy part: creating a choropleth map with the Folium library. The only trick was to find a JSON file with the coordinates of the Berlin districts; then I managed to plot the average recorded temperature in Berlin Mitte.

It’s getting hot in here!

While further exploring the data, I found two interesting facts:

Since 1867, the hottest day in Berlin Mitte was July 11th 1984, with 30.8°C, and the coldest day was February 10th 1929, with -22.6°C.
There were missing values for 145 consecutive days, between April 25th and November 25th 1945. In the context of the Battle of Berlin, I guess there were bigger issues to take care of than recording the daily temperature. I filled in these missing values with the average temperature recorded in the same time frame in the previous year.
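
The gap-filling step can be sketched in pandas. This is only a minimal illustration on synthetic data, not the real Berlin records: the series, the exact gap and the day-by-day copy from the previous year are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Synthetic daily temperatures for 1944-1945 (NOT the real Berlin data).
dates = pd.date_range("1944-01-01", "1945-12-31", freq="D")
day_of_year = dates.dayofyear.to_numpy()
temps = pd.Series(10 - 10 * np.cos(2 * np.pi * day_of_year / 365.25), index=dates)

# Illustrative gap: 145 consecutive missing days starting on April 25th 1945.
gap = pd.date_range("1945-04-25", periods=145, freq="D")
temps.loc[gap] = np.nan

# Look up the same calendar days one year earlier and copy them in.
previous_year = gap - pd.DateOffset(years=1)
temps.loc[gap] = temps.loc[previous_year].to_numpy()

print(temps.isna().sum())
```

Copying values day by day keeps the seasonal shape of the filled stretch, which a single average over the whole gap would flatten.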

In any case, you can see a clear positive trend in the average temperature, which is consistent with global warming.

Average daily temperature in Berlin-Mitte between 1867-2020, going from cold (blue) to hot (red)

I’ve seen the future (with a large estimation error)

One of the most common methods for analyzing time series and making forecasts is the AutoRegressive Integrated Moving Average (ARIMA) model. The name is quite self-explanatory, but I’ll break it down to make it clearer:

Auto-Regressive (AR): the future value depends only on the previous values.
Integrated (I): the series is differenced until it is stationary, which means it has:
- a constant mean (no trend)
- a constant standard deviation
- no seasonality
Moving Average (MA): the future value depends only on the lagged forecast errors.
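
To make the "Integrated" part more concrete, here is a small NumPy sketch of first-order differencing. The series is made up (a linear trend plus noise), so the numbers are only illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(1000)

# Hypothetical series: a linear trend of 0.05 per step plus white noise.
series = 0.05 * t + rng.normal(0.0, 1.0, size=t.size)

# First-order differencing (d = 1): y'_t = y_t - y_{t-1}.
diffed = np.diff(series, n=1)

# With a trend, the mean of the second half is far above the first half.
raw_gap = series[500:].mean() - series[:500].mean()

# After differencing, both halves have roughly the same mean.
diff_gap = diffed[499:].mean() - diffed[:499].mean()

print(raw_gap, diff_gap)
```

The differenced series no longer drifts upwards, which is the kind of constant mean the AR and MA parts rely on.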

Each of these three parts takes one parameter. For the implementation in Python, it’s important to know how to find the optimal value for each of them:

p (AR): the number of lags of the predicted value (y) to be used as predictors. Check with the Partial Autocorrelation plot (PACF).
d (I): the number of differencing steps required to make the time series stationary. This is 0 if the series is already stationary; otherwise, difference the series and check with the Augmented Dickey-Fuller Test (ADF) until it is.
q (MA): the number of lagged forecast errors that should go into the model. Check with the Autocorrelation plot (ACF).

Or you can spare yourself all this work and implement an auto-ARIMA, which finds the best hyperparameters for you. (But, as with TPOT, the library used in my previous project, my advice is to be cautious with auto-ML and instead have fun learning how to tune the hyperparameters yourself.)

I tried both the manual and auto-ARIMA models on my data and eventually got a MAPE score of 0.022 (2.2%), which means that the forecasts for the next 365 observations are off by 2.2% on average (measured in Kelvin), and a MAE score of 6.2, which means that the model predicts the temperatures with an average error of 6.2°C. (Fun fact: at first I got an infinite MAPE score – I learned that’s because MAPE divides by the actual value, which breaks down for temperatures at 0°C and is unreliable for negative ones, so I had to convert the temperature from Celsius to Kelvin.)
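
The two scores and the Celsius problem can be reproduced with a few lines of NumPy. The temperatures below are invented for the example, and `mae` and `mape` are small helper functions written here, not library calls.

```python
import numpy as np


def mae(actual, predicted):
    """Mean absolute error, in the unit of the data."""
    return float(np.mean(np.abs(actual - predicted)))


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.02 means 2%)."""
    with np.errstate(divide="ignore", invalid="ignore"):
        return float(np.mean(np.abs((actual - predicted) / actual)))


# Hypothetical temperatures in Celsius; note the actual value of 0.0.
actual_c = np.array([-5.0, 0.0, 5.0, 20.0])
predicted_c = np.array([-3.0, 2.0, 4.0, 17.0])

mae_c = mae(actual_c, predicted_c)
mape_c = mape(actual_c, predicted_c)  # dividing by 0.0 makes this infinite

# The same data in Kelvin: no zeros, so MAPE is finite.
actual_k = actual_c + 273.15
predicted_k = predicted_c + 273.15
mae_k = mae(actual_k, predicted_k)
mape_k = mape(actual_k, predicted_k)

print(mae_c, mape_c, mae_k, mape_k)
```

The MAE is the same on both scales, while the MAPE in Kelvin looks small partly because every value is shifted up by 273.15, so it is worth reading both scores together.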

Friday Lightning Talk

After I had put so many hours into my ARIMA models, of course I wanted to talk about that part of my project. But about one hour before the presentations, I discovered Facebook’s Prophet – “a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.” This sounded impressive, so I tried it out on my dataset and was surprised at how easy and quick it was to use, how well it performed, and how nice the plots looked with Plotly! So I ended up talking about this magic library that saves you so much of the preprocessing work and yields results at least as good as a strong ARIMA model.