Prophet is an open-source project released by Facebook for forecasting time series data. It is based on an additive model in which non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. An additive model is a non-parametric regression model.
Installing Prophet is very easy (if you are lucky!). If you are using Anaconda:
conda install gcc
conda install -c conda-forge fbprophet
It depends on PyStan. I struggled for a few hours to get the versions sorted out, but if you run into any issues you will eventually get there.
I am not covering any theory here. You can read about how Prophet works at https://facebook.github.io/prophet/
For this analysis, I am using a retail sales dataset.
import pandas as pd
from fbprophet import Prophet
The lines above import pandas and Prophet. The line below loads the data into a dataframe:
df = pd.read_csv('/…./data/sales_data_set.csv')
Take a look at the head of the dataframe:
df.head()
This is not needed for the forecasting, but you may want to see which stores are in the data and how often each one occurs.
Below is partial output
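The counting call itself is not shown above. One common way to do it is pandas' value_counts; the sketch below runs it on a tiny made-up frame (the rows are illustrative, not taken from the real dataset) that reuses the Store and Dept column names.

```python
import pandas as pd

# Toy stand-in for the sales data; the values are made up for illustration.
toy = pd.DataFrame({
    "Store": [1, 1, 1, 2, 2, 3],
    "Dept": [1, 1, 2, 1, 1, 1],
    "Weekly_Sales": [100.0, 120.0, 80.0, 90.0, 95.0, 60.0],
})

# Number of rows per store, most frequent store first.
store_counts = toy["Store"].value_counts()
print(store_counts)
```

On the real dataframe the same call would be df['Store'].value_counts().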
For this walkthrough, I am taking Store 1 and Department 1. The best way to do this is to write a function, but two filters are enough here:
df1 = df.loc[df['Store'] == 1]
df2 = df1.loc[df1['Dept'] == 1]
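The function mentioned above might look something like this. It is only a sketch: select_store_dept is a hypothetical helper name, and the toy rows are made up so the example runs on its own.

```python
import pandas as pd

def select_store_dept(df, store, dept):
    """Return the rows for one store and one department (hypothetical helper)."""
    mask = (df["Store"] == store) & (df["Dept"] == dept)
    # copy() so later column edits do not touch the original frame
    return df.loc[mask].copy()

# Toy data, made up for illustration only.
toy = pd.DataFrame({
    "Store": [1, 1, 1, 2],
    "Dept": [1, 2, 1, 1],
    "Weekly_Sales": [100.0, 80.0, 120.0, 90.0],
})

subset = select_store_dept(toy, store=1, dept=1)
print(subset)
```

Combining both conditions in a single mask avoids the intermediate dataframe and makes it easy to loop over other store and department pairs.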
Prophet needs just the date and the sales data, so copy these two columns into a new dataframe. The model expects the columns to be named 'ds' and 'y', so rename them as well:
df3 = df2[['Date', 'Weekly_Sales']].rename(columns={'Date': 'ds', 'Weekly_Sales': 'y'})
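Prophet treats the ds column as dates, so it can help to parse it explicitly before fitting. The sketch below assumes day-first strings such as 05/02/2010; that format is an assumption made for illustration, so check what your own file contains.

```python
import pandas as pd

# Toy rows with day-first date strings (assumed format, made-up values).
toy = pd.DataFrame({
    "Date": ["05/02/2010", "12/02/2010", "19/02/2010"],
    "Weekly_Sales": [100.0, 120.0, 110.0],
})

# Rename to the column names Prophet expects, then parse the dates
# with an explicit format so day and month cannot be swapped.
prophet_df = toy.rename(columns={"Date": "ds", "Weekly_Sales": "y"})
prophet_df["ds"] = pd.to_datetime(prophet_df["ds"], format="%d/%m/%Y")
prophet_df = prophet_df.sort_values("ds").reset_index(drop=True)
print(prophet_df)
```

An explicit format fails loudly on unexpected strings, which is usually preferable to a silently misread date.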
Now fit the model by instantiating a new Prophet object and calling its fit method, passing in the historical dataframe:
m = Prophet()
m.fit(df3)
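Now predict on a dataframe with a column ds containing the dates for which a prediction is to be made. The sales data is weekly, so here I am predicting the next 52 weeks (roughly one year). The freq argument must be set, because make_future_dataframe counts periods in days by default.
future = m.make_future_dataframe(periods=52, freq='W')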
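To see what periods and freq control, here is a pandas-only approximation of the future dates that make_future_dataframe appends after the history. It is a sketch of the idea rather than Prophet's own code; sketch_future_dates is a hypothetical name, and the last training date and the W-FRI anchor are assumptions made for the illustration.

```python
import pandas as pd

def sketch_future_dates(last_date, periods, freq):
    """Approximate the future dates: `periods` steps of `freq` after last_date."""
    dates = pd.date_range(start=last_date, periods=periods + 1, freq=freq)
    # Keep only dates strictly after the last observation.
    return dates[dates > last_date][:periods]

last_date = pd.Timestamp("2012-10-26")  # hypothetical last observation (a Friday)

daily = sketch_future_dates(last_date, periods=52, freq="D")
weekly = sketch_future_dates(last_date, periods=52, freq="W-FRI")

print(daily[-1].date(), weekly[-1].date())
```

With a daily frequency, 52 periods reach only about seven and a half weeks ahead, while 52 weekly periods cover roughly a year.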
The predict method produces the forecast. forecast is a new dataframe which contains the date (ds), the forecasted value (yhat), and the lower and upper bounds of the uncertainty interval (yhat_lower and yhat_upper).
forecast = m.predict(future)
forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail()
Now plot the data
fig1 = m.plot(forecast)
You can also plot the forecast components.
fig2 = m.plot_components(forecast)
An interactive figure of the forecast can be created with plotly:
from fbprophet.plot import plot_plotly
import plotly.offline as py
py.init_notebook_mode()
fig = plot_plotly(m, forecast) # This returns a plotly Figure
py.iplot(fig)
Now let’s check how good this model is.
Here we do cross-validation to assess prediction performance on a horizon of 180 days, starting with 600 days of training data in the first cutoff and then making predictions every 90 days. This corresponds to 4 total forecasts.
from fbprophet.diagnostics import cross_validation
df_cv = cross_validation(m, initial='600 days', period='90 days', horizon='180 days')
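The way initial, period, and horizon interact is easier to see by listing the cutoff dates. The sketch below is a simplified approximation of that logic, not Prophet's implementation; sketch_cutoffs is a hypothetical name, and the three-year date range is made up rather than taken from the dataset.

```python
from datetime import date, timedelta

def sketch_cutoffs(first_date, last_date, initial, period, horizon):
    """Walk back from the last possible cutoff in steps of `period`,
    keeping cutoffs that leave at least `initial` of training data."""
    cutoffs = []
    cutoff = last_date - horizon  # the latest cutoff with a full horizon after it
    while cutoff >= first_date + initial:
        cutoffs.append(cutoff)
        cutoff -= period
    return sorted(cutoffs)

# Hypothetical three-year date range, not the real dataset's.
cutoffs = sketch_cutoffs(
    first_date=date(2010, 1, 1),
    last_date=date(2012, 12, 31),
    initial=timedelta(days=600),
    period=timedelta(days=90),
    horizon=timedelta(days=180),
)
print(len(cutoffs), cutoffs)
```

On this made-up range the sketch yields 4 cutoffs. The number for real data depends on its first and last dates.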
The statistics computed are mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), mean absolute percent error (MAPE), and coverage of the yhat_lower and yhat_upper estimates.
from fbprophet.diagnostics import performance_metrics
df_p = performance_metrics(df_cv)
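For reference, here is how those statistics are defined, written out in plain Python. This is only a sketch: sketch_metrics is a hypothetical name, the numbers are made up, and it averages over all rows at once, whereas performance_metrics computes the metrics over a rolling window of the horizon.

```python
import math

def sketch_metrics(y, yhat, yhat_lower, yhat_upper):
    """Plain definitions of the error metrics over all rows (no rolling window)."""
    n = len(y)
    errors = [actual - pred for actual, pred in zip(y, yhat)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE divides by the actual value, so it is undefined when y contains 0.
    mape = sum(abs(e / actual) for e, actual in zip(errors, y)) / n
    # Coverage: share of actual values that fall inside the interval.
    inside = [lo <= actual <= hi for actual, lo, hi in zip(y, yhat_lower, yhat_upper)]
    coverage = sum(inside) / n
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae,
            "mape": mape, "coverage": coverage}

# Made-up values for illustration.
metrics = sketch_metrics(
    y=[100.0, 200.0, 400.0],
    yhat=[110.0, 180.0, 400.0],
    yhat_lower=[105.0, 150.0, 380.0],
    yhat_upper=[120.0, 210.0, 420.0],
)
print(metrics)
```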
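Cross-validation performance metrics can be visualized with plot_cross_validation_metric, here shown for MAPE:
from fbprophet.plot import plot_cross_validation_metric
fig = plot_cross_validation_metric(df_cv, metric='mape')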
All the very best!!