# Retail sales forecast using Facebook’s Prophet

Prophet is an open-source project released by Facebook for forecasting time series data. It is based on an additive model in which non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. An additive model is a non-parametric regression model.

*Install Prophet:*

Installing Prophet is very easy (if you are lucky!). If you are using Anaconda:

conda install gcc

conda install -c conda-forge fbprophet

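It has a dependency on PyStan. I struggled for a few hours to get the versions sorted out, but if you run into any issues you will eventually get there. (From version 1.0 onward the package is published as `prophet` instead of `fbprophet`; this post uses the older name.)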

I am not covering any theory. You can find how Prophet works at https://facebook.github.io/prophet/

For this analysis, I am using a retail sales dataset.

import pandas as pd

from fbprophet import Prophet

The lines above import pandas and Prophet. The lines below load the data into a dataframe.

df = pd.read_csv('/…./data/sales_data_set.csv')

df.head()

This shows the first few rows of the dataframe.

This is not needed for the forecasting, but if you would like to see which stores are present and how many rows each one has:

df['Store'].value_counts()

Below is partial output

For this forecast, I am taking Store 1 and Department 1. The best way to do this is to write a function, but here I filter step by step.

df1 = df.loc[df['Store'] == 1]

df2 = df1.loc[df1['Dept'] == 1]

df2.head()
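The filter above can be wrapped in a small helper, as the text suggests. This is only a minimal sketch: the function name `select_store_dept` and the toy dataframe are made up for illustration, and only the column names `Store`, `Dept`, `Date`, and `Weekly_Sales` come from the dataset used here.

```python
import pandas as pd


def select_store_dept(df, store, dept):
    """Return the rows of df for one store and one department (hypothetical helper)."""
    # Build both conditions from the same dataframe so the mask aligns with it.
    mask = (df["Store"] == store) & (df["Dept"] == dept)
    # .copy() gives an independent frame, which avoids pandas warnings
    # when the result is modified later.
    return df.loc[mask].copy()


# Toy data, invented only to show the call; this is not the real sales file.
sample = pd.DataFrame({
    "Store": [1, 1, 1, 2],
    "Dept": [1, 1, 2, 1],
    "Date": ["2010-02-05", "2010-02-12", "2010-02-05", "2010-02-05"],
    "Weekly_Sales": [10.0, 20.0, 30.0, 40.0],
})

subset = select_store_dept(sample, store=1, dept=1)
print(subset)
```

With the real data, the two filtering steps would collapse into something like `df2 = select_store_dept(df, 1, 1)`.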

Prophet needs just the date and sales data. So copy these two columns into a new dataframe; the model expects them to be named `ds` and `y`.

df3 = df2[['Date', 'Weekly_Sales']].rename(columns={'Date': 'ds', 'Weekly_Sales': 'y'})

df3.head()

Now fit the model by instantiating a new Prophet object, then call its `fit` method and pass in the historical dataframe.

m = Prophet()

m.fit(df3)

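Now predict on a dataframe with a column `ds` containing the dates for which a prediction is to be made. Here I extend the history by 52 periods. Note that `make_future_dataframe` uses a daily frequency by default, so `periods=52` adds 52 days; since the sales data is weekly, pass a weekly `freq` as well if you want 52 weeks (about 365 days) ahead.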

future = m.make_future_dataframe(periods=52)

future.tail()

The `predict` method does the forecasting. It returns a new dataframe, `forecast`, which contains the date (`ds`), the forecasted value (`yhat`), and the lower and upper bounds of the uncertainty interval (`yhat_lower` and `yhat_upper`).

forecast = m.predict(future)

forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail()

Now plot the data

fig1 = m.plot(forecast)

You can also plot the forecast components.

fig2 = m.plot_components(forecast)

An interactive figure of the forecast can also be created with plotly.

from fbprophet.plot import plot_plotly

import plotly.offline as py

py.init_notebook_mode()

fig = plot_plotly(m, forecast)  # This returns a plotly Figure

py.iplot(fig)

Now let’s check how good this model is.

Here we do cross-validation to assess prediction performance on a horizon of 180 days, starting with 600 days of training data in the first cutoff and then making predictions every 90 days. This corresponds to 4 total forecasts.

from fbprophet.diagnostics import cross_validation

df_cv = cross_validation(m, initial='600 days', period='90 days', horizon='180 days')

df_cv.head()
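To see where the number of forecasts comes from, the cutoff rule described above can be sketched in plain Python. This is a simplified illustration, not Prophet's actual implementation: the function name `cutoff_days` and the 1,050-day history length are assumptions chosen for the example, and the real function works on the actual dates in the data and handles details this sketch ignores.

```python
def cutoff_days(history_days, initial, period, horizon):
    """Day offsets (from the first observation) at which forecasts start.

    Simplified rule: the last cutoff leaves exactly `horizon` days of data
    after it, earlier cutoffs step back by `period`, and every cutoff must
    have at least `initial` days of training data before it.
    """
    cutoffs = []
    cutoff = history_days - horizon
    while cutoff >= initial:
        cutoffs.append(cutoff)
        cutoff -= period
    return sorted(cutoffs)


# Hypothetical history length in days; the real value depends on the dataset.
cutoffs = cutoff_days(history_days=1050, initial=600, period=90, horizon=180)
print(cutoffs)       # [600, 690, 780, 870]
print(len(cutoffs))  # 4
```

Under this rule a shorter history yields fewer forecasts, so the count you get depends on how many days of data you have.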

The statistics computed are mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), mean absolute percent error (MAPE), and coverage of the `yhat_lower` and `yhat_upper` estimates.

from fbprophet.diagnostics import performance_metrics

df_p = performance_metrics(df_cv)

df_p.head()
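The definitions behind these metrics are simple enough to write out by hand. The sketch below uses invented numbers rather than real output, and it averages over all rows at once, whereas `performance_metrics` reports the metrics as a function of the forecast horizon, so treat it as an illustration of the formulas only. The function name `summary_metrics` is hypothetical.

```python
import math


def summary_metrics(y, yhat, yhat_lower, yhat_upper):
    """Overall error metrics for actuals y and forecasts yhat (illustrative)."""
    n = len(y)
    errors = [actual - pred for actual, pred in zip(y, yhat)]
    mse = sum(e ** 2 for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE divides by the actual value, so it is undefined when any y is 0.
    mape = sum(abs(e / actual) for e, actual in zip(errors, y)) / n
    # Coverage: share of actuals that fall inside the uncertainty interval.
    inside = sum(
        lo <= actual <= hi
        for actual, lo, hi in zip(y, yhat_lower, yhat_upper)
    )
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mae,
        "mape": mape,
        "coverage": inside / n,
    }


# Invented values for illustration only.
y = [100.0, 200.0, 300.0, 400.0]
yhat = [110.0, 190.0, 330.0, 360.0]
yhat_lower = [90.0, 180.0, 310.0, 340.0]
yhat_upper = [120.0, 210.0, 350.0, 390.0]

metrics = summary_metrics(y, yhat, yhat_lower, yhat_upper)
print(metrics)
```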

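Cross-validation performance metrics can be visualized with `plot_cross_validation_metric`, here shown for MAPE.

from fbprophet.plot import plot_cross_validation_metric

fig3 = plot_cross_validation_metric(df_cv, metric='mape')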

All the very best!!