At the most recent #MeasureCamp, Carmen Mardiros gave a presentation on monitoring ecommerce health. I didn't see it because I was unable to be at #MeasureCamp, but the blog post is interesting, worth a read, and hopefully gives a complete enough picture of what she was saying.
At #MeasureCamp number two I gave a presentation with some ideas to help marketers prioritise which areas to investigate by having some idea of what's happening. Combining some of my old ideas from that presentation with Carmen's more in-depth thoughts has led me to conclude that accurate forecasts are very important for web analysts looking to monitor and respond to changes.
We look (always along actionable dimensions) for things that are a bit odd or different; these provide the questions that lead to insights and improvements. But doing this requires an understanding of what is normal; what should this value be at this point in time?
Good forecasts help provide part of this answer.
But making a good forecast is not always that simple.
Here is a really simple forecast done with some real visit data (held in the variable "raw"):

    #load forecast library (fpp also loads the forecast package)
    library(fpp)
    library(ggplot2)
    #Convert visits data to a timeseries
    visits <- ts(raw$Visits, frequency=1)
    #Let forecast() choose a model automatically
    basicforecast <- forecast(visits)
    plot(basicforecast, xlim=c(800,850), xlab="Days", ylab="Visits", main="")

The blue line is the expected value, the dark grey area is an 80% prediction interval and the light grey area is a 95% prediction interval.
This forecast is pretty much useless as it doesn't even pick up the obvious weekly seasonality in the data.
To make this work better we can pick a more appropriate forecasting model and help it out by specifying that we are interested in weekly seasonality.

    #Seven observations per cycle, so the model can pick up weekly seasonality
    visits <- ts(raw$Visits, frequency=7)
    weekforecast <- forecast(visits)
    plot(weekforecast, xlim=c(110,125), xlab="Weeks", ylab="Visits", main="")

This is slightly more useful as the forecast takes the weekly pattern into account.
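To tie this back to spotting things that are "a bit odd", here is a minimal sketch of comparing new observations against the forecast's 95% interval. The data is made up (the `history` and `actual` vectors are hypothetical stand-ins for `raw$Visits` and for the following week's real numbers), so treat it as an illustration of the idea rather than a tested monitoring setup.

```r
library(forecast)

# Hypothetical stand-in for raw$Visits: 20 weeks of daily visits with a weekend dip
set.seed(42)
weekly_shape <- c(1.0, 1.05, 1.05, 1.0, 0.9, 0.6, 0.65)
history <- round(400 * rep(weekly_shape, times = 20) + rnorm(140, sd = 15))
visits <- ts(history, frequency = 7)

# Forecast the next 7 days with 80% and 95% intervals
fc <- forecast(visits, h = 7, level = c(80, 95))

# Hypothetical observed values for the following week; day 3 is made unusually low
actual <- round(400 * weekly_shape)
actual[3] <- 100

# A day counts as "odd" if it falls outside the 95% interval
lower95 <- as.numeric(fc$lower[, "95%"])
upper95 <- as.numeric(fc$upper[, "95%"])
odd <- actual < lower95 | actual > upper95

print(data.frame(day = 1:7, actual, lower95 = round(lower95),
                 upper95 = round(upper95), odd))
```

Days flagged in the `odd` column are the ones worth a closer look. How wide the interval should be before something counts as odd is a judgment call; 95% is only the default used here.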
Similar methods could be used for more granular tracking; if you wanted to forecast for Monday morning rather than Monday as a whole, export the visits data by half day instead of by day and set the frequency to 14 (as there are 14 half days in a week).
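A sketch of the half-day version, again on invented numbers: `am_pm_shape` and `half_days` are hypothetical and assume the export starts on a Monday morning and alternates morning, afternoon.

```r
library(forecast)

# Hypothetical half-day export: 14 values per week (AM, PM for Mon..Sun), 12 weeks
set.seed(1)
am_pm_shape <- c(240, 160, 250, 170, 250, 170, 240, 160,
                 215, 145, 140, 100, 150, 110)
half_days <- round(rep(am_pm_shape, times = 12) + rnorm(14 * 12, sd = 10))

# 14 half days per weekly cycle
visits_half <- ts(half_days, frequency = 14)
halfforecast <- forecast(visits_half, h = 14)

# First forecast value is the next half day after the data ends,
# i.e. the following Monday morning under the assumption above
print(halfforecast$mean[1])
```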
One thing that I'm not sure how to do yet is combine different frequencies. Businesses can have a weekly, monthly and a yearly cycle and these must be combined for improved forecasting accuracy.
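One candidate worth trying is the multi-seasonal time series class `msts()` from the forecast package, which accepts several seasonal periods at once; `stlf()` (or the slower `tbats()`) can then forecast from it. The sketch below uses synthetic data with a weekly and a yearly cycle, not the real visits, and the choice of periods (7 and 365) is an assumption. It has not been checked against the data in this post.

```r
library(forecast)

# Hypothetical three years of daily visits with a weekly and a yearly cycle
set.seed(7)
n_days <- 3 * 365
day <- seq_len(n_days)
weekly <- rep(c(1.0, 1.05, 1.05, 1.0, 0.9, 0.6, 0.65), length.out = n_days)
yearly <- 1 + 0.3 * sin(2 * pi * day / 365)
history <- round(400 * weekly * yearly + rnorm(n_days, sd = 15))

# Declare both seasonal periods instead of a single frequency
visits_multi <- msts(history, seasonal.periods = c(7, 365))

# STL-based forecast that removes both seasonal components before modelling
multiforecast <- stlf(visits_multi, h = 14)
plot(multiforecast, xlab = "Years", ylab = "Visits", main = "")
```

Note that each cycle needs at least two full periods of history to be estimated, so a yearly cycle requires more than two years of daily data.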
Also, as the frequency gets longer problems with limited data and less significant trends (who has a monthly trend as powerful as their weekend dip?) cause large uncertainty in the estimates.

    #Treat 30 days as one "monthly" cycle
    visits <- ts(raw$Visits, frequency=30)
    monthforecast <- forecast(visits)
    plot(monthforecast, xlim=c(25,31), ylim=c(0,600), xlab="Months", ylab="Visits", main="")

Again, less useful than I hoped.