Finding the ROI of Title tag changes using Google's CausalImpact R package

After a conversation on Twitter about this new package, and mentioning it in my recent MeasureCamp presentation, here is a quick demo on using Google's CausalImpact applied to an SEO campaign.

CausalImpact is a package that looks to give some statistics behind changes you may have done in a marketing campaign.  It examines the time-series of data before and after an event, and gives you some idea on whether any changes were just down to random variation, or the event actually made a difference.

You can now test this yourself in my Shiny app that automatically pulls in your Google Analytics data so that you can apply CausalImpact to it.   This way you can A/B test changes for all your marketing channels, not just SEO.  However, if you want to try it manually yourself, keep reading.

Considerations before getting the data

Suffice to say, it should only be applied to time-series data (e.g. there is date or time on the x-axis), and it helps if the event was rolled out on only one of those time points.  This may influence the choice of time unit you use, so if say it rolled out over a week its probably better to use weekly data exports.  Also consider the time period you choose.  The package will use the time-series before the event to construct what it thinks should happen vs what actually happened, so if anything unusual or spikes occur in the test period it may affect your results.

Metrics wise the example here is with visits.  You could perhaps do it with conversions or revenue, but then you may get affected by factors outside of your control (the buy button breaking etc.), so for clean results try to take out as many confounding variables as possible. 

Example with SEO Titles

For me though, I had an example where some title tag changes went live on one day, so could compare the SEO traffic before and after to judge if it had any effect, and also more importantly judge how much extra traffic had increased.

I pulled in data with my go-to GA R import library, rga by Skardhamar.

Setup

I first setup, importing the libraries if you haven't got them and authenticating the GA account you want to pull data from.

Import GA data

I then pull in the data for the time period covering the event.  SEO Visits by date.

Apply CausalImpact

In this example, the title tags got updated on the 200th day of the time-period I pulled.  I want to examine what happened the next 44 days.

Plot the Results

With the plot() function you get output like this:

  1. The left vertical dotted line is where the estimate on what should have happened is calculated from.
  2. The right vertical dotted line is the event itself. (SEO title tag update)
  3. The original data you pulled is the top graph.
  4. The middle graph shows the estimated impact of the event per day.
  5. The bottom graph shows the estimated impact of the event overall.

In this example it can be seen that after 44 days there is an estimated 90,000 more SEO visits from the title tag changes. This then can be used to work out the ROI over time for that change.

Report the results

The $report method gives you a nice overview of the statistics in a verbose form, to help qualify your results.  Here is a sample output:

"During the post-intervention period, the response variable had an average value of approx. 94. By contrast, in the absence of an intervention, we would have expected an average response of 74. The 95% interval of this counterfactual prediction is [67, 81]. Subtracting this prediction from the observed response yields an estimate of the causal effect the intervention had on the response variable. This effect is 20 with a 95% interval of [14, 27]. For a discussion of the significance of this effect, see below.

Summing up the individual data points during the post-intervention period (which can only sometimes be meaningfully interpreted), the response variable had an overall value of 4.16K. By contrast, had the intervention not taken place, we would have expected a sum of 3.27K. The 95% interval of this prediction is [2.96K, 3.56K].

The above results are given in terms of absolute numbers. In relative terms, the response variable showed an increase of +27%. The 95% interval of this percentage is [+18%, +37%].

This means that the positive effect observed during the intervention period is statistically significant and unlikely to be due to random fluctuations. It should be noted, however, that the question of whether this increase also bears substantive significance can only be answered by comparing the absolute effect (20) to the original goal of the underlying intervention.

The probability of obtaining this effect by chance is very small (Bayesian tail-area probability p = 0.001). This means the causal effect can be considered statistically significant."

Next steps

This could then be repeated for things like UX changes, TV campaigns, etc. You just need the time of the event and the right metrics or KPIs to measure them against.

The above is just a brief intro, there is a lot more that can be done with the package including custom models etc, for more see the package help file and documentation.