Enhance Your Google Analytics Data with R and Shiny (Free Online Dashboard Template)

Introduction

The aim of this post is to give you the tools to enhance your Google Analytics data with R and present it on-line using Shiny.  By following the steps below, you should have your own on-line GA dashboard, with these features:

  • Interactive trend graphs.

  • Auto-updating Google Analytics data.

  • Zoomable day-of-week heatmaps.

  • Top Level Trends via Year on Year, Month on Month and Last Month vs Month Last Year data modules.

  • A MySQL connection for data blending your own data with GA data.

  • An easy upload option to update a MySQL database.

  • Analysis of the impact of marketing events via Google's CausalImpact.

  • Detection of unusual time-points using Twitter's Anomaly Detection.

A lot of these features are either unavailable in the normal GA reports, or only possible in Google Analytics Premium.  Under the hood, the dashboard is exporting the data via the Google Analytics Reporting API, transforming it with various R statistical packages and then publishing it on-line via Shiny.

A live demo of the dashboard template is available on my Shinyapps.io account with dummy GA data, and all the code used is on Github here.

Feature Detail

Here are some details on what modules are within the dashboard.  A quick start guide on how to get the dashboard running with your own data is at the bottom.

Trend Graph

Most dashboards feature a trend plot, so you can quickly see how you are doing over time.  The dashboard uses dygraphs javascript library, which allows you to interact with the plot to zoom, pan and shift your date window.  Plot smoothing has been provided at the day, week, month and annual level.

Screen Shot 2015-07-17 at 221225png

Additionally, the events you upload via the MySQL upload also appear here, as well as any unusual time points detected as anomalies.  You can go into greater detail on these in the Analyse section.

Heatmap

Heatmaps use colour intensity to show metrics between categories.  The heatmap here is split into weeks and day per week, so you can quickly scan to see if a particular day of the week is popular - in the below plot, Monday/Tuesday look like they are best days for traffic.  

Screen Shot 2015-07-19 at 114241png

The data window is set by what you select in the trend graph, and you can zoom for more detail using the mouse.

Top Level Trends

Quite often headlines just need a number to quickly check.  These data modules give you a quick glance into how you are doing, comparing last week to the week before, last month to the month before and last month to the same month the year before.  Between them, you should see how your data is trending, accounting for seasonal variation.

Screen Shot 2015-07-19 at 114335png

MySQL Connection

The code provides functions to connect to a MySQL database, which you can use to blend your data with Google Analytics, provided you have a key to link them on.  

Screen Shot 2015-07-17 at 221124png

In the demo dashboard the key used is simply the date, but this can be expanded to include linking on a userID from say a CRM database to the Google Analytics CID, Transaction IDs to off-line sales data, or extra campaign information to your campaign IDs.  An interface is also provided to let end users update the database by uploading a text file.

CausalImpact

In the demo dashboard, the MySQL connection is used to upload Event data, which is then used to compare with the Google Analytics data to see if the event had a statistically significant impact on your traffic.  This replicates a lot of the functionality of the GA Effect dashboard.

Screen Shot 2015-07-17 at 221136png

Headline impact of the event is shown in the summary dashboard tab.  If its statistically significant, the impact is shown in blue.

Screen Shot 2015-07-19 at 160843png

Anomaly Detection

Twitter has released this R package to help detect unusual time points for use within their data streams, which is also handy for Google Analytics trend data.  

Screen Shot 2015-07-17 at 221151png

The annotations on the main trend plot are indicated using this package, and you can go into more detail and tweak the results in the Analyse section.

Making the dashboard multi-user

In this demo I’ve taken the usual use case of an internal department just looking to report on one Google Analytics property, but if you would like end users to authenticate with their own Google Analytics property, it can be combined with my shinyga() package, which provides functions which enable self authentication, similar to my GA Effect/Rollup/Meta apps.

In production, you can publish the dashboard behind a Shinyapps authentication login (needs a paid plan), or deploy your own Shiny Server to publish the dashboard on your company intranet.

Quick Start

Now you have seen the features, the below goes through the process for getting this dashboard for yourself. This guide assumes you know of R and Shiny - if you don’t then start there: http://shiny.rstudio.com/

You don’t need to have the MySQL details ready to see the app in action, it will just lack persistent storage.

Setup the files

  1. Clone/copy-paste the scripts in the github repository to your own RStudio project.

  2. Find your GA View ID you want to pull data from.  The quickest way to find it is to login to your Google Analytics account, go to the View then look at the URL: the number after “p” is the ID.

  3. [Optional] Get your MySQL setup with a user and IP address. See next section on how this is done using Google Cloud SQL.  You will also need to white-list the IP of where your app will sit, which will be your own Shiny Server or shinyapps.io. Add your local IP for testing too. If using shinyapps.io their IPs are: 54.204.29.251; 54.204.34.9; 54.204.36.75; 54.204.37.78.

  4. Create a file called secrets.R file in the same directory as the app with the below content filled in with your details.  

Configuring R

    1. Make sure you can install and run all the libraries needed by the app:

    2. Run the below command locally first, to store the auth token in the same folder.  You will be prompted to login with the Google account that has access to the GA View ID you put into step 3, and get a code to paste into the R console.  This will then be uploaded with app and handle the authentication with Google Analytics when in production:

        > rga::rga.open(where="token.rga")

    3. Test the app by hitting the “Run App” button at the top right of the server.ui script in RStudio, or by running:

        > shiny::runApp()

Using the dashboard

  1. The app should now be running locally in a browser window with your own GA data.  It can take up to 30 seconds for all the data to load first time.

  2. Deploy the instance on-line to Shinyapps.io with a free account there, or to your own Shiny Server instance.

  3. Customise your instance. If for any reason you don’t want certain features, then remove the feature in the ui.R script - the data is only called when the needed plot is viewed.

Getting a MySQL setup through Google Cloud SQL

If you want a MySQL database to use with the app, I use Google Cloud SQL.  Setup is simple:
  1. Go to the Google API console and create a project if you need to.

  2. Make sure you have billing turned on with your billing accounts menu top right.

  3. Go to Storage > Cloud SQL in the left hand menu.

  4. Create a New Instance.

  5. Create a new Database called “onlinegashiny”

  6. Under “Access Control” you need to put in the IP of yourself where you test it, as well as the IPs of the Shiny Server/shinyapps.io.  If you are using shinyapps.io the IPs are: 54.204.29.251; 54.204.34.9; 54.204.36.75;54.204.37.78

  7. Under “IP Address” create a static IP (Charged at $0.24 a day)

  8. You now should have all the access info you need to put in the apps secrets.R for MySQL access.  The port should be a default 3306

  9. You can also limit the amount of data that is uploaded by the shiny.maxRequestSize option - default is 0.5 MB.

Summary

Hopefully the above could help inspire what can be done with your Google Analytics data.  Focus has been on trying to give you the tools that allow action to be made on your data.

There is a lot more you can do via the thousands of R packages available, but hopefully this gives a framework you can build upon.

I’d love to see what you build with it, so do please feel free to get in touch. :)
15 responses
Thank you! This is awesome!
Thanks Patrick, let me know what you build from it :)
Hey Mark, I'm a big fan of your Blog. I would like to implement some of your fancy shiny apps. Do you see any drawbacks of using library(RGA) instead of library(rga) for that shiny-GA setup? Thanx!
Dear Eugene, thanks very much! Its probably better to port over to RGA over rga these days as rga isn't maintained so much by Bror who created it. However you can see a list of the GA -> R libraries I know of, and even brave trying my own R library, googleAnalyticsR, if you follow this link: https://github.com/MarkEdmondson1234/googleAnal...
...and if its Shiny apps you want to make that ask for a user to login to see their own data via OAuth2, I would recommend my library over the others, as the authentication tokens need its setup. If its just your own data, then the others are more fully featured.
Hi Mark! I tried to put in some data from a CSV. However I always get some error in "strptime" function. Message: "String was too long" (my translation from german message) This is the content of the CSV date,eventname 2015-11-05,Michael Sinner: Adblocker im GTM erfassen 2015-10-02,Michael Sinner: Apptracking und Debugging ausw?hlen 2015-08-24,Michael Sinner: Google Analytics Realtime API ? Live Dashboard in Google Spreadsheets ausw?hlen The error happens in this code line: dates_guessed
Hi Mischa, I think there must be a bad date format in your file somewhere, is that all the content you list there or is there more? Try uploading a subset until you see the problem line. Otherwise, perhaps a missing comma(,) is messing up the upload in general. Perhaps you are using the european standard of semi-colon instead as the seperator value? (;) A good fail safe may be to load the file you want in a local version of RStudio, then export that using write.csv() and upload that file instead, as that should take care of formatting etc. If you are exporting a csv from Ecel for example it can get messy quickly. Mark
Somehow I get the error message: replacement has 36 rows, data has 18 if I try to upload the marketing events. The connetction to SQL works. Do you have an idea why this happens?
Hi Patrick - I guess your data may have duplicate dates? Hard to say though, with your local copy put a break point or write to a file so you can see where its happening.
Hey Mark, thanks for sharing the code for that cool dashboard! I just wanted to know the meaning of: scriptHeaders
Particularly its meaning in ui.R in fluidRow. (somehow my initial question was truncated :-) )
Hi Eugene, thats for any external scripts you want to put in such as tracking scripts - here is a reference http://stackoverflow.com/questions/23599268/inc...
3 visitors upvoted this post.