Google API Client Library for R: googleAuthR v0.1.0 now available on CRAN

One of the problems with working with Google APIs is that quite often the hardest bit, authentication, comes right at the start.  This presents a big hurdle for those who want to work with them - it certainly delayed me.  In particular, getting Google authentication to work with Shiny is problematic, as the token itself needs to be reactive and applicable only to the user who is authenticating.

But no longer! googleAuthR provides helper functions to make it easy to work with Google APIs.  And it's now available on CRAN (my first CRAN package!) so you can install it easily by typing:

> install.packages("googleAuthR")

It should then load, and you can get started by looking at the README on GitHub or by typing:

> vignette("googleAuthR")

After my experiences making shinyga and searchConsoleR, I decided that reinventing the authentication wheel each time wasn't necessary, so I worked on this new R package to smooth out that pain point.

googleAuthR provides easy authentication with Google APIs, either from R or within a Shiny app.  It also provides a function factory you can use to generate your own functions that call the API actions you need.

At last count there were 83 Google APIs, many of which have no R library, so hopefully this package can help with that.  Examples include the Google Prediction API, the YouTube Analytics API, the Gmail API and many more.

Example using googleAuthR

Here is an example of making a goo.gl R package using googleAuthR:
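
A minimal sketch of the pattern - the scope option, endpoint and parse function below follow the googleAuthR documentation for the goo.gl API, but treat the details as illustrative rather than the exact published example:

library(googleAuthR)

## the goo.gl API scope (set before authenticating)
options(googleAuthR.scopes.selected = "https://www.googleapis.com/auth/urlshortener")

shorten_url <- function(url){

  body <- list(longUrl = url)

  ## function factory: returns a function that POSTs to the urlshortener endpoint
  f <- gar_api_generator("https://www.googleapis.com/urlshortener/v1/url",
                         "POST",
                         data_parse_function = function(x) x$id)

  f(the_body = body)
}

## authenticate once per session, then call the generated function
gar_auth()
shorten_url("http://code.markedmondson.me")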

If you then want to make this multi-user in Shiny, then you just need to use the helper functions provided:
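
A rough sketch of the Shiny side, reusing the shorten_url() idea above - note that the helper names here (googleAuthUI(), googleAuth and with_shiny()) are taken from the googleAuthR documentation and some appeared in releases after v0.1.0, so check your installed version for the exact equivalents:

library(shiny)
library(googleAuthR)

## wrap the API call so it can accept a per-user Shiny token
shorten_url_shiny <- function(url, shiny_access_token){
  body <- list(longUrl = url)
  f <- gar_api_generator("https://www.googleapis.com/urlshortener/v1/url",
                         "POST",
                         data_parse_function = function(x) x$id)
  with_shiny(f, shiny_access_token = shiny_access_token, the_body = body)
}

ui <- fluidPage(
  googleAuthUI("gauth"),              # login button for each user
  textInput("url", "URL to shorten"),
  textOutput("short_url")
)

server <- function(input, output, session){

  ## each user gets their own reactive access token
  access_token <- callModule(googleAuth, "gauth")

  output$short_url <- renderText({
    req(input$url, access_token())
    shorten_url_shiny(input$url, access_token())
  })
}

shinyApp(ui, server)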





Enhance Your Google Analytics Data with R and Shiny (Free Online Dashboard Template)

Introduction

The aim of this post is to give you the tools to enhance your Google Analytics data with R and present it on-line using Shiny.  By following the steps below, you should have your own on-line GA dashboard, with these features:

  • Interactive trend graphs.

  • Auto-updating Google Analytics data.

  • Zoomable day-of-week heatmaps.

  • Top Level Trends via Year on Year, Month on Month and Last Month vs Month Last Year data modules.

  • A MySQL connection for data blending your own data with GA data.

  • An easy upload option to update a MySQL database.

  • Analysis of the impact of marketing events via Google's CausalImpact.

  • Detection of unusual time-points using Twitter's Anomaly Detection.

A lot of these features are either unavailable in the normal GA reports, or only possible in Google Analytics Premium.  Under the hood, the dashboard is exporting the data via the Google Analytics Reporting API, transforming it with various R statistical packages and then publishing it on-line via Shiny.

A live demo of the dashboard template is available on my Shinyapps.io account with dummy GA data, and all the code used is on Github here.

Feature Detail

Here are some details on what modules are within the dashboard.  A quick start guide on how to get the dashboard running with your own data is at the bottom.

Trend Graph

Most dashboards feature a trend plot, so you can quickly see how you are doing over time.  The dashboard uses the dygraphs JavaScript library, which allows you to interact with the plot to zoom, pan and shift your date window.  Plot smoothing is provided at the day, week, month and annual level.
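
A minimal sketch of the kind of call involved, assuming an xts series of daily sessions (the gadata object and column names are placeholders, not the dashboard's exact code):

library(dygraphs)
library(xts)

## gadata: data.frame with a Date column `date` and a numeric `sessions` column
sessions_xts <- xts(gadata$sessions, order.by = gadata$date)

dygraph(sessions_xts, main = "Sessions") %>%
  dyRoller(rollPeriod = 7) %>%   # smooth to a rolling weekly average
  dyRangeSelector()              # drag to zoom, pan and shift the date window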

[Screenshot: interactive trend graph]

Additionally, the events you upload via the MySQL upload also appear here, as well as any unusual time points detected as anomalies.  You can go into greater detail on these in the Analyse section.

Heatmap

Heatmaps use colour intensity to show metrics between categories.  The heatmap here is split into weeks and days of the week, so you can quickly scan to see if a particular day of the week is popular - in the below plot, Monday/Tuesday look like they are the best days for traffic.
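
As a rough illustration of the idea (not necessarily the heatmap package the dashboard itself uses), a week versus day-of-week heatmap can be sketched with ggplot2, again assuming a gadata data.frame of daily sessions:

library(ggplot2)

gadata$week <- format(gadata$date, "%Y-%W")
gadata$day  <- factor(weekdays(gadata$date),
                      levels = c("Monday", "Tuesday", "Wednesday", "Thursday",
                                 "Friday", "Saturday", "Sunday"))

ggplot(gadata, aes(x = week, y = day, fill = sessions)) +
  geom_tile() +
  scale_fill_gradient(low = "white", high = "darkblue") +
  labs(x = "Week", y = NULL, fill = "Sessions")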

[Screenshot: day-of-week heatmap]

The data window is set by what you select in the trend graph, and you can zoom for more detail using the mouse.

Top Level Trends

Quite often a headline number is all you need for a quick check.  These data modules give you a quick glance at how you are doing, comparing last week to the week before, last month to the month before, and last month to the same month the year before.  Between them, you should see how your data is trending, accounting for seasonal variation.

[Screenshot: top level trend data modules]

MySQL Connection

The code provides functions to connect to a MySQL database, which you can use to blend your data with Google Analytics, provided you have a key to link them on.  

[Screenshot: MySQL data blending]

In the demo dashboard the key used is simply the date, but this can be expanded to include linking a userID from, say, a CRM database to the Google Analytics CID, transaction IDs to offline sales data, or extra campaign information to your campaign IDs.  An interface is also provided to let end users update the database by uploading a text file.
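
A minimal sketch of the blending step, assuming the RMySQL package and an events table keyed on date (host, credentials, table and column names are placeholders):

library(RMySQL)

con <- dbConnect(MySQL(),
                 host     = "your-cloud-sql-ip",
                 user     = "your-user",
                 password = "your-password",
                 dbname   = "onlinegashiny")

events <- dbGetQuery(con, "SELECT date, eventname FROM events")
events$date <- as.Date(events$date)

## blend with the GA data on the shared date key
blended <- merge(gadata, events, by = "date", all.x = TRUE)

dbDisconnect(con)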

CausalImpact

In the demo dashboard, the MySQL connection is used to upload event data, which is then compared with the Google Analytics data to see if the event had a statistically significant impact on your traffic.  This replicates a lot of the functionality of the GA Effect dashboard.

[Screenshot: CausalImpact event analysis]

The headline impact of the event is shown in the summary dashboard tab.  If it's statistically significant, the impact is shown in blue.

[Screenshot: headline impact summary]

Anomaly Detection

Twitter released the AnomalyDetection R package to help spot unusual time points within their own data streams, and it is also handy for Google Analytics trend data.
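
A minimal sketch of calling it on daily GA data, assuming a two-column data.frame of timestamps and sessions (the argument values are illustrative):

## install via devtools::install_github("twitter/AnomalyDetection")
library(AnomalyDetection)

res <- AnomalyDetectionTs(gadata[, c("timestamp", "sessions")],
                          max_anoms = 0.02,
                          direction = "both",
                          plot = TRUE)

res$anoms   # the unusual time points to annotate on the trend plot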

[Screenshot: anomaly detection]

The anomaly annotations on the main trend plot are generated using this package, and you can go into more detail and tweak the results in the Analyse section.

Making the dashboard multi-user

In this demo I've taken the usual use case of an internal department looking to report on just one Google Analytics property, but if you would like end users to authenticate with their own Google Analytics property, the dashboard can be combined with my shinyga() package, which provides functions enabling self-authentication, similar to my GA Effect/Rollup/Meta apps.

In production, you can publish the dashboard behind a Shinyapps authentication login (needs a paid plan), or deploy your own Shiny Server to publish the dashboard on your company intranet.

Quick Start

Now you have seen the features, the steps below take you through getting this dashboard running for yourself. This guide assumes you know R and Shiny - if you don't, then start here: http://shiny.rstudio.com/

You don't need to have the MySQL details ready to see the app in action; it will just lack persistent storage.

Setup the files

  1. Clone/copy-paste the scripts in the github repository to your own RStudio project.

  2. Find your GA View ID you want to pull data from.  The quickest way to find it is to login to your Google Analytics account, go to the View then look at the URL: the number after “p” is the ID.

  3. [Optional] Get your MySQL setup with a user and IP address. See next section on how this is done using Google Cloud SQL.  You will also need to white-list the IP of where your app will sit, which will be your own Shiny Server or shinyapps.io. Add your local IP for testing too. If using shinyapps.io their IPs are: 54.204.29.251; 54.204.34.9; 54.204.36.75; 54.204.37.78.

  4. Create a file called secrets.R in the same directory as the app, with the content below filled in with your details.
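
A sketch of what that file holds - the exact variable names the app expects are defined in the example file in the GitHub repository, so treat the names below as placeholders:

## secrets.R - example values only
gaViewId      <- "123456"          # the GA View ID from step 2

mysqlHost     <- "173.194.xx.xx"   # Cloud SQL instance IP
mysqlUser     <- "your-user"
mysqlPassword <- "your-password"
mysqlDatabase <- "onlinegashiny"
mysqlPort     <- 3306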

Configuring R

    1. Make sure you can install and run all the libraries needed by the app (a sketch of the likely installs follows these steps):

    2. Run the command below locally first, to store the auth token in the same folder.  You will be prompted to log in with the Google account that has access to the GA View ID you put into secrets.R, and to paste a code into the R console.  The token will then be uploaded with the app and handle the authentication with Google Analytics when in production:

        > rga::rga.open(where="token.rga")

    3. Test the app by hitting the "Run App" button at the top right of the ui.R or server.R script in RStudio, or by running:

        > shiny::runApp()
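
For step 1, a hedged sketch of the installs - these are the packages named in this post, with the definitive list in the repository's scripts:

install.packages(c("shiny", "shinydashboard", "dygraphs", "RMySQL"))

## these may need to come from GitHub rather than CRAN
## devtools::install_github("google/CausalImpact")
## devtools::install_github("twitter/AnomalyDetection")
## devtools::install_github("skardhamar/rga")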

Using the dashboard

  1. The app should now be running locally in a browser window with your own GA data.  It can take up to 30 seconds for all the data to load first time.

  2. Deploy the instance on-line to Shinyapps.io with a free account there, or to your own Shiny Server instance.

  3. Customise your instance. If for any reason you don’t want certain features, then remove the feature in the ui.R script - the data is only called when the needed plot is viewed.

Getting a MySQL setup through Google Cloud SQL

If you want a MySQL database to use with the app, I use Google Cloud SQL.  Setup is simple:
  1. Go to the Google API console and create a project if you need to.

  2. Make sure you have billing turned on with your billing accounts menu top right.

  3. Go to Storage > Cloud SQL in the left hand menu.

  4. Create a New Instance.

  5. Create a new Database called “onlinegashiny”

  6. Under "Access Control" you need to put in your own IP address for where you test the app, as well as the IPs of your Shiny Server/shinyapps.io.  If you are using shinyapps.io, the IPs are: 54.204.29.251; 54.204.34.9; 54.204.36.75; 54.204.37.78

  7. Under “IP Address” create a static IP (Charged at $0.24 a day)

  8. You should now have all the access info you need to put in the app's secrets.R for MySQL access.  The port should be the default 3306.

  9. You can also limit the amount of data that can be uploaded via the shiny.maxRequestSize option - the default is 0.5 MB.
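
For example, to raise the upload limit to 5 MB you would set the option near the top of the app (the value is up to you):

## allow uploads of up to 5 MB
options(shiny.maxRequestSize = 5 * 1024^2)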

Summary

Hopefully the above helps inspire what can be done with your Google Analytics data.  The focus has been on giving you tools that let you take action on your data.

There is a lot more you can do via the thousands of R packages available, but hopefully this gives a framework you can build upon.

I’d love to see what you build with it, so do please feel free to get in touch. :)

How I made GA Effect - creating an online statistics dashboard using R

GA Effect is a webapp that uses Bayesian structural time-series to judge whether events happening in your Google Analytics account are statistically significant.  It's been well received on Twitter, and how to use it is detailed in this guest post on Online Behaviour, but this blog will be about how to build your own or similar.

Update 18th March: I've made a package that holds a lot of the functions below, shinyga.  That may be easiest to work with.

[Screenshot: GA Effect dashboard]

What R can do

Now is a golden time for the R community, as it gains popularity outside of its traditional academic background and hits business.  Microsoft has recently bought Revolution Analytics, an enterprise distribution of R, so we can expect a lot more integration from them soon, such as the machine learning in their Azure platform.

Meanwhile RStudio are releasing more and more packages that make it quicker and easier to create interactive graphics, with tools for connecting and reshaping data and then plotting using attractive JavaScript visualisation libraries or native interactive R plots.  GA Effect is also being hosted using ShinyApps.io, an R server solution that enables you to publish straight from your console, or you can run your own server using Shiny Server.  

Packages Used

For the GA Effect app, the key components were these R packages:
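
The list, collated from the packages named later in this post, looks roughly like this (indicative rather than exhaustive):

library(shiny)           # the web app framework
library(shinydashboard)  # dashboard theme for Shiny
library(rga)             # fetches the Google Analytics data
library(CausalImpact)    # Bayesian structural time-series
library(dygraphs)        # interactive JavaScript plots via htmlwidgets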

Putting them together

Web Interaction

First off, using RStudio makes this all a lot easier as they have a lot of integration with their products.

ShinyDashboard is a custom theme of the more general Shiny.  As detailed in the getting started guide, creating a blank webpage dashboard with shinydashboard takes 8 lines of R code.  You can test or run everything locally first before publishing to the web via the "Publish" button at the top.
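
Those 8 lines, roughly as given in the shinydashboard getting-started guide, are:

library(shiny)
library(shinydashboard)

ui <- dashboardPage(
  dashboardHeader(),
  dashboardSidebar(),
  dashboardBody()
)

server <- function(input, output) {}

shinyApp(ui, server)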

Probably the most difficult concept to get your head around is the reactive programming in a Shiny app.  This is effectively how the interaction occurs, and it sets up live relationships between inputs from your UI script (always called ui.R) and outputs from your server-side script (called server.R).  These are effectively your front-end and back-end in a traditional web environment.  The Shiny package takes your R code and turns it into HTML5 and JavaScript. You can also import JavaScript of your own if you need to cover what Shiny can't.

The Shiny code then creates the UI for the app, and creates reactive versions of the datatables needed for the plots.
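
A tiny sketch of that reactive pattern (the fetch function and input names are hypothetical, not GA Effect's code):

server <- function(input, output){

  ## re-runs automatically whenever the inputs it references change
  ga_data <- reactive({
    fetchGAData(input$viewId, input$metric)   # hypothetical fetch function
  })

  ## the plot depends on the reactive data, so it re-renders too
  output$trend <- renderPlot({
    plot(ga_data())
  })
}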

Google Authentication

The Google authentication flow uses OAuth2 and could be used for any Google API in the console, such as BigQuery, Gmail, Google Drive etc.  I include the code used for the authentication dance below so you can use it in your own apps:
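
In outline, and with placeholder credentials, the dance looks something like the sketch below - a generic OAuth2 web-server flow with httr rather than the exact GA Effect code:

library(httr)

CLIENT_ID     <- "your-client-id.apps.googleusercontent.com"
CLIENT_SECRET <- "your-client-secret"
REDIRECT_URI  <- "https://your-app.shinyapps.io/your-app/"
SCOPE         <- "https://www.googleapis.com/auth/analytics.readonly"

## step 1: send the user to this URL to grant access
authUrl <- function() {
  paste0("https://accounts.google.com/o/oauth2/auth",
         "?response_type=code",
         "&client_id=", CLIENT_ID,
         "&redirect_uri=", URLencode(REDIRECT_URI, reserved = TRUE),
         "&scope=", URLencode(SCOPE, reserved = TRUE),
         "&access_type=online")
}

## step 2: exchange the ?code= parameter Google returns for an access token
exchangeCode <- function(code) {
  req <- POST("https://accounts.google.com/o/oauth2/token",
              body = list(code          = code,
                          client_id     = CLIENT_ID,
                          client_secret = CLIENT_SECRET,
                          redirect_uri  = REDIRECT_URI,
                          grant_type    = "authorization_code"),
              encode = "form")
  content(req)$access_token
}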

Fetching Google Analytics Data

Once a user has authenticated with Google, the user token is then passed to rga() to fetch the GA data, according to which metric and segment the user has selected. 

This is done reactively, so each time you update the options a new data fetch to the API is made.  Shiny apps are on a per user basis and work in RAM, so the data is forgotten once the app closes down.
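
With the rga package, the fetch itself looks roughly like this (the view ID, dates and metric are placeholders):

library(rga)

rga.open(instance = "ga", where = "token.rga")   # authenticate and cache the token

gadata <- ga$getData(ids        = "ga:123456",
                     start.date = "2015-01-01",
                     end.date   = "2015-02-19",
                     metrics    = "ga:sessions",
                     dimensions = "ga:date")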

Doing the Statistics

You can now manipulate the data however you wish.  I put it through the CausalImpact package as that was the application goal, but you have a wealth of other R packages that could be used such as machine learning, text analysis, and all the other statistical packages available in the R universe.  It really is only limited by your imagination. 

Here is a link to the CausalImpact paper, if you really want to get in-depth with the methods used.  It includes some nice examples of predicting the impact of search campaign clicks.

Here is how CausalImpact was implemented as a function in GA Effect:
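
A hedged sketch of such a wrapper, assuming a data.frame of daily sessions and a known event date (not the exact GA Effect function):

library(CausalImpact)
library(zoo)

gaCausalImpact <- function(gadata, eventDate){

  ## gadata: data.frame with a Date column `date` and a numeric `sessions` column
  series <- zoo(gadata$sessions, order.by = gadata$date)

  pre.period  <- c(min(gadata$date), eventDate - 1)
  post.period <- c(eventDate, max(gadata$date))

  CausalImpact(series, pre.period = pre.period, post.period = post.period)
}

impact <- gaCausalImpact(gadata, as.Date("2015-02-01"))
summary(impact)            # summary(impact, "report") gives a written report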

Plotting

dygraphs is an R package that takes R input and outputs the JavaScript needed to display it in your browser, and as it's made by RStudio it is also compatible with Shiny.  It is an application of htmlwidgets, which lets you take any JavaScript library and make it compatible with R code.  Here is an example of how the main result graph was generated:
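
A sketch of the kind of call that produces such a result graph, assuming the impact object from the CausalImpact step above (the column names follow that package's output):

library(dygraphs)

## observed data plus the model's prediction and its interval
result <- impact$series[, c("response", "point.pred",
                            "point.pred.lower", "point.pred.upper")]

dygraph(result, main = "Expected vs observed") %>%
  dySeries("response", label = "Observed") %>%
  dySeries(c("point.pred.lower", "point.pred", "point.pred.upper"),
           label = "Expected") %>%
  dyEvent(as.Date("2015-02-01"), "Event", labelLoc = "bottom") %>%
  dyRangeSelector()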

Publishing

I've been testing the alpha of shinyapps.io for a year now, but it is just this month (Feb 2015) coming out of beta.  If you have an account, then publishing your app is as simple as pushing the "Publish" button above your script, after which it appears at a public URL.  With the paid plans, you can limit access to authenticated users only.

Next steps

This app only took me 3 days, with my baby daughter on my lap during a sick weekend, so I'm sure you can come up with something similar given time and experience.  The components are all there now to make some seriously great apps for analytics.  If you make something, do please let me know!

E-mail open rate tracking with Google Analytics' Measurement Protocol - Demo

Edit 4th Feb 2015 - Google have published an email tracking guide with the Measurement Protocol.  The below goes a bit beyond that showing how to link the user sessions etc.

The Measurement Protocol was launched at the same time as Universal Analytics, but I've seen less adoption of it with clients, so this post is an attempt to show what can be done with it via a practical example.

The demo app is available here: http://ua-post-to-push.appspot.com/

With this demo you should be able to track the following:

  1. You have an email address from an interested customer
  2. You send them an email and they look at it, but don't click through.
  3. Three days later they open the email again at home, and click through to the offer on your website.
  4. They complete the form on the page and convert.

Within GA, you will be able to see for that campaign: 2 opens, 1 click/visit and 1 conversion for that user.  As with all email open tracking, you are dependent on the user downloading the image, which is why I include the option to upload a full image and not just a pixel, as it may be more enticing to allow images in your newsletter.

Intro

The Measurement Protocol lets you track beyond the website, without the need of client-side JavaScript.  You construct the URL and when that URL is loaded, you see the hit in your Google Analytics account.  That's it. 
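
For example, a minimal pageview hit can be sent from R with nothing more than an HTTP POST (the property ID and CID below are placeholders):

library(httr)

POST("http://www.google-analytics.com/collect",
     body = list(
       v   = 1,                 # protocol version
       tid = "UA-123456-1",     # your GA property ID
       cid = "35009a79-1a05-49d7-b876-2b884d0f825b",  # anonymous client ID
       t   = "pageview",        # hit type
       dp  = "/email/open",     # virtual page path
       cn  = "email_campaign"   # campaign name
     ),
     encode = "form")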

The clever bit is that you can link user sessions together via the CID (Client ID), so you can track the upcoming Internet of Things from off-line to on-line, but also things like email opens and affiliate thank-you pages.  It also works with things like enhanced e-commerce, so it can be used for customer refunds or product impressions.

This demo looks at e-mail opens for its example, but it takes only minor modifications to track other things.  For instance, I use a similar script to record in GA when my Raspberry Pi is backing up our home computers via Time Machine.

Demo on App Engine

Using the Measurement Protocol in production most likely needs server-side code.  I'm running a demo on Google App Engine coded in Python, which is pretty readable, so it should be fairly easy for a developer to replicate in their favourite language.  App Engine is also a good choice if you want to run it in production, since it has a free tier that covers thousands of email opens a day, but can scale to handle millions.

This code is available on Github here - http://github.com/MarkEdmondson1234/ga-get-to-post

App running that code is here: http://ua-post-to-push.appspot.com/

There are instructions on Github on how it works, but I'll run through some of the key concepts here in this post.

What the code does

The example has four main URLs:
  • The homepage explaining the app
  • The image URL itself, that when loaded creates the hit to GA
  • A landing page with example custom GA tracking script
  • An upload image form to change the image you would display in the e-mail.
The URLs above are controlled server side with the code in main.py

Homepage

This does nothing server side aside from serving up the page.



Image URL


This is the main point of the app - it turns a GET request for the uploaded image into a POST to Google Analytics with the parameters found in the URL.  It handles the different options and sends the hit to GA as a virtual pageview or event, with a unique user CID and campaign name. An example URL here is:
http://your-appengine-id.appspot.com/main.png?cid=blah&p=1&c=email_campaign



Landing Page


This does little but take the CID you put in the email URL and output the CID that will be used in Google Analytics.  If this is the same CID as in the image URL and the user clicks through from the email, those sessions will be linked. You can also add the GA campaign parameters, but the server-side script ignores those - the JavaScript on the page will take care of them. An example URL here is:
http://your-appengine-id.appspot.com/landing-page?cid=blah&utm_source=source_me&utm_medium=medium_me&utm_campaign=campaign_me


The CID in the landing page URL is then captured and turned into an anonymous CID for GA.  This is then served up to the Universal Analytics JavaScript on the landing page.  Use the same UA code (e.g. UA-123456-1) for both the image hit and the page tracker, else it won't work.


Upload Image

This just handles the image uploading and serves the image up via App Engine's blobstore.  Nothing pertinent to GA here, so see the Github code if interested.

Summary

It's hoped this helps sell the Measurement Protocol to more developers, as it offers a solution to a lot of the problems with digital measurement today, such as attribution of users beyond the website.  The implementation is reasonably simple, but the power is in what you send and in what situations.  Hopefully this inspires what you could do with your setup.

There are some limitations to be aware of - the CID linking won't stitch sessions together; it just discards a user's old CID if they already had one, so you may want to look at the userID feature, or at customising the CID for users who visit your website before the email is sent.  The best scenario would be if a user is logged in for every session, but this may not be practical.  It may be that the value of linking sessions becomes so advantageous in the future that entire website strategies will be focused on getting users to identify themselves, such as via social logins.

Always consider privacy: look for users to opt in, and make sure to use GA filters to take out any PII you may put into GA as a result.  Current policy looks to be that if the data within GA cannot be traced to an individual (e.g. a name, address or email), then you are able to record an anonymous personal ID that could be exported and linked to PII outside of GA.  This is a bit of a shifting target, but in all cases keeping it as user-focused and not profit-focused as possible should see you through any ethical questions.






My Google Webmaster Tools Downloader app

Here is a tool that I have used for SEO analytics, that I am now making publicly available. It extends Google Webmaster Tools to help answer common SEO questions more easily.

Visit the Google Webmaster Tools Downloader

Here are a few example questions it helps answer:

  • SEO keyword rankings taking into account personalisation and localisation for Google, in this age of (not provided)
  • SEO keyword performance beyond the 90 days available by default e.g. year on year comparisons
  • How a segment of keywords have performed over time e.g. brand vs non-brand
  • How click through rates change over time e.g. after a website migration.
  • How new/old website sections perform in Google search via the Top Pages reports

These things were a lot easier before (not provided) took keywords out of web analytics.  This left Google Webmaster Tools as the only reliable source of rankings, but it was not an ideal replacement, with limitations that needed to be worked around by downloading data via an API - an API that rarely gets updated.

I'm aware this app could quickly become obsolete if Google updated GWT, but it has also served as a great project for me to get to know working with App Engine, jinja2 templating, Google Charts, caching, Stripe, Bootstrap, etc., so it's all been worthwhile - I think I can safely say it's been the most educational project I've done, and it can serve as another template for more sophisticated APIs (the Google Tag Manager API is in my sights).

It's also my first app that will be charged for, simply because keeping a daily breakdown of keywords in a database carries a cost, which is probably why Google don't offer it for free at the moment. There are other web apps on the market that do downloads for free, but I am wary of those, following the adage "if you don't pay for a service, you pay with your data".

I plan to follow it up with deeper features, including Tableau examples of what you can do with this data once you have it at such a deep level.

For now, if you want to sign up to test the alpha, please check out the signup page here

Run R, RStudio and OpenCPU on Google Compute Engine [free VM image]

File this under "what I wished was on the web whilst trying to do this myself."

edit 20th November, 2016 - now everything in this post is abstracted away and available in the googleComputeEngineR package - I would say it's a lot easier to use that.  Here is a post on getting started with it: http://code.markedmondson.me/launch-rstudio-server-google-cloud-in-two-lines-r/

edit 30th April, 2016: I now have a new post up on how to install RStudio Server on Google Compute Engine using Docker, which is a better way to do it. 

edit 30th Nov, 2015: Oscar explains why some users couldn't use their username

edit 5th October: Added how to login, add users and migrated from gcutil to gcloud

Google Compute Engine is a very scalable and quick alternative to Amazon Web Services, but a bit less evolved in the images available for users. 

If you would like to have a VM with R 3.0.1, RStudio Server 0.98 and OpenCPU installed, then you can click on the link below and install a pre-configured version for you to build upon.

With this image, you have a cloud server with the most popular R / Cloud interfaces available, which you can use to apply statistics, machine learning or other R applications on web APIs.  It is a fundamental building block for a lot of my projects.

The VM image is here. [940.39MB]

To use, follow these steps:

Downloading the instance and uploading to your project

  1. Create your own Google Cloud Compute project if you haven't one already.
  2. Put in billing details.  Here are the prices you'll pay for running the machine. It's usually under $10 a month.
  3. Download the image from the link above (and here) and then upload it to your own project's Cloud Storage. Details here
  4. Add the uploaded image to your project with a nice name that is only lowercase, numbers or includes hyphens (-).  Details here. You can do this using gcloud and typing: 
$ gcloud compute images create IMAGE_NAME --source-uri URI

Creating the new Instance

  1. Now go to Google Compute Engine, and select Create New Instance
  2. Select the zone and machine type you want (e.g. you can select a 50GB RAM machine temporarily if needed for big jobs)
  3. In the dropdown for images you should be able to see the image from step 4 above.  Here is a screenshot of how it should look; I called my image "r-studio-opencpu20140628"

Or, if you prefer using command line, you can do the steps above in one command with gcloud like this:

$ gcloud compute instances create INSTANCE [INSTANCE ...] --image IMAGE

Using your instance

You should now have RStudio running at http://your-ip-address/rstudio/, OpenCPU running at http://your-ip-address/ocpu/test, and a welcome homepage at the root http://your-ip-address

To log in, your Google username is an admin, as you created the Google Cloud project. See here for adding users to Google Cloud projects

If you don't know your username, try this command using gcloud to see your user details:

$ gcloud auth login

Any users you add to Debian running on the instance will have a user in RStudio - to log into Debian and add new users, see below:

$ ## ssh into the running instance
$ gcloud compute ssh <your-username>@new-instance-name
$ #### It should now tell you that you are logged into your instance #####
$ #### Once logged in, add a user: example with jsmith
$ sudo useradd jsmith
$ sudo passwd jsmith
$ ## give the new user a directory and change ownership to them
$ sudo mkdir /home/jsmith
$ sudo chown jsmith:users /home/jsmith

Oscar in the comments below also explains why sometimes your username may not work:

Like other comments, my username did not work.

Rather than creating a new user, you may need to simply add a password to your user account:

$ sudo passwd .

Also, the username will be your email address with the '.' replaced with '_'. So xx.yy@gmail.com became xx_yy

You may also want to remove my default user the image comes with:

$ sudo userdel markedmondson

...and remove my folder:

$ sudo rm -rf /home/markedmondson

The configuration used

If you would like to look before you leap, or prefer to install this yourself, a recipe is below. It largely cobbles together the instructions around the web supplied by these sources:

Many thanks to them.

It covers installation on the Debian Wheezy images available on GCE, with the necessary backports:








How To Use R to Analyse and Plot Your Twitter Use

Here is a little how-to if you want to use R to analyse Twitter.  This is the first of two posts: this one talks about the How, the second will talk about the Why.  

If you follow all the code you should be able to produce plots like this:

As with all analytics projects, it's split into four different aspects: 1. getting the data; 2. transformations; 3. analysing; 4. plotting.

All the code is available on my first public github project:

https://github.com/MarkEdmondson1234/r-twitter-api-ggplot2

I did this project to help answer an idea: can I tell by my Twitter when I changed jobs or moved country?

I have the feeling that the more SEO I do, the more I rely on Twitter as an information source; whereas for analytics it's more independent research, which takes place more on StackOverflow and Github. Hopefully this project can see if that is valid.

1. Getting the data

R makes getting tweets easy via the twitteR package.  You need to install that, register your app with Twitter, then authenticate to get access to the Twitter API.

Another alternative to using the API is Twitter's data export, which lets you go beyond the 3200-tweet limit of the API. This gives you a csv which you can load into R using read.csv()
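
A minimal sketch of the API route with the current twitteR auth helper, assuming you have registered a Twitter app and have its keys (all credential values are placeholders); the export route is a one-liner:

library(twitteR)

## authenticate with the keys from your registered Twitter app
setup_twitter_oauth(consumer_key    = "xxx",
                    consumer_secret = "xxx",
                    access_token    = "xxx",
                    access_secret   = "xxx")

## fetch up to the API limit of 3200 of your own tweets
tweets    <- userTimeline("your_username", n = 3200, includeRts = TRUE)
tweets_df <- twListToDF(tweets)

## or load the full archive from Twitter's data export
## tweets_df <- read.csv("tweets.csv", stringsAsFactors = FALSE)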

2. Transforming the data

For my purposes, I needed to read the timestamps of the tweets and put them into early, morning, afternoon and evening buckets, so I could then plot the data.  I also created a few aggregates of the data to suit what I needed to plot, and I outputted these dataframes from my function in a list.

Again, as with most analytics projects, this section represents most of the work, with to-ing and fro-ing as I tweaked the data I wanted in the chart.  One tip I've picked up is to try to do these data transformations in a function that takes the raw data as an input and outputs your processed data, as it makes it easier to repeat for different data inputs.
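
A minimal sketch of that idea, assuming a data.frame of tweets with a created POSIXct column as returned by twListToDF(); the function, bucket and output names are illustrative:

transformTweets <- function(tweets){

  hour <- as.numeric(format(tweets$created, "%H"))

  ## bucket each tweet into a day part
  tweets$dayPart <- cut(hour,
                        breaks = c(-1, 5, 11, 17, 23),
                        labels = c("Early", "Morning", "Afternoon", "Evening"))

  tweets$yearMonth <- format(tweets$created, "%Y-%m")

  ## aggregate tweet counts per month and day part, ready for plotting
  tweetTDm <- aggregate(list(tweets = tweets$text),
                        by = list(yearMonth = tweets$yearMonth,
                                  dayPart   = tweets$dayPart),
                        FUN = length)

  list(tweets = tweets, tweetTDm = tweetTDm)
}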

3. Analysing the data

This will be covered in the second post, and it is usually the point of the whole exercise - it only takes about 10% of the time on the project, but it is the most important part.

4. Plotting the data

This part evolves as you go to and fro between steps 2-3, but what I ended up with were the functions below.

theme_mark() is a custom ggplot2 theme you can use if you want the plots to look exactly the same as above, or at the very least to show how to customise ggplot2 to your own fonts/colours.  It also uses choosePalette() and installFonts(). "mrMustard" is my name for the colour scheme chosen.

I use two layers in the plot - one is the area plot to show the total time spent per Day Part, the second is a smoother line to help pick out the trend better for each Day Part.

plotTweetsDP() takes as input the tweetTD (weekly) or tweetTDm (monthly) dataframes, and plots the daypart dataframe produced by the transformations above.  The timeAxis parameter expects "yw" (yearWeek) or "ym" (yearMonth), which it uses to make the x-axis better suited to each.

plotLinksTweets() is the same, but works on the tweetLinks dataframe.
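
A hedged sketch of the two-layer approach, using the monthly aggregate from the transformation sketch above; the theme and palette details are placeholders rather than the actual theme_mark() code:

library(ggplot2)

plotTweetsDP <- function(tweetTDm){

  tweetTDm$month <- as.Date(paste0(tweetTDm$yearMonth, "-01"))

  ggplot(tweetTDm, aes(x = month, y = tweets,
                       fill = dayPart, colour = dayPart)) +
    geom_area(alpha = 0.3, position = "identity") +  # total tweets per day part
    geom_smooth(se = FALSE) +                        # trend line per day part
    labs(x = NULL, y = "Tweets", title = "Tweets per day part") +
    theme_minimal()
}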


I hope this is of some use to someone - let me know in the comments!  Also, any ideas on where to go from here are welcome; at the moment I'm working through some text-mining packages to try and get something useful out of those.

Again the full project code is available on Github here: https://github.com/MarkEdmondson1234/r-twitter-api-ggplot2

My Google Analytics Time Series Shiny App (Alpha)

There are many Google Analytics dashboards like it, but this one is mine:

My Google Analytics Time Series App

It's a bare-bones framework where I can start to publish publicly some of the R work I have been learning over the past couple of years.

It takes advantage of an alpha of Shinyapps, a public hosting service for R Shiny, which I love and adore.

At the moment the app has just been made to authenticate and show some generic output, but I plan to create a lot more interesting plots/graphs from it in the future.

How To Use It

  1. You need a Google Analytics account.  
  2. Go to https://mark.shinyapps.io/GA_timeseries/
  3. You'll see this screen.  Pardon the over-heavy legal disclaimers - I'm just covering my arse.  I have no intention of using this app to mine data, but others' GA apps might, so I would be wary of giving access to Google Analytics to other webapps, especially now it's possible to add users via the management API.
  4. Click the "GA Authentication" link.  It'll take you to the Google account screen, where you say its ok to use the data (if it is), and copy-paste the token it then displays.
  5. This token allows the app (but not me) to process your data.  Go back to the app and paste the token in the box.
  6. Wait about 10 seconds, depending on how many accounts you have in your Google Analytics.
  7. Sometimes you may see "Bad Request", which means the app is bad and the GA call has errored.  Hard reload the page (on Firefox this is SHIFT + RELOAD) and reauthenticate, starting from step 2 above. Sorry.
  8. You should now see a table of your GA Views on the "GA View Table" tab.  You can search and browse the table, and choose the account and profile ID you want to work with via the left hand drop downs. Example using Sanne's Copenhagenish blog:
  9. If you click on "Charts" tab in the middle, you should see some Google Charts of your Visits and PageViews. Just place holders for now.
  10. If you click on the "Forecasts" tab you should see some forecasting of your visits data.  If it doesn't show, make sure the date range to the far left covers 70 days (say 1st Dec 2013 to 20th Feb 2014). 
  11. The Forecast is based on Holt-Winters exponential smoothing to try to model seasonality.  The red line is your actual data, the blue is the model's guess, including 70 days into the future. The green area is the margin of error at 50% confidence, and the Time axis shows the number of months.  To be improved - a sketch of the approach follows this list.
  12. Under the forecast model is a decomposition of the visits time series. The top graph is the actual data, the second is the trend without the seasonal component, the third is the 31-day seasonal trend, and the fourth is the random everything else.
  13. In the last "Data Table" tab you can see the top 1000 rows of data.

That's it for now, but I'll be doing more in the future with some more exciting uses of GA data, including clustering, unsupervised learning, multinomial regression and sexy stuff like that.

Update 24th Feb

I've now added a bit of segmentation, with SEO and Referral data available trended, forecasted and decomposed.