BigQuery Visualiser Shiny app now free and open sourced (2015-12-03)

A few weeks ago I tweeted a beta version of a BigQuery Visualiser Shiny app that was well received, and got some valuable feedback on how it could be improved, in particular from @felipehoffa - thanks Felipe!

Here is a screenshot of the app:

Motivation

The idea of the app is to enhance the standard BigQuery interface to include plots of the data you query.  It uses ggplot2, a popular R plotting library; d3heatmap, a d3.js library for displaying heatmaps; and timelyportfolio's listviewer, a nice library for viewing all the BigQuery metadata in a collapsible tree.   Other visualisations can be added fairly easily and will be over time, but if you have a request for something in particular you can raise an issue on the project's GitHub page.

I got into BigQuery once it started to receive exports from Google Analytics Premium. Since these exports carry unsampled raw data and include unique userIds, it's a richer data source for analysis than the Google Analytics reporting API.

It also was a chance to create another Google API library called bigQueryR, the newest member of the googleAuthR family.  Using googleAuthR meant Shiny support, and also meant bigQueryR can be used alongside googleAnalyticsR and searchConsoleR under one shared login flow.  This is something exploited in this demo of RMarkdown, which pulls data from all three sources into a scheduled report.

Running your own BigQuery Visualiser

All set-up instructions are listed on the BigQuery Visualiser's Github project page

You can run the Shiny app locally on your computer within RStudio; within your own company intranet if it's running Shiny Server; or publicly, like the original app, on shinyapps.io.

Feedback

Please let me know what else could improve. 

I have a pending issue on using JSON uploads for authentication that is waiting on a bug fix in httr, the underlying library.

In particular, more htmlwidgets packages could be added - this wonderful R framework creates an R to d3.js interface, and holds some of the nicest visualisations on the web.

In this first release, I favoured plots that could apply to as many different data sets as possible.  For your own use cases you can be more restrictive about what data is requested, and so can be more ambitious with the plots.  If you want inspiration, timelyportfolio (who wrote the listviewer library) has a blog where he makes lots of htmlwidgets libraries.

Enjoy!  I hope it's of use - let me know if you build something cool with it.

Introduction to Machine Learning with Web Analytics: Random Forests and K-Means (2015-09-22)

MeasureCamp #7

I've just come back from #MeasureCamp, where I attended some great talks: on hierarchical models; the process of analysis; a demo of Hadoop processing Adobe Analytics hits; web scraping with Python and how machine learning will affect marketing in the future.  Unfortunately the sad part of MeasureCamp is you also miss some excellent content when they clash, but that's the nature of an ad-hoc schedule.  I also got to meet some excellent analytics bods and friends old and new.  Many thanks to all the organisers!

My sessions on machine learning

After finishing my presentation I discovered I would need to talk waaay too quickly to fit it all in, so I decided to do a session on each example I had.  The presentation is now available online here, so you can see what was intended.

I got some great feedback, as well as requests from people who had missed the session for some details, so this blog post will try to fill in some detail around the presentation we spoke about in the sessions.

Session 1: Introduction, Google Analytics Data and Random Forest Example

Introduction

Machine learning gives programs the ability to learn without being explicitly programmed for a particular dataset.  Models are made from input data to create useful output, commonly predictive analytics. (Arthur Samuel via Wikipedia)

There are plenty of machine learning resources, but not many that deal with web analytics in particular.  The sessions are aimed at inspiring web analysts to use or add machine learning to their toolbox, showing two machine learning examples that detail:
  • What data to extract
  • How to process the data ready for the models
  • Running the model
  • Viewing and assessing the results 
  • Tips on how to put into production
Machine learning isn't magic.  You may be able to make a model that uses obscure features, but a lot of intuition will be lost as a result.  It's much better to have a model that uses features you can understand, and that scales up what a domain expert (e.g. you) could do if you had the time to go through all the data.

Types of Machine Learning

Machine learning models are commonly split between supervised and unsupervised learning.  We deal with an example from each:
  • Supervised: Train the model against a test set with known outcomes.  Examples include spam detection and our example today, classifying users based on what they eventually buy.  The model we use is known as Random Forests.
  • Unsupervised: Let the model find its own results.  Examples include clustering of users, which we do in the second example using the k-means model.

Every machine learning project needs the below elements.  They are not necessarily done in order but a successful project will need to incorporate them all:

  • Pose the question - This is the most important.  We pose a question that our model needs to answer.  We also review this question and may modify it to try and fit what the data can do as we work on the project.
  • Data preparation - This is the majority of work.  It covers getting hold of the data, munging it so it fits the model and parsing the results.  I've tried to include some R functions below that will help with this, including getting the data from Google Analytics into R.
  • Running the model - The sexy statistics part.  Whilst superstar statistics skills are helpful to get the best results, you can still get useful output when applying the model defaults, which we use today.  The important thing is to understand the methods.
  • Assessing the results - What you’ll be judged on.  You will of course have a measure of how accurate the model is, but an important step is visualising this and being able to explain the model to non-technical people.
  • How to put it into production - the ROI and business impact.  A model that just runs in your R code on your laptop may be of interest, but ultimately not as useful for the business as a whole if it does not recommend how to implement the model and results into production.  Here you will probably need to talk to IT about how to call your model, or even rewrite your prototype into a more production level language.

Pitfalls Using Machine Learning in Web Analytics

There are some considerations when dealing with web analytics data in particular:

  • Web analytics is messy data - definitions can vary from website to website on various metrics, such as unique users, sessions or pageviews, so a thorough understanding of what you are working with is essential.
  • Most practical analysis needs robust unique userIds - For useful actionable output, machine learning models need to work on data that record useful dimensions, and for most websites that is your users.  Unfortunately that is also the definition that is the most woolly in web analytics given the nature of different access points.  Having a robust unique userID is very useful and made the examples in this blog post possible.
  • Time-series techniques are quickest way in - If you don't have unique users, then you may want to look at time-series models instead, since web analytics is also a lot of count data over time.  This is the reason I did GA Effect as one of my first data apps, since it could apply to most situations of web analytics.
  • Correlating confounders - It is common for web analytics to record highly correlated metrics, e.g. PPC clicks and cost.  Watch out for these in your models as they can overweight results.
  • Self reinforcing results - Also be wary of applying models that will favour their own results.  For example, a personalisation algo that places products at the top of the page will naturally get more clicks.  To get around this, consider using weighted metrics, such as a click curve for page links.  Always test.
  • Normalise your data -  Make sure all metrics are on the same scale, otherwise some will dominate, e.g. pageviews and bounce rate in the same model.

The Scenario

Here is the situation the following examples are based upon.  Hopefully it will be something familiar to your own case:

You are in charge of a reward scheme website, where existing customers log in to spend their points.  You want users to spend as many points as they can, so the points have high perceived value.  You capture a unique userId on login into custom dimension 1 and use Google Analytics enhanced e-commerce to track which prizes users view and claim.

Notice this scenario involves a reliable user ID, since every user is logging in to use the website. This may be tricky to do on your own website, so you may need to work with only a subset of your users.  In my view, the data gains you can make from reliable user identification mean I try to encourage the design of the website to involve logged-in content as much as possible.

    Random Forests

    Now we get into the first example.  Random Forests are a popular machine learning tool as they typically give good results - in Kaggle competitions they are often the benchmark to beat.
     
    Random Forests are based on decision trees, and decision trees are the topic of a recent interactive visualisation on machine learning that has been doing the rounds.  It's really great, so check it out first then come back here.

    Back? Ok great, so now you know about decision trees.

    Random Forests are a simple extension: a collection of decision trees is a Random Forest.  A problem with decision trees is that they will overfit your data - when you throw new data at them you will get misclassification.  It turns out, though, that if you aggregate all the decision trees built on subsets of your original data, all those slightly worse models added up make one robust model, meaning that when you throw new data at a Random Forest it is more likely to be a closer fit.

    If you want more detail check out the very readable original paper by Breiman and Cutler and a tutorial on using it with R is here.

    Example 1: Can we predict what prizes a user will claim from their view history?

    Now we are back looking at our test scenario.  We have noticed that a lot of users aren't claiming prizes despite browsing the website, and we want to see if we can encourage them to claim prizes, so they value the points more and spend more to get them.

    We want to look at users who do claim, and see what prizes they look at before they claim.  Next we will see if we can build a model to predict what a user will claim based on their view history.  In production, we will use this to e-mail users who have viewed but not claimed prize suggestions, to see if it improves uptake.

    Fetching the data

    Use your favourite Google Analytics to R library - I'm using my experimental new library, googleAnalyticsR, but it doesn't matter which; the important thing is what is being fetched.  In this example the user ID is being captured in custom dimension 1, and we're pulling out the product SKU code.  This is transferable to other web analytics tools such as Adobe Analytics (perhaps via the RSiteCatalyst package).
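    Something like the below is a minimal sketch of the fetch using the current google_analytics() call - the exact signature has changed between googleAnalyticsR versions, and the View ID, date range and metric names here are placeholders rather than the original gist:

        library(googleAnalyticsR)
        ga_auth()

        view_id <- 123456   ## placeholder GA View ID

        ## product views per user (userId captured in custom dimension 1)
        views <- google_analytics(view_id,
                                  date_range = c("2015-01-01", "2015-08-31"),
                                  metrics = "productDetailViews",
                                  dimensions = c("dimension1", "productSku"),
                                  max = -1)

        ## prize claims per user - a separate call, merged with the views later
        claims <- google_analytics(view_id,
                                   date_range = c("2015-01-01", "2015-08-31"),
                                   metrics = "uniquePurchases",
                                   dimensions = c("dimension1", "productSku"),
                                   max = -1)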



    Note we needed two API calls to get the views and transactions as these can't be queried in the same call.  They will be merged later.

    Transforming the data

    We now need to put the data into a format that will work with Random Forests.  We need a matrix of predictors to feed into the model, one column of response showing the desired output labels, and we split it so it is one row per user action:
    Here is some R code to "widen" the data to get this format. We then split the data set randomly 75% for training, 25% for testing.
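    The original gist isn't reproduced here, but a sketch of the reshaping with reshape2, assuming the column names from the fetch above, looks something like this:

        library(reshape2)

        ## one row per user, one column per product SKU, cells = number of views
        wide <- dcast(views, dimension1 ~ productSku,
                      value.var = "productDetailViews", fill = 0)

        ## add the response label: the SKU the user eventually claimed
        claimed <- claims[, c("dimension1", "productSku")]
        names(claimed)[2] <- "claimedSku"

        model_data <- merge(wide, claimed, by = "dimension1")
        model_data$claimedSku <- as.factor(model_data$claimedSku)

        ## random 75% / 25% split for training and testing
        set.seed(1234)
        train_idx <- sample(nrow(model_data), 0.75 * nrow(model_data))
        train <- model_data[train_idx, ]
        test  <- model_data[-train_idx, ]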

    Running RandomForest and assessing the results

    We now run the model - this can take a long time for lots of dimensions (this can be much improved using PCA for dimension reduction, see later).  We then test the model on the test data, and get an accuracy figure:
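    A minimal sketch with the randomForest package, assuming the train/test objects created above:

        library(randomForest)

        predictors <- train[, !(names(train) %in% c("dimension1", "claimedSku"))]
        response   <- train$claimedSku

        rf <- randomForest(x = predictors, y = response, ntree = 500)

        ## predict on the held-out 25% and compute accuracy
        test_predictors <- test[, !(names(test) %in% c("dimension1", "claimedSku"))]
        predicted <- predict(rf, newdata = test_predictors)

        accuracy <- sum(predicted == test$claimedSku) / nrow(test)
        accuracy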


    On my example test set I got ~70% accuracy on this initial run, which is not bad, but it is possible to get up to 90-95% with some tweaking.  Anyhow, let's plot the test vs predicted product frequencies, to see how it looks:
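    One way to draw that comparison (a sketch, assuming the predicted object from the code above):

        library(ggplot2)

        actual_freq    <- as.data.frame(table(sku = test$claimedSku), responseName = "actual")
        predicted_freq <- as.data.frame(table(sku = predicted), responseName = "predicted")

        freq <- merge(actual_freq, predicted_freq, by = "sku")

        ggplot(freq, aes(x = actual, y = predicted)) +
          geom_point() +
          geom_abline(slope = 1, intercept = 0, linetype = "dashed") +
          labs(x = "Actual claims per product", y = "Predicted claims per product")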


    This outputted the below plot.  It can be seen in general the ~70% accuracy predicted many products but with a lot of error happening for a large outlier.  Examining the data this product SKU was for a cash only prize.  A next step would be to look at how to deal with this product in particular since eliminating it improves accuracy to ~85% in one swoop.
     

    Next steps for the RandomForest

    There I stop but there are lots of next steps that could be done to make the model applicable to the business.  A non-exhaustive list is:

    • Run model on more test sets
    • Train model on more data
    • Try reducing number of parameters (see PCA later)
    • Examine large error outliers 
    • Compare with simple models (last/first product viewed?) - complicated is not always best!
    • Run model against users who have viewed but not yet claimed
    • Run email campaign with control and model results for final judgement

    It is hoped the above inspired you to try it yourself.

    Session 2: K-means, Principal Component Analysis and Summary

    Example 2: Can we cluster users based on their view product behaviour?

    Now we look at k-means clustering.  The questions we are trying to answer are something like this:

    Do we have suitable prize categories on the website? How do our website categories compare to user behaviour?

    The k-means clustering we hope will give us data to help with decisions on how the website is organised.

    For this we will use the same data as we used before for Random Forests, with some minor changes: as k-means is an unsupervised model we will take off our product labels:

    A lot of this example is inspired by this nice beginners walk-through on K-means with R.

    Introduction to K-means clustering


    This video tutorial on k-means explains it well:



    The above is an example with two dimensions, but k-means can apply to many more dimensions than that - we just can't visualise them easily. In our case we have 185 products whose view counts will each serve as a dimension.  However, problems with that many dimensions include long processing times alongside the danger of over-fitting the data, so we now look at PCA.

    Principal Component Analysis (PCA)

    We perform Principal Component Analysis (PCA) to see if there are important products that dominate the model - this could have been applied to the previous Random Forest example as well, and indeed a final production model could feed the output of one model, such as k-means, into Random Forests.

    PCA rotates the data into new dimensions (the principal components), chosen so that as few of them as possible capture as much of the variation as possible, and then ranks them by the amount of variance they explain.  There is a good visualisation of this here.

    The clustering we will do will actually be performed on the top rotated dimensions we find via PCA, and we will then map these back to the original pages for final output. This also takes care of situations such as if one product is always viewed in every cluster: PCA will minimize this dimension.

    The code below looks for the principal components, then gives us some output to help decide how many dimensions to keep.  A rule of thumb is to look for components that together explain roughly ~85% of the variance.   For the below data this was actually 35 dimensions (reduced from the 185 before).
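    A sketch of that step with base R's prcomp(), assuming the widened view data from the first example (labels and userId removed, and zero-variance columns dropped before scaling):

        pca_data <- model_data[, !(names(model_data) %in% c("dimension1", "claimedSku"))]

        ## scaling fails on zero-variance columns, so drop any product never viewed
        pca_data <- pca_data[, apply(pca_data, 2, var) > 0]

        pca <- prcomp(pca_data, scale. = TRUE)

        ## variance explained per principal component
        summary(pca)
        plot(pca, type = "l")

        ## cumulative variance - look for where it reaches roughly 85%
        cumsum(pca$sdev^2) / sum(pca$sdev^2)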



    The plot output from the above is below.  We can see the first principal component accounts for 50% of the variance, but then the variation is flattish.


    How many clusters?

    How many clusters to pick for k-means can be a subjective experience.  There are other clustering models that pick for you, but some kind of decision process will be dependent on what you need.  There are however ways to help inform that decision.

    Running the k-means modelling for an increasing number of clusters, we can look at an error measure (the within-cluster sum of squares) for each run.  When we plot these attempts for each cluster count, we can see how the graph changes or levels off at various cluster sizes, and use that to help with our decision:
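    A sketch of that "elbow" plot, clustering on the top principal components chosen above (the 35 components here follow the PCA step and are an assumption about your own data):

        cluster_data <- pca$x[, 1:35]

        ## total within-cluster sum of squares for 1 to 15 clusters
        wss <- sapply(1:15, function(k) {
          kmeans(cluster_data, centers = k, nstart = 20)$tot.withinss
        })

        plot(1:15, wss, type = "b",
             xlab = "Number of clusters", ylab = "Within-cluster sum of squares")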


    The plot for determining the clusters is here - see the fall between 2-4 clusters.  We went with 4 for this example, although a case could be made for 6:


    Assessing the clusters and visualisation

     I find heatmaps are a good way to assess clustering results, since they give a good overview of the groupings.  We are basically looking to see if the clusters found are different enough to make sense.
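    A sketch of the final clustering and the heatmap, using the d3heatmap package (the cluster count and object names follow the examples above):

        fit <- kmeans(cluster_data, centers = 4, nstart = 20)

        ## average product views per cluster, mapped back to the original products
        cluster_means <- aggregate(pca_data, by = list(cluster = fit$cluster), FUN = mean)

        library(d3heatmap)
        d3heatmap(as.matrix(cluster_means[, -1]), scale = "column")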


    This gives the following visualisation.  In an interactive RStudio or Shiny session, this is zoomable for finer detail, but here we just exported the image:

    From the heatmap we can see that each cluster does have distinctly different product views.

    K-Means - Next Steps

    The next step is to take these clusters and examine the products that are within them, looking for patterns.  This is where your domain knowledge is needed, as all we have done here is group users together based on statistics - the "why" is not in here.  When I've performed this in the past, I try to give a named persona to each cluster type.  Examples include "Big Spenders" for those who visit the payment page a lot, "Sport Freaks" who tend to only look at sports goods, etc.  Again, this will largely depend on the number of clusters you have chosen, so you may want to vary this to tweak the results you are looking for.

    Recommendations on how to group pages then follow: A/B tests can be performed to check whether the clustering makes an impact.

    Summary

    I hope the above example workflows have inspired you to try it with your own data.  Both examples can be improved, for instance we took no account of the order of product views or other metrics such as time on website, but the idea was to give you a way in to try these yourselves.

    I chose k-means and Random Forests as they are two of the most popular models, but there are lots to choose from.  This diagram from a python machine learning library, scikit-learn, offers an excellent overview on how to choose which other machine learning model you may want to use for your data:

    All in all I hope some of the mystery around machine learning has been taken out, and how it can be applied to your work.  If you are interested in really getting to grips with machine learning, the Coursera course was excellent and what set me on my way.

    Do please let me know of any feedback, errors or what you have done with the above, I'd love to hear from you.

    Good luck!
Google API Client Library for R: googleAuthR v0.1.0 now available on CRAN (2015-08-19)

    One of the problems with working with Google APIs is that quite often the hardest bit, authentication, comes right at the start.  This presents a big hurdle for those who want to work with them, it certainly delayed me.  In particular having Google authentication work with Shiny is problematic, as the token itself needs to be reactive and only applicable to the user who is authenticating.

    But no longer! googleAuthR provides helper functions to make it easy to work with Google APIs.  And it's now available on CRAN (my first CRAN package!) so you can install it easily by typing:

    > install.packages("googleAuthR")

    It should then load and you can get started by looking at the readme files on Github or typing:

    > vignette("googleAuthR")

    After my experiences making shinyga and searchConsoleR, I decided reinventing the authentication wheel each time wasn't necessary, so I worked on this new R package that smooths out this pain point.

    googleAuthR provides easy authentication within R or in a Shiny app for Google APIs.  It provides a function factory you can use to generate your own functions that call the API actions you need.

    At last counting there are 83 APIs, many of which have no R library, so hopefully this library can help with that.  Examples include the Google Prediction API, YouTube analytics API, Gmail API etc. etc.

    Example using googleAuthR

    Here is an example of making a goo.gl R package using googleAuthR:
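    The gist isn't reproduced here, but it closely followed the package readme - something like the below, where the shortener scope and the parse function come from the goo.gl API docs:

        library(googleAuthR)

        options(googleAuthR.scopes.selected =
                  "https://www.googleapis.com/auth/urlshortener")

        shorten_url <- function(url){

          body <- list(longUrl = url)

          ## the function factory: generates a function that calls the goo.gl API
          f <- gar_api_generator("https://www.googleapis.com/urlshortener/v1/url",
                                 "POST",
                                 data_parse_function = function(x) x$id)

          f(the_body = body)
        }

        gar_auth()
        shorten_url("http://code.markedmondson.me")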

    If you then want to make this multi-user in Shiny, then you just need to use the helper functions provided:
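    The helper function names have shifted between googleAuthR versions; a sketch using the later Shiny module and with_shiny() looks roughly like this:

        library(shiny)
        library(googleAuthR)

        ui <- fluidPage(
          googleAuthUI("login"),
          textInput("url", "Enter URL to shorten"),
          textOutput("short_url")
        )

        server <- function(input, output, session){

          ## a reactive access token, unique to this user's session
          access_token <- callModule(googleAuth, "login")

          output$short_url <- renderText({
            req(input$url, access_token())
            ## with_shiny() passes the user's token into the googleAuthR function
            with_shiny(shorten_url,
                       shiny_access_token = access_token(),
                       url = input$url)
          })
        }

        shinyApp(ui, server)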





Automating Google Console search analytics data downloads with R and searchConsoleR (2015-08-10)

    Yesterday I published version 0.1 of searchConsoleR, a package that interacts with Google Search Console (formerly Google Webmaster Tools) and in particular its search analytics.

    I'm excited about the possibilities with this package, as this new improved data is now available in a way to interact with all the thousands of other R packages.

    If you'd like to see searchConsoleR capabilities, I have the package running an interactive demo here (very bare bones, but should demo the data well enough).

    The first application I'll talk about in this post is archiving data into a .csv file, but expect more guides to come, in particular combining this data with Google Analytics.

    Automatic search analytics data downloads

    The 90 day limit still applies to the search analytics data, so one of the first applications should be archiving that data to enable year on year and month on month comparisons, and to follow the general development of your SEO rankings.

    The below R script:

    1. Downloads and installs the searchConsoleR package if it isn't installed already.
    2. Lets you set some parameters you want to download.
    3. Downloads the data via the search_analytics function.
    4. Writes it to a csv in the same folder the script is run in.
    5. The .csv file can be opened in Excel or similar.
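    A sketch of that script (argument names follow the searchConsoleR documentation; the site URL, dimensions and dates are placeholders you should change):

        ## 1. install and load searchConsoleR
        if(!require(searchConsoleR)) install.packages("searchConsoleR")
        library(searchConsoleR)

        ## 2. parameters
        website    <- "http://www.example.com"       ## your verified site
        download_dimensions <- c("date", "query")
        type       <- "web"
        start_date <- Sys.Date() - 93
        end_date   <- Sys.Date() - 3                 ## data lags a few days

        ## authenticate (interactive the first time, auto-refresh afterwards)
        scr_auth()

        ## 3. download the data
        data <- search_analytics(siteURL = website,
                                 startDate = start_date,
                                 endDate = end_date,
                                 dimensions = download_dimensions,
                                 searchType = type)

        ## 4. write a csv to the folder the script is run in
        filename <- paste("search_analytics",
                          Sys.Date(),
                          paste(download_dimensions, collapse = "_"),
                          type, "data.csv", sep = "-")

        write.csv(data, filename, row.names = FALSE)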

    This should give you nice juicy data.

    Considerations

    The first time you will need to run the scr_auth() script yourself so you can give the package access, but afterwards it will auto-refresh the authentication each time you run the script.

    If you ever need a new user to be authenticated, run scr_auth(new_user=TRUE)

    You may want to modify the script so it appends to a file instead, rather than having a daily dump, although I do this with a folder of .csv's to import them all into one R dataframe (which you could export again to one big .csv)

    Automation

    You can now take the download script and use it in automated batch files, to run daily.

    In Windows, this can be done like this (from SO)

    • Open the scheduler: START -> All Programs -> Accessories -> System Tools -> Scheduler
    • Create a new Task
    • under tab Action, create a new action
    • choose Start Program
    • browse to Rscript.exe which should be placed e.g. here:
      "C:\Program Files\R\R-3.2.0\bin\x64\Rscript.exe"
    • input the name of your file in the parameters field
    • input the path where the script is to be found in the Start in field
    • go to the Triggers tab
    • create new trigger
    • choose that task should be done each day, month, ... repeated several times, or whatever you like

    In Linux, you can probably work it out yourself :)

    Conclusion

    Hopefully this shows how with a few lines of R you can get access to this data set.  I'll be doing more posts in the future using this package, so if you have any feedback let me know and I may be able to post about it.  If you find any bugs or features you would like, please also report an issue on the searchConsoleR issues page on Github.

Enhance Your Google Analytics Data with R and Shiny (Free Online Dashboard Template) (2015-07-19)

    Introduction

    The aim of this post is to give you the tools to enhance your Google Analytics data with R and present it on-line using Shiny.  By following the steps below, you should have your own on-line GA dashboard, with these features:

    • Interactive trend graphs.

    • Auto-updating Google Analytics data.

    • Zoomable day-of-week heatmaps.

    • Top Level Trends via Year on Year, Month on Month and Last Month vs Month Last Year data modules.

    • A MySQL connection for data blending your own data with GA data.

    • An easy upload option to update a MySQL database.

    • Analysis of the impact of marketing events via Google's CausalImpact.

    • Detection of unusual time-points using Twitter's Anomaly Detection.

    A lot of these features are either unavailable in the normal GA reports, or only possible in Google Analytics Premium.  Under the hood, the dashboard is exporting the data via the Google Analytics Reporting API, transforming it with various R statistical packages and then publishing it on-line via Shiny.

    A live demo of the dashboard template is available on my Shinyapps.io account with dummy GA data, and all the code used is on Github here.

    Feature Detail

    Here are some details on what modules are within the dashboard.  A quick start guide on how to get the dashboard running with your own data is at the bottom.

    Trend Graph

    Most dashboards feature a trend plot, so you can quickly see how you are doing over time.  The dashboard uses dygraphs javascript library, which allows you to interact with the plot to zoom, pan and shift your date window.  Plot smoothing has been provided at the day, week, month and annual level.

    [Screenshot: interactive trend graph]

    Additionally, the events you upload via the MySQL upload also appear here, as well as any unusual time points detected as anomalies.  You can go into greater detail on these in the Analyse section.

    Heatmap

    Heatmaps use colour intensity to show metrics between categories.  The heatmap here is split into weeks and days of the week, so you can quickly scan to see if a particular day of the week is popular - in the below plot, Monday/Tuesday look like they are the best days for traffic.

    [Screenshot: day-of-week heatmap]

    The data window is set by what you select in the trend graph, and you can zoom for more detail using the mouse.

    Top Level Trends

    Quite often a headline just needs a number you can check quickly.  These data modules give you a quick glance at how you are doing, comparing last week to the week before, last month to the month before and last month to the same month the year before.  Between them, you should see how your data is trending, accounting for seasonal variation.

    [Screenshot: top level trend modules]

    MySQL Connection

    The code provides functions to connect to a MySQL database, which you can use to blend your data with Google Analytics, provided you have a key to link them on.  

    [Screenshot: MySQL data upload interface]

    In the demo dashboard the key used is simply the date, but this can be expanded to include linking on a userID from say a CRM database to the Google Analytics CID, Transaction IDs to off-line sales data, or extra campaign information to your campaign IDs.  An interface is also provided to let end users update the database by uploading a text file.

    CausalImpact

    In the demo dashboard, the MySQL connection is used to upload Event data, which is then used to compare with the Google Analytics data to see if the event had a statistically significant impact on your traffic.  This replicates a lot of the functionality of the GA Effect dashboard.

    [Screenshot: CausalImpact event analysis]

    Headline impact of the event is shown in the summary dashboard tab.  If it's statistically significant, the impact is shown in blue.

    [Screenshot: event impact summary]

    Anomaly Detection

    Twitter has released this R package to help detect unusual time points for use within their data streams, which is also handy for Google Analytics trend data.  

    [Screenshot: anomaly detection]

    The annotations on the main trend plot are indicated using this package, and you can go into more detail and tweak the results in the Analyse section.

    Making the dashboard multi-user

    In this demo I’ve taken the usual use case of an internal department just looking to report on one Google Analytics property, but if you would like end users to authenticate with their own Google Analytics property, it can be combined with my shinyga() package, which provides functions which enable self authentication, similar to my GA Effect/Rollup/Meta apps.

    In production, you can publish the dashboard behind a Shinyapps authentication login (needs a paid plan), or deploy your own Shiny Server to publish the dashboard on your company intranet.

    Quick Start

    Now you have seen the features, the below goes through the process for getting this dashboard for yourself. This guide assumes you know of R and Shiny - if you don’t then start there: http://shiny.rstudio.com/

    You don’t need to have the MySQL details ready to see the app in action, it will just lack persistent storage.

    Setup the files

    1. Clone/copy-paste the scripts in the github repository to your own RStudio project.

    2. Find your GA View ID you want to pull data from.  The quickest way to find it is to login to your Google Analytics account, go to the View then look at the URL: the number after “p” is the ID.

    3. [Optional] Get your MySQL setup with a user and IP address. See next section on how this is done using Google Cloud SQL.  You will also need to white-list the IP of where your app will sit, which will be your own Shiny Server or shinyapps.io. Add your local IP for testing too. If using shinyapps.io their IPs are: 54.204.29.251; 54.204.34.9; 54.204.36.75; 54.204.37.78.

    4. Create a file called secrets.R in the same directory as the app, with the below content filled in with your details.
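    The variable names below are only illustrative (the canonical secrets.R is in the GitHub repository) - it is simply a file holding your GA View ID and MySQL credentials:

        ## secrets.R - example only, fill in with your own details
        gaViewId <- "123456"                       ## the GA View ID from step 2

        options(mysql = list(
          host         = "your.mysql.ip.address",  ## the static IP from Cloud SQL
          port         = 3306,
          user         = "yourusername",
          password     = "yourpassword",
          databaseName = "onlinegashiny"
        ))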

    Configuring R

        1. Make sure you can install and run all the libraries needed by the app:

        2. Run the below command locally first, to store the auth token in the same folder.  You will be prompted to login with the Google account that has access to the GA View ID you put into step 3, and get a code to paste into the R console.  This will then be uploaded with app and handle the authentication with Google Analytics when in production:

            > rga::rga.open(where="token.rga")

        3. Test the app by hitting the “Run App” button at the top right of the ui.R/server.R scripts in RStudio, or by running:

            > shiny::runApp()

    Using the dashboard

    1. The app should now be running locally in a browser window with your own GA data.  It can take up to 30 seconds for all the data to load first time.

    2. Deploy the instance on-line to Shinyapps.io with a free account there, or to your own Shiny Server instance.

    3. Customise your instance. If for any reason you don’t want certain features, then remove the feature in the ui.R script - the data is only called when the needed plot is viewed.

    Getting a MySQL setup through Google Cloud SQL

    If you want a MySQL database to use with the app, I use Google Cloud SQL.  Setup is simple:
    1. Go to the Google API console and create a project if you need to.

    2. Make sure you have billing turned on with your billing accounts menu top right.

    3. Go to Storage > Cloud SQL in the left hand menu.

    4. Create a New Instance.

    5. Create a new Database called “onlinegashiny”

    6. Under “Access Control” you need to put in the IP of yourself where you test it, as well as the IPs of the Shiny Server/shinyapps.io.  If you are using shinyapps.io the IPs are: 54.204.29.251; 54.204.34.9; 54.204.36.75;54.204.37.78

    7. Under “IP Address” create a static IP (Charged at $0.24 a day)

    8. You now should have all the access info you need to put in the apps secrets.R for MySQL access.  The port should be a default 3306

    9. You can also limit the amount of data that is uploaded by the shiny.maxRequestSize option - default is 0.5 MB.
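    For example, to raise the upload limit to 1 MB you would set, in the app's R code:

        options(shiny.maxRequestSize = 1 * 1024^2)   ## 1 MB upload limit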

    Summary

    Hopefully the above could help inspire what can be done with your Google Analytics data.  Focus has been on trying to give you the tools that allow action to be made on your data.

    There is a lot more you can do via the thousands of R packages available, but hopefully this gives a framework you can build upon.

    I’d love to see what you build with it, so do please feel free to get in touch. :)
My new role as Google Developer Expert for Google Analytics! (2015-04-25)

    I'm very pleased and honoured to have been accepted into the Google Developer Expert program representing Google Analytics.  I should soon have my mug listed with the other GA GDE's at the Google Developer Expert website.

    My thanks go to Simo who nominated me and Linda for helping me through the application process.

    Alongside my existing work at Wunderman, my role should include some more opportunities to get out there and show what can be done with the GA APIs, so expect me at more analytics conferences soon.

    I also will get to play with some of the new betas and hopefully be able to create more cool demo apps for users to adapt and use for their own website, mostly using R Shiny and Google App Engine.

How I made GA Effect - creating an online statistics dashboard using R (2015-02-20)

    GA Effect is a web app that uses Bayesian structural time-series to judge whether events happening in your Google Analytics account are statistically significant.  It's been well received on Twitter, and how to use it is detailed in this guest post on Online Behaviour, but this blog post will be about how to build your own or similar.

    Update 18th March: I've made a package that holds a lot of the functions below, shinyga.  That may be easiest to work with.

    [Screenshot: the GA Effect app]

    What R can do

    Now is a golden time for the R community, as it gains popularity outside of its traditional academic background and hits business.  Microsoft has recently bought Revolution Analytics, an enterprise solution of R so we can expect a lot more integration with them soon, such as the machine learning in their Azure platform.

    Meanwhile RStudio are releasing more and more packages that make it quicker and easier to create interactive graphics, with tools for connecting and reshaping data and then plotting using attractive JavaScript visualisation libraries or native interactive R plots.  GA Effect is also being hosted using ShinyApps.io, an R server solution that enables you to publish straight from your console, or you can run your own server using Shiny Server.  

    Packages Used

    For the GA Effect app, the key components were these R packages:
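    The package list was embedded as a gist; reconstructed from the components discussed below, it was roughly:

        library(shiny)          ## the web app framework
        library(shinydashboard) ## dashboard theme for Shiny
        library(rga)            ## fetching the Google Analytics data
        library(CausalImpact)   ## Bayesian structural time-series
        library(dygraphs)       ## interactive JavaScript plots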

    Putting them together

    Web Interaction

    First off, using RStudio makes this all a lot easier as they have a lot of integration with their products.

    ShinyDashboard is a custom theme of the more general Shiny.  As detailed in the getting started guide, creating a blank webpage dashboard with shinydashboard takes 8 lines of R code.  You can test or run everything locally first before publishing to the web via the “Publish” button at the top.
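    For reference, the blank dashboard from the getting started guide is just:

        library(shiny)
        library(shinydashboard)

        ui <- dashboardPage(
          dashboardHeader(title = "Blank dashboard"),
          dashboardSidebar(),
          dashboardBody()
        )

        server <- function(input, output) { }

        shinyApp(ui, server)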

    Probably the most difficult concept to get your head around is the reactive programming functions in a Shiny app.  This is effectively how the interaction occurs, and it sets up live relationships between inputs from your UI script (always called ui.R) and outputs from your server-side script (called server.R).  These are your effective front-end and back-end in a traditional web environment.  The Shiny package takes your R code and changes it into HTML5 and JavaScript. You can also import JavaScript of your own if you need it to cover what Shiny can’t.

    The Shiny code then creates the UI for the app, and creates reactive versions of the datatables needed for the plots.

    Google Authentication

    The Google authentication flow uses OAuth2 and could be used for any Google API in the console, such as BigQuery, Gmail, Google Drive etc.  I include the code used for the authentication dance below so you can use it in your own apps:
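    The original gist isn't shown here; a stripped-down sketch of the same OAuth2 dance using httr (the client details are placeholders from your own Google API console project) looks like this:

        library(httr)

        client_id     <- "xxx.apps.googleusercontent.com"   ## placeholders
        client_secret <- "xxxxxxxx"
        redirect_uri  <- "https://your-app.shinyapps.io/your-app/"

        ## step 1: send the user to Google's consent screen
        auth_url <- modify_url("https://accounts.google.com/o/oauth2/auth",
                               query = list(
                                 response_type = "code",
                                 client_id     = client_id,
                                 redirect_uri  = redirect_uri,
                                 scope = "https://www.googleapis.com/auth/analytics.readonly"))

        ## step 2: Google redirects back with ?code=..., which is swapped for a token
        get_token <- function(code){
          req <- POST("https://accounts.google.com/o/oauth2/token",
                      body = list(code          = code,
                                  client_id     = client_id,
                                  client_secret = client_secret,
                                  redirect_uri  = redirect_uri,
                                  grant_type    = "authorization_code"),
                      encode = "form")
          content(req)$access_token
        }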

    Fetching Google Analytics Data

    Once a user has authenticated with Google, the user token is then passed to rga() to fetch the GA data, according to which metric and segment the user has selected. 

    This is done reactively, so each time you update the options a new data fetch to the API is made.  Shiny apps are on a per user basis and work in RAM, so the data is forgotten once the app closes down.

    Doing the Statistics

    You can now manipulate the data however you wish.  I put it through the CausalImpact package as that was the application goal, but you have a wealth of other R packages that could be used such as machine learning, text analysis, and all the other statistical packages available in the R universe.  It really is only limited by your imagination. 

    Here is a link to the CausalImpact paper, if you really want to get in-depth with the methods used.  It includes some nice examples of predicting the impact of search campaign clicks.

    Here is how CausalImpact was implemented as a function in GA Effect:
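    The gist itself isn't reproduced; a minimal sketch of wrapping CausalImpact for the app (the function and argument names here are hypothetical) is:

        library(zoo)
        library(CausalImpact)

        ## data: a data.frame with a date column and the chosen GA metric
        ## event_date: the date of the marketing event being tested
        run_causal_impact <- function(data, event_date, post_days = 44){

          series <- zoo(data$metric, order.by = as.Date(data$date))

          pre.period  <- c(start(series), as.Date(event_date) - 1)
          post.period <- c(as.Date(event_date),
                           min(end(series), as.Date(event_date) + post_days))

          CausalImpact(series, pre.period, post.period)
        }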

    Plotting

    dygraphs is an R package that takes R input and outputs the JavaScript needed to display it in your browser, and as it's made by RStudio it is also compatible with Shiny.  It is an application of htmlwidgets, which lets you take any JavaScript library and make it compatible with R code.  Here is an example of how the main result graph was generated:
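    A sketch of the dygraphs/Shiny wiring (the reactive names here are illustrative, not the original gist):

        library(dygraphs)

        ## server.R - impact_series() is a reactive returning the CausalImpact
        ## series (observed response plus predicted value and bounds)
        output$main_plot <- renderDygraph({

          dygraph(impact_series(), main = "Expected vs observed") %>%
            dySeries("response", label = "Observed") %>%
            dySeries(c("point.pred.lower", "point.pred", "point.pred.upper"),
                     label = "Expected") %>%
            dyEvent(input$event_date, "Event", labelLoc = "bottom") %>%
            dyRangeSelector()
        })

        ## ui.R
        ## dygraphOutput("main_plot")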

    Publishing

    I’ve been testing the alpha of shinyapps.io for a year now, but it is just this month (Feb 2015) coming out of beta.  If you have an account, then publishing your app is as simple as pushing the “Publish” button above your script, after which it appears at a public URL.  With the paid plans, you can limit access to authenticated users only.

    Next steps

    This app only took me 3 days with my baby daughter on my lap during a sick weekend, so I’m sure you can come up with similar given time and experience.  The components are all there now to make some seriously great apps for analytics.  If you make something do please let me know!

OSX Black Screen no Login screen but with working cursor on boot [fixed] (2015-02-04)

    I'm just posting this, to maybe help others who get the same problem.

    I had an OSX 10.10.2 update on my 2011 Macbook Air, and left the laptop open last night.  This put it in Hibernation mode which breaks the auto-installation, so when I tried to use the laptop this morning, it booted to the Apple logo, but then the screen went totally black without the option to login.  The cursor was still live though.

    The fix below will let you login again.  It will only work in the above scenario - if it's your backlight that's broken, or something else, keep searching :)

    Before the below fix I tried:

    1. Pressing the increase brightness buttons (duh)
    2. Restarting in safe mode (doesn't complete login)
    3. Resetting SMC and PRAM (pushing CTRL+OPTION+POWER+other buttons on powerup - see here: https://discussions.apple.com/docs/DOC-3603 )
    4. Letting it boot, waiting, then pushing first letter of your username, pushing enter and typing in password (the most popular fix on the web)

    But finally, the solution was found at this forum called Jamfnation via some Google-wu:

    1. Perform a PRAM reset ( Cmd+Option+P+R ) on boot – let it chime 3 times and let go
    2. Boot to Single User Mode (hold Cmd+S immediately after powering on)
    3. Verify and mount the drive - once in Single User Mode, run the following commands:
       /sbin/fsck -fy
       /sbin/mount -uw /
    4. After the disk has mounted, delete the following files:
       rm -f /Library/Preferences/com.apple.loginwindow.plist
       rm -f /var/db/.AppleUpgrade
    5. After deleting the files, restart.

    Hope it helps if you get this far.

E-mail open rate tracking with Google Analytics' Measurement Protocol - Demo (2014-11-23)

    Edit 4th Feb 2015 - Google have published an email tracking guide with the Measurement Protocol.  The below goes a bit beyond that showing how to link the user sessions etc.

    The Measurement Protocol was launched at the same time as Universal Analytics, but I've seen less adoption of it with clients, so this post is an attempt to show what can be done with it with a practical example.

    The demo app is available here: http://ua-post-to-push.appspot.com/

    With this demo you should be able to track the following:

    1. You have an email address from an interested customer
    2. You send them an email and they look at it, but don't click through.
    3. Three days later they open the email again at home, and click through to the offer on your website.
    4. They complete the form on the page and convert.

    Within GA, you will be able to see for that campaign 2 opens, 1 click/visit and 1 conversion for that user.  As with all email open tracking, you are dependent on the user downloading the image, which is why I include the option to upload an image and not just a pixel, as it may be more enticing to allow images in your newsletter.

    Intro

    The Measurement Protocol lets you track beyond the website, without the need of client-side JavaScript.  You construct the URL and when that URL is loaded, you see the hit in your Google Analytics account.  That's it. 

    The clever bit is that you can link user sessions together via the CID (Client ID), so you can track the upcoming Internet of Things off-line to on-line, but also things like email opens and affiliate thank you pages.  It also works with things like enhanced e-commerce, so can be used for customer refunds or product impressions.

    This demo looks at e-mail opens for its example, but it takes only minor modifications to track other things.  For instance, I use a similar script to measure in GA when my Raspberry Pi is backing up our home computers via Time Machine.
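    The demo itself is Python on App Engine, but the protocol is language-agnostic; here is a minimal sketch of the same kind of hit from R with httr, with placeholder property and client IDs:

        library(httr)

        ## send a virtual pageview recording an email open
        mp_hit <- function(cid, campaign){
          POST("https://www.google-analytics.com/collect",
               body = list(
                 v   = 1,                 ## protocol version
                 tid = "UA-XXXXX-Y",      ## your GA property (placeholder)
                 cid = cid,               ## client ID that links the sessions
                 t   = "pageview",        ## hit type
                 dp  = "/email/open",     ## virtual page path
                 cn  = campaign,          ## campaign name
                 cm  = "email"            ## campaign medium
               ),
               encode = "form")
        }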

    Demo on App Engine

    To use the Measurement Protocol in production most likely needs server-side code.  I'm running a demo on Google App Engine coded in Python, which is pretty readable so should make it fairly easy for a developer to replicate in their favourite language.  App Engine is also a good choice if you are wanting to run it in production, since it has a free tier for tracking 1000s of email opens a day, but scalability to handle millions.

    This code is available on Github here - http://github.com/MarkEdmondson1234/ga-get-to-post

    App running that code is here: http://ua-post-to-push.appspot.com/

    There are instructions on Github on how it works, but I'll run through some of the key concepts here in this post.

    What the code does

    The example has four main URLs:
    • The homepage explaining the app
    • The image URL itself, that when loaded creates the hit to GA
    • A landing page with example custom GA tracking script
    • An upload image form to change the image you would display in the e-mail.
    The URLs above are controlled server side with the code in main.py

    Homepage

    This does nothing server side apart from serving up the page.



    Image URL


    This is the main point of the app - it turns a GET request for the uploaded image into a POST with the parameters found in the URL.  It handles the different options and sends the hit to GA as a virtual pageview or event, with a unique user CID and campaign name. An example URL here is:
    http://your-appengine-id.appspot.com/main.png?cid=blah&p=1&c=email_campaign



    Landing Page


    This does little but take the cid you put in the email URL and output the CID that will be used in Google Analytics.  If this is the same CID as in the image URL and the user clicks through from the email, those sessions will be linked. You can also add the GA campaign parameters, but the server-side script ignores those - the JavaScript on the page will take care of it. An example URL here is:
    http://your-appengine-id.appspot.com/landing-page?cid=blah&utm_source=source_me&utm_medium=medium_me&utm_campaign=campaign_me


    The CID in the landing page URL is then captured and turned into an anonymous CID for GA.  This is then served up to the Universal Analytics JavaScript on the landing page, shown below.  Use the same UA code for both, else it won't work (e.g. UA-123456-1)


    Upload Image

    This just handles the image uploading and serves the image up via App Engine's blobstore.  Nothing pertinent to GA here, so see the GitHub code if interested.

    Summary

    It's hoped this helps sell using the Measurement Protocol to more developers, as it offers a solution to a lot of the problems with digital measurement today, such as attribution of users beyond the website.  The implementation is reasonably simple, but the power is in what you send and in what situations.  Hopefully this inspires what you could do with your setup.

    There are some limitations to be aware of - the CID linking won't stitch sessions together, it just discards a user's old CID if they already had one, so you may want to look at userID or how to customise the CID for users who visit your website first before the email is sent.  The best scenario would be if a user is logged in for every session, but this may not be practical.  It may be that the value of linking sessions is so advantageous in the future, entire website strategies will be focused on getting users to ID themselves, such as via social logins.

    Always consider privacy: look for users to opt in, and make sure to use GA filters to take out any PII you may put into GA as a result.  Current policy looks to be that if the data within GA is not able to be traced to an individual (e.g. a name, address or email) then you are able to record an anonymous personal ID, which could be exported and linked to PII outside of GA.  This is a bit of a shifting target, but in all cases keeping it as user-focused and not profit-focused as possible should see you through any ethical questions.






Finding the ROI of Title tag changes using Google's CausalImpact R package (2014-11-02)

    After a conversation on Twitter about this new package, and mentioning it in my recent MeasureCamp presentation, here is a quick demo on using Google's CausalImpact applied to an SEO campaign.

    CausalImpact is a package that looks to give some statistics behind changes you may have done in a marketing campaign.  It examines the time-series of data before and after an event, and gives you some idea on whether any changes were just down to random variation, or the event actually made a difference.

    You can now test this yourself in my Shiny app that automatically pulls in your Google Analytics data so that you can apply CausalImpact to it.   This way you can A/B test changes for all your marketing channels, not just SEO.  However, if you want to try it manually yourself, keep reading.

    Considerations before getting the data

    Suffice to say, it should only be applied to time-series data (e.g. there is date or time on the x-axis), and it helps if the event was rolled out at only one of those time points.  This may influence the choice of time unit you use, so if say it rolled out over a week it's probably better to use weekly data exports.  Also consider the time period you choose.  The package will use the time-series before the event to construct what it thinks should have happened vs what actually happened, so if anything unusual or any spikes occur in that earlier period it may affect your results.

    Metrics wise the example here is with visits.  You could perhaps do it with conversions or revenue, but then you may get affected by factors outside of your control (the buy button breaking etc.), so for clean results try to take out as many confounding variables as possible. 

    Example with SEO Titles

    For me though, I had an example where some title tag changes went live on one day, so could compare the SEO traffic before and after to judge if it had any effect, and also more importantly judge how much extra traffic had increased.

    I pulled in data with my go-to GA R import library, rga by Skardhamar.

    Setup

    I first set up, importing the libraries if you haven't got them already and authenticating the GA account you want to pull data from.
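    Something like the below (rga installs from GitHub; the CausalImpact install line is only needed if you don't already have it):

        ## install from GitHub if needed
        # devtools::install_github("skardhamar/rga")
        # devtools::install_github("google/CausalImpact")

        library(rga)
        library(CausalImpact)

        ## authenticate - creates the "ga" object used for the data fetches
        rga.open(instance = "ga")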

    Import GA data

    I then pull in the data for the time period covering the event.  SEO Visits by date.
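    A sketch of the fetch, with a placeholder View ID and date range, filtering to organic traffic:

        id <- "123456"   ## your GA View (profile) ID

        seo_visits <- ga$getData(id,
                                 start.date = "2014-01-01",
                                 end.date   = "2014-09-01",
                                 metrics    = "ga:visits",
                                 dimensions = "ga:date",
                                 filters    = "ga:medium==organic",
                                 batch = TRUE)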

    Apply CausalImpact

    In this example, the title tags got updated on the 200th day of the time-period I pulled.  I want to examine what happened the next 44 days.
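    Which looks something like this (the 200/44 split follows the text above; the column name depends on your fetch):

        ## title tags changed on day 200; examine the following 44 days
        pre.period  <- c(1, 200)
        post.period <- c(201, 244)

        impact <- CausalImpact(seo_visits$visits, pre.period, post.period)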

    Plot the Results

    With the plot() function you get output like this:

    1. The left vertical dotted line is where the estimate on what should have happened is calculated from.
    2. The right vertical dotted line is the event itself. (SEO title tag update)
    3. The original data you pulled is the top graph.
    4. The middle graph shows the estimated impact of the event per day.
    5. The bottom graph shows the estimated impact of the event overall.

    In this example it can be seen that after 44 days there is an estimated 90,000 more SEO visits from the title tag changes. This then can be used to work out the ROI over time for that change.

    Report the results

    The $report method gives you a nice overview of the statistics in a verbose form, to help qualify your results.  Here is a sample output:

    "During the post-intervention period, the response variable had an average value of approx. 94. By contrast, in the absence of an intervention, we would have expected an average response of 74. The 95% interval of this counterfactual prediction is [67, 81]. Subtracting this prediction from the observed response yields an estimate of the causal effect the intervention had on the response variable. This effect is 20 with a 95% interval of [14, 27]. For a discussion of the significance of this effect, see below.

    Summing up the individual data points during the post-intervention period (which can only sometimes be meaningfully interpreted), the response variable had an overall value of 4.16K. By contrast, had the intervention not taken place, we would have expected a sum of 3.27K. The 95% interval of this prediction is [2.96K, 3.56K].

    The above results are given in terms of absolute numbers. In relative terms, the response variable showed an increase of +27%. The 95% interval of this percentage is [+18%, +37%].

    This means that the positive effect observed during the intervention period is statistically significant and unlikely to be due to random fluctuations. It should be noted, however, that the question of whether this increase also bears substantive significance can only be answered by comparing the absolute effect (20) to the original goal of the underlying intervention.

    The probability of obtaining this effect by chance is very small (Bayesian tail-area probability p = 0.001). This means the causal effect can be considered statistically significant."

    Next steps

    This could then be repeated for things like UX changes, TV campaigns, etc. You just need the time of the event and the right metrics or KPIs to measure them against.

    The above is just a brief intro, there is a lot more that can be done with the package including custom models etc, for more see the package help file and documentation.

My Google Webmaster Tools Downloader app (2014-10-20)

    Here is a tool that I have used for SEO analytics, that I am now making publicly available. It extends Google Webmaster Tools to help answer common SEO questions more easily.

    Visit the Google Webmaster Tools Downloader

    Here are a few example questions it helps answer:

    • SEO keyword rankings taking into account personalisation and localisation for Google, in this age of (not provided)
    • SEO keyword performance beyond the 90 days available by default e.g. year on year comparisons
    • How a segment of keywords have performed over time e.g. brand vs non-brand
    • How click through rates change over time e.g. after a website migration.
    • How new/old website sections perform in Google search via the Top Pages reports

    These things were a lot easier before (not provided) took keywords out of web analytics.  This left Google Webmaster Tools as the only reliable source of rankings, but it was not an ideal replacement, with limitations that needed to be worked around by downloading data via an API - an API that rarely gets updated.

    I'm aware this app could quickly become obsolete if Google updated GWT, but it has also served as a great project for me to get to know working with App Engine, jinja2 templating, Google Charts, caching, Stripe, Bootstrap, etc., so it's all been worthwhile - I think I can safely say it's been the most educational project I've done, and it can serve as another template for more sophisticated APIs (the Google Tag Manager API is in my sights).

    It's also my first app that will be charged for, simply because keeping a daily breakdown of keywords in a database carries a cost, which is probably why Google don't offer it for free at the moment. There are other web apps on the market that do downloads for free, but I am wary of those by the adage "if you don't pay for a service, you pay with your data".

    I plan to follow it up with deeper features, including Tableau examples of what you can do with this data once you have it at such a detailed level.

    For now, if you want to sign up to test the alpha, please check out the signup page here

My Weekend Away To London at #MeasureCamp (2014-09-21)

    This weekend I was in London for MeasureCamp, which is an analytics (un)conference with the emphasis on practical advice and knowledge sharing.  You can follow the tweets on their hashtag here: #MeasureCamp

    It was my first conference in 5 years, since my general opinion of conferences is that they are largely a waste of time apart from meeting up with people.  But my tweetstream was filled with talk of this new conference format: no sponsored content; all the sessions provided by the attendees; "the law of two feet" encouraging a culture of being able to walk out of a session if it was too boring; and free food.  Since I'm always hungry, and the attendees are all world experts in their field, the content looked like it would be top-rate.

    After the morning introduction, there was a scrum to get your session up on the board.

    Again from Twitter feedback, I was encouraged to prepare a practical session on using R for a digital analyst.  My session in the morning was a bit nerve racking, but I managed to finish on time and have had great feedback, thanks very much to all who attended.  My presentation is here: Using R in a digital analytics workflow

    I wish I could have attended more of the sessions, such as Simo's GTM wizardry or @fastbloke's Embed API session, but of the ones I did attend I generally got inspired or reassured about what we're doing at work.  One thing I'd have liked to see more of was how people are approaching attribution, but it's a pretty old subject and maybe people were a bit loath to do sessions on that.

The venue was great, and free food and beer was provided.  Lots of rooms, ranging from big to small, with a central area to bump into people.

    My quick review of my sessions:

• Unifying Customer Data - this was probably my worst session of the day, as it seemed a bit salesy and not very clear on what the takeaways were.  I left early to prepare for my session
    • Custom dimensions and metrics - a discussion on what people were using custom metrics/dimensions for.  No real surprises.
• From Anonymous to Identify - interesting discussion on whether, once you ID a user, you could use older data on that same cookie to reliably make offers to the user.  You probably could if you don't need a 100% match, but for making financial offers you really need that 100% authentication post-login.
    • Backroom to boardroom - probably most useful take-aways for me on the day, on how to communicate the tools of analytics to the C-suite to get investment.
• Retargeting discussion - comparing notes on how others are utilising remarketing.  Was a bit focused on GA segment remarketing lists and not other platforms such as AdForm, but interesting to hear what people are doing.
• Pings to Predictions - talk by the tech team of Reach.ly on how they are making a real-time, machine learning behavioural analytics tool.  Was very cool, and a nice team from Latvia whom I met in the pub the night before and afterwards.

But the sessions were just a part of it; around them was lots of meeting up with people I've known digitally for a long time but now met face-to-face, old friends, and new friends from Poland, France, Denmark and the UK.

    All in all I would recommend going, and look forward to being able to attend again.  Thanks to all the #MeasureCamp team on their hard work and enthusiasm :)

    ]]>
    Mark Edmondson
    tag:markedmondson.me,2013:Post/722553 2014-08-03T20:45:13Z 2014-08-03T20:45:13Z Two Tone Rag - An amble around soundscapes.

    I liked the sound of an organ on another track, so I gave it four tracks of its own, with drums.  Needs real drums one day - Cem?

    It veers off half-way through into a slowed down 80s theme tune. I like listening to it, it calms me. That may be the crotales at the end. 




    ]]>
    Mark Edmondson
    tag:markedmondson.me,2013:Post/708747 2014-07-30T14:22:00Z 2017-09-08T16:24:20Z Run R, RStudio and OpenCPU on Google Compute Engine [free VM image]

    File this under "what I wished was on the web whilst trying to do this myself."

edit 20th November, 2016 - now everything in this post is abstracted away and available in the googleComputeEngineR package - I would say it's a lot easier to use that.  Here is a post on getting started with it. http://code.markedmondson.me/launch-rstudio-server-google-cloud-in-two-lines-r/
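For reference, the launch with googleComputeEngineR looks roughly like the sketch below - this assumes you have already configured the package's environment variables for your auth file, project and zone, and the machine name, username and password are illustrative placeholders, not a definitive recipe:

library(googleComputeEngineR)

# launch an RStudio Server VM from the package's pre-built template
# (name and credentials below are placeholders - change them)
vm <- gce_vm(template = "rstudio",
             name = "rstudio-server",
             username = "mark",
             password = "change-this-password",
             predefined_type = "n1-standard-1")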

    edit 30th April, 2016: I now have a new post up on how to install RStudio Server on Google Compute Engine using Docker, which is a better way to do it. 

    edit 30th Nov, 2015: Oscar explains why some users couldn't use their username

    edit 5th October: Added how to login, add users and migrated from gcutil to gcloud

    Google Compute Engine is a very scalable and quick alternative to Amazon Web Services, but a bit less evolved in the images available for users. 

If you would like to have a VM with R 3.0.1, RStudio Server 0.98 and OpenCPU installed, then you can click on the link below and install a pre-configured version for you to build upon.

    With this image, you have a cloud server with the most popular R / Cloud interfaces available, which you can use to apply statistics, machine learning or other R applications on web APIs.  It is a fundamental building block for a lot of my projects.

    The VM image is here. [940.39MB]

    To use, follow these steps:

    Downloading the instance and uploading to your project

    1. Create your own Google Cloud Compute project if you haven't one already.
2. Put in billing details.  Here are the prices you'll pay for running the machine. It's usually under $10 a month.
    3. Download the image from the link above (and here) and then upload it to your own project's Cloud Storage. Details here
4. Add the uploaded image to your project with a name that contains only lowercase letters, numbers or hyphens (-).  Details here. You can do this using gcloud and typing: 
    $ gcloud compute images create IMAGE_NAME --source-uri URI

    Creating the new Instance

    1. Now go to Google Compute Engine, and select Create New Instance
2. Select the zone and machine type you want (e.g. you can temporarily select a 50GB RAM machine if needed for big jobs)
    3. In the dropdown for images you should be able to see the image from step 4 above.  Here is a screenshot of how it should look, I called my image "r-studio-opencpu20140628"

    Or, if you prefer using command line, you can do the steps above in one command with gcloud like this:

    $ gcloud compute instances create INSTANCE [INSTANCE ...] --image IMAGE

    Using your instance

    You should now have RStudio running on http://your-ip-address/rstudio/ and openCPU running on http://your-ip-address/ocpu/test and a welcome homepage running at the root http://your-ip-address

To log in, your Google username is an admin, since you created the Google Cloud project. See here for adding users to Google Cloud projects

    If you don't know your username, try this command using gcloud to see your user details:

    $ gcloud auth login

    Any users you add to Debian running on the instance will have a user in RStudio - to log into Debian and add new users, see below:

    $ ## ssh into the running instance
    $ gcloud compute ssh <your-username>@new-instance-name
    $ #### It should now tell you that you are logged into your instance #####
    $ #### Once logged in, add a user: example with jsmith
    $ sudo useradd jsmith
    $ sudo passwd jsmith
    $ ## give the new user a directory and change ownership to them
$ sudo mkdir /home/jsmith
$ sudo chown jsmith:users /home/jsmith

    Oscar in the comments below also explains why sometimes your username may not work:

    Like other comments, my username did not work.

    Rather than creating a new user, you may need to simply add a password to your user account:

    $ sudo passwd .

    Also, the username will be your email address with the '.' replaced with '_'. So xx.yy@gmail.com became xx_yy

    You may also want to remove my default user the image comes with:

    $ sudo userdel markedmondson

    ...and remove my folder:

    $ sudo rm -rf /home/markedmondson

    The configuration used

    If you would like to look before you leap, or prefer to install this yourself, a recipe is below. It largely cobbles together the instructions around the web supplied by these sources:

    Many thanks to them.

    It covers installation on the Debian Wheezy images available on GCE, with the necessary backports:








    ]]>
    Mark Edmondson
    tag:markedmondson.me,2013:Post/706026 2014-06-28T19:03:53Z 2017-01-20T15:29:01Z How To Use R to Analyse and Plot Your Twitter Use

    Here is a little how-to if you want to use R to analyse Twitter.  This is the first of two posts: this one talks about the How, the second will talk about the Why.  

    If you follow all the code you should be able to produce plots like this:

As with all analytics projects, it's split into four different aspects: 1. getting the data; 2. transformations; 3. analysing; 4. plotting.

    All the code is available on my first public github project:

    https://github.com/MarkEdmondson1234/r-twitter-api-ggplot2

I did this project to help answer an idea: can I tell from my Twitter activity when I changed jobs or moved country?

I have the feeling that the more I do SEO, the more I rely on Twitter as an information source; whereas for Analytics it's more independent research that takes place on StackOverflow and GitHub. Hopefully this project can test whether that is valid.

    1. Getting the data

    R makes getting tweets easy via the twitteR package.  You need to install that, register your app with Twitter, then authenticate to get access to the Twitter API.

    Another alternative to using the API is to use Twitter's data export, which will then let you go beyond the 3200 limit in the API. This gives you a csv which you can load into R using read.csv()
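As a rough idea of what that looks like (a minimal sketch using the twitteR package - the credentials are placeholders you get when registering your app, and the screen name is illustrative):

library(twitteR)

# authenticate with the keys from your registered Twitter app (placeholders here)
setup_twitter_oauth(consumer_key    = "CONSUMER_KEY",
                    consumer_secret = "CONSUMER_SECRET",
                    access_token    = "ACCESS_TOKEN",
                    access_secret   = "ACCESS_SECRET")

# fetch up to the API's 3200-tweet limit for a user and flatten to a data.frame
tweets    <- userTimeline("your_screen_name", n = 3200, includeRts = TRUE)
tweets_df <- twListToDF(tweets)

# or load Twitter's own data export instead, which avoids the 3200 limit
# tweets_df <- read.csv("tweets.csv", stringsAsFactors = FALSE)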

    2. Transforming the data

    For my purposes, I needed to read the timestamps of the tweets, and put them into early, morning, afternoon and evening buckets, so I could then plot the data.  I also created a few aggregates of the data, to suit what I needed to plot, and these dataframes I outputted from my function in a list.

Again, as with most analytics projects, this section represents most of the work, with to and fro happening as I tweaked the data I wanted in the chart.  One tip I've picked up is to do these data transformations in a function that takes the raw data as input and outputs your processed data, as that makes it easier to repeat for different data inputs.
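To illustrate the kind of transformation (a simplified sketch, not the exact code in the repo - it assumes the tweets_df data.frame from the previous step, with the POSIXct created column that twitteR returns):

# bucket each tweet's timestamp into a day part and a week
tweets_df$hour    <- as.numeric(format(tweets_df$created, "%H"))
tweets_df$dayPart <- cut(tweets_df$hour,
                         breaks = c(-1, 5, 11, 17, 23),
                         labels = c("Early", "Morning", "Afternoon", "Evening"))
tweets_df$week    <- as.Date(cut(tweets_df$created, breaks = "week"))

# aggregate to tweet counts per week and day part, ready for plotting
tweetTD <- aggregate(list(tweets = tweets_df$created),
                     by = list(week = tweets_df$week, dayPart = tweets_df$dayPart),
                     FUN = length)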

    3. Analysing the data

This will be covered in the second post, and it is usually the point of the whole exercise - it only takes about 10% of the time on the project, but is the most important.

    4. Plotting the data

This part evolves as you go to and fro between steps 2-3, but what I ended up with were the functions below.

theme_mark() is a custom ggplot2 theme you can use if you want the plots to look exactly the same as above, or at the very least it shows how to customise ggplot2 to your own fonts/colours.  It also uses choosePalette() and installFonts(). "mrMustard" is my name for the colour scheme chosen.

    I use two layers in the plot - one is the area plot to show the total time spent per Day Part, the second is a smoother line to help pick out the trend better for each Day Part.

plotTweetsDP() takes as input the tweetTD (weekly) or tweetTDm (monthly) dataframes, and plots the daypart dataframe produced by the transformations above.  The timeAxis parameter expects a yearWeek or yearMonth setting, which it uses to make the x-axis better suit each.

    plotLinksTweets() is the same, but works on the tweetLinks dataframe.
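As a simplified stand-in for plotTweetsDP() (a sketch using a stock ggplot2 theme rather than theme_mark(), and the tweetTD shape from the transform sketch above):

library(ggplot2)

# two layers: an area for tweets per week per day part, plus a loess smoother
# to help pick out the trend for each day part
plotDayParts <- function(df) {
  ggplot(df, aes(x = week, y = tweets, fill = dayPart, colour = dayPart)) +
    geom_area(alpha = 0.4, position = "identity") +
    geom_smooth(se = FALSE, method = "loess") +
    labs(x = NULL, y = "Tweets per week") +
    theme_minimal()
}

plotDayParts(tweetTD)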


    I hope this is of some use to someone, let me know in the comments!  Also any ideas on where to go from here - at the moment I'm working through some text mining packages to try and get something useful out of those. 

    Again the full project code is available on Github here: https://github.com/MarkEdmondson1234/r-twitter-api-ggplot2

    ]]>
    Mark Edmondson
    tag:markedmondson.me,2013:Post/692033 2014-05-16T15:31:58Z 2014-10-27T19:22:32Z Five Things The UK Could Learn From Denmark, And Vice Versa

    I first arrived in Denmark October 1st, 2010, so I guess I'm qualified enough now to see the highs and lows of living in Denmark versus the UK.  I thought I'd list them here, if only for you to see what my Danish experience is like day to day.

    Five Things The UK Should Learn From Denmark

    I apologise if the below already exist in the UK, it'll be mostly because I came from a small town in deep dark Cornwall, and civilization hasn't reached us down there yet.

    1. A single duvet for each person in a double bed.  This was a revelation.  No more bed wars fighting for your corner of a compromised duvet, your own as you like it but close enough to cuddle your partner.
2. Supermarket logistics.  First, you can have a shopping basket with little wheels that you pull around behind you, like a shopping trolley but without the commitment. Second, once you arrive at the checkout there is a slidey thing that means you can pack your shopping whilst the customer behind you is served.  Genius. (edit: I have been reliably informed by Steph that the slidey thing is just lacking in Cornish supermarkets.  Oh well.)
    3. No cash.  Everything is paid for by your Dankort (a debit card).  Even things like a coffee or papers etc.  No minimum charges, no whinging from the shopkeeper about transaction costs.
4. A commitment to bicycles.  50% of people in Copenhagen commute by bike every day.  I have an hour of enforced exercise a day, no road rage stress, I breathe clean air, buildings aren't caked in soot, and the dull roar of cars is absent.  I think this is a major contributor to why Denmark is voted the happiest country in the world.
5. Work/Life balance.  Soon after I arrived from the UK and was working in the office, I found myself confused at around 4pm on a Friday.  Looking around I realised - the office was empty.  People had gone home, to eat dinner with their family.  Most had arrived at 8am after taking their kids to school.  I admit this isn't a habit I have yet cultivated, but its implications for the happiest country in the world are obvious.  It's not to say that the Danes are work-shy: it seems there is more personal responsibility given to people to finish their tasks, so they will work late when needed, just not to a prescribed 9-5.  In general there are more "flat" management structures.

    Five Things Denmark Should Learn From The UK

Obviously I like living in Denmark as I'm still here, but the honeymoon and disillusionment periods all expats seem to have mean I can see where improvements could be made, if not just for me then for all Danes:

    1. Banter with strangers.  My first months I mistakenly sat in a cafe (the closest I could find to an English pub) with the hope if I frequented it enough, I could strike up conversations with the locals.  All I got was embarrassment from the waiters and uncomfortable glances.  The way to meet people in Denmark is to arrange common interest groups, you can't just rock down to a local pub, as the equivalent isn't frequented by normal people, and even dropping in on a friend without pre-arrangement is uncomfortable for Danes.
2. Queueing. It's most marked when flying to and from the UK.  In the UK, the queue for passport control is based on order of arrival, and it doesn't matter whether you're old, young, or English.  When you arrive in Denmark, the queue is based on elbows, size, whether you're Danish and the amount of determination. Danes will even start a new mini-queue heading off the normal one that the English try to start, just to avoid that extra 2 minutes' wait. Danish shops have to employ ticketing systems, just to avoid the inevitable ruckus if left to politeness.
    3. Animal Welfare.  The debate in the UK on vegetarianism and fur is light years ahead of Denmark.  This may have been smoke screened by the giraffe incident, where people got upset about lions doing something they do every day, and with animal welfare in mind regarding genetics.  The real story is that Denmark is one of the biggest exporters of fur in the world, as well as the biggest provider of bacon.  Factory farming is prevalent, with techniques such as sow stalls banned in the UK but only just becoming law in Denmark due to EU rulings.
4. Ease of starting a small business.  My wife is starting up as a games designer, but often comes up against what seem to be nonsensical blocks to my capitalist-soaked British brain.  I have often piped up with suggestions that would work in the UK, but not in Denmark.  It covers things like VAT registration, which is obligatory in Denmark for any business(?), whereas in the UK you need a turnover of £70k, and tax credits - Denmark actually seems to encourage staying on welfare.
5. ......Racial Tolerance? Up until I left the UK I would have ranked this over Denmark, being proud of the UK's integration of cultures from around the world.  But now, with the rise of UKIP and the rhetoric I read from over the channel, I question if this value is still part of being British.  I hope so.  The Danes are very protective of being Danish, which for a country of 5.5 million is understandable, but Denmark is also paradoxically suffering from a declining population and shrinking workforce, whilst at the same time making it hard for people outside the EU to become Danish: in 2012, 3,000 people became Danish citizens, compared with 194,000 who became British citizens.  Dual citizenship is not an option (yet) but is being considered, and I will get it if I can.  Citizens of the EU have an easier time, so I've been ok, but from those outside the EU I've heard nightmare scenarios, even with good jobs and fluent Danish.  My personal concern is that if the UK talks itself out of the EU my status may be questioned, which in this age of integration seems a step backwards - surely as the world gets more global, nationalism becomes less and less relevant?

What would you pick, if you had to?  Whether you're an expat or not, I'd be interested - seeing the perception of the differences between Denmark and the UK is as interesting as hearing from those with direct experience of both.

      ]]>
      Mark Edmondson
      tag:markedmondson.me,2013:Post/665609 2014-03-24T15:04:45Z 2014-04-01T18:43:21Z Goodbye, NetBooster

      After 400 weeks, 2800 days, 7.67 years and 3 re-brands: I am moving on.

I've witnessed the company go from a local SEO agency to an international digital agency, and it's been an experience that has changed my life for the better in many ways, and I'd like to thank all the people who made it possible along the way. 

Trying to condense 7.67 years' worth of gratitude into one blog post is an impossible task, but I'd like to at least try to chart my journey below, as every one of these people has helped shape where I am now (very sorry if I leave anyone out!!):

      • August 2007 - join Neutralize as an "Internet Assistant" in the Tolvaddon Energy Park office, Camborne.
      • I really needed a job as I had been in a band for the past year and had no income, so I turned up for work in my flip-flops and in my big brown VW van, which I had to use all my wages to fill with the diesel to get to work.  But I met a team of people working in this strange field of internet marketing which offered almost boundless opportunities.  From there, I'd like to thank Lucy and Janine for hiring me in the first place, Teddie for infusing the enthusiasm and expertise he would offer over the years afterwards, Mark, Ingo, Chris, Andrew, Adam, Stuart, Martin, Nigel and John for making me feel welcome.
      • May 2008 - Neutralize becomes Guava, and we move into the Nordics. Special fun times for the Falmouth carpoolers, Ingo, Luke, Lotte and the legend that is Daz.  New clients, new Google algos, new adventures with Gary, Rachel, Lotte, Paul, the Tom's: Telf, Wigley, Birmingham and Bailey, Hug, Sam, Ugo, George, Will, Clemi, Dan and John - rest in peace. 
• September 2010 - my first visit to Denmark for a 3 month placement, where I first experienced hygge; many thanks to Morten for asking me over, the great SEO team of Kristoffer N and Erik, and to Kristoffer E, Dan, Ann-Sofie, Kasper, Hans Peter, Christian K and P, Karen, Marie-Louise, Sidse and Andreas for making me feel welcome enough that I wanted to come back
• July 2011 - I go part-time to focus on long term projects on my days off, which was one of my best life decisions. I work as Social Media and Analytics Manager in the UK.  I move to Denmark again for a 9 month placement, but not before meeting Amy, Eve, Tim, Matt, Charlie, Sian, Charlotte, Mandy, Hollie, Alan, Lyndsey, Jowita, Emmanuel and Peter.  In Denmark it was great working with Line, Mads, Jens, Hans-Jørgen, Christian, Michaela, Ewa, Martin and Katrine
• March 2012 - Guava becomes NetBooster. I join the NetBooster DNA team as an analytics consultant, headed up by Kristoffer Ewald, and with the world-class experts Christian Pluzek and Dan Pahlen, where I learn lots. Helle, Alun and Mia add to the team that I will miss and made the decision to move very hard.  Not forgetting the French team whom I had the pleasure to work with: Thomas, Pierre, Emmanual, Vania and Jerome.

      I'm sorry if you're not mentioned above, if I have worked with you - I haven't mentioned the cool clients and the international offices, but if you see the list above I hope you can see why. 

Looking back over the list of names I can see that the majority are off on their own various ventures around the world, and I get a sense of pride, like you are members of a family out in the world.  I would very much like to hear from anyone, so do let me know how you're doing if we haven't spoken in a while.  I'll still be around on SoMe, so do get in touch on LinkedIn, Facebook, Twitter or G+ if any of them are your thing.

      My die is cast: I'm off for new adventures; but I hope the people I meet are half as lovely as those I have met.

        ]]>
        Mark Edmondson
        tag:markedmondson.me,2013:Post/657437 2014-02-23T21:13:47Z 2015-02-16T16:38:17Z My Google Analytics Time Series Shiny App (Alpha)

        There are many Google Analytics dashboards like it, but this one is mine:

        My Google Analytics Time Series App

It's a bare-bones framework where I can start to publish publicly some of the R work I have been learning over the past couple of years. 

It takes advantage of an Alpha of Shinyapps, which is a public offering of R Shiny that I love and adore. 

        At the moment the app has just been made to authenticate and show some generic output, but I plan to create a lot more interesting plots/graphs from it in the future.

        How To Use It

        1. You need a Google Analytics account.  
        2. Go to https://mark.shinyapps.io/GA_timeseries/
3. You'll see this screen.  Pardon the overly heavy legal disclaimers, I'm just covering my arse.  I have no intention of using this app to mine data, but others' GA apps might, so I would be wary of giving access to Google Analytics to other webapps, especially now it's possible to add users via the management API.
4. Click the "GA Authentication" link.  It'll take you to the Google account screen, where you say it's ok to use the data (if it is), and copy-paste the token it then displays.
5. This token allows the app (but not me) to process your data.  Go back to the app and paste the token in the box.
        6. Wait about 10 seconds, depending on how many accounts you have in your Google Analytics.
7. Sometimes you may see "Bad Request", which means the app is at fault and the GA call has errored.  Hard reload the page (on Firefox this is SHIFT + RELOAD) and reauthenticate, starting from step 2 above. Sorry.
        8. You should now see a table of your GA Views on the "GA View Table" tab.  You can search and browse the table, and choose the account and profile ID you want to work with via the left hand drop downs. Example using Sanne's Copenhagenish blog:
        9. If you click on "Charts" tab in the middle, you should see some Google Charts of your Visits and PageViews. Just place holders for now.
        10. If you click on the "Forecasts" tab you should see some forecasting of your visits data.  If it doesn't show, make sure the date range to the far left covers 70 days (say 1st Dec 2013 to 20th Feb 2014). 
        11. The Forecast is based on Holt-Winters exponential smoothing to try and model seasonality.  The red line is your actual data, the blue the model's guess including 70 days into the future. The green area is the margin of error to 50% confidence, and the Time axis shows number of months.  To be improved.
        12. Under the forecast model is a decomposition of the visits time series. Top graph is the actual data, second is the trend without seasonal, third graph the 31 data seasonal trend and the forth graph is the random everything else.
        13. In the last "Data Table" tab you can see the top 1000 rows of data.

        That's it for now, but I'll be doing more in the future with some more exciting uses of GA data, including clustering, unsupervised learning, multinomial regression and sexy stuff like that.

        Update 24th Feb

        I've now added a bit of segmentation, with SEO and Referral data available trended, forecasted and decomposed.

        ]]>
        Mark Edmondson
        tag:markedmondson.me,2013:Post/649782 2014-02-03T22:11:20Z 2014-02-04T20:19:03Z Map of the European Migration of Languages

Just a quick note about this nice language migration map found on reddit linguistics (reddit is amazing) at this website

Imagine a world where the greatest technological achievement is the wheel? Mental.  It's why I love playing Civ 5 so much, getting a tiny sense of those times.

Early on, around 1000 BC, we have Proto-Germanic developing in Denmark, which will eventually invade the British Isles twice - as Old Norse, and as the Northmen who settled in France, the Normans.  Any concept of nationality is ridiculous; we are all from everywhere else within a 2000 mile radius.

        Looking at my locales, we can see Brythonic starting around 400 AD, which turned into what we now know as Britons, Cornish and Welsh, which then got over-washed with the Saxons and Old Norse, which then got over-washed with the Normans giving us the mongrel English language today.

        Doesn't help

I wish this helped a bit with learning Danish, but where I often trip up is where it's too close but different - for instance using går ("to walk") interchanged with "go". 

        But there are a lot of Danish grammar rules that are similar to English, something which I guess wouldn't even be close to applying for languages such as Chinese. 

The most alien Danish grammar to me is putting the word "the" at the end of a noun - "huset" meaning "the house", with et = the and hus = house - and even that doesn't apply in Jutland, the closest part of Denmark to Britain.

        Learning another language definitely makes you think more about your own, which is worthwhile. And as they say, each language you learn is like having another soul :D Mine is slowly being built.

        EDIT 4th Feb, 2014

Alun, who comments below, has recommended this Danish Red Book for English speakers learning Danish, as it's written for English people in particular, highlighting where the two languages differ.


        ]]>
        Mark Edmondson
        tag:markedmondson.me,2013:Post/649258 2014-02-02T20:06:42Z 2014-02-02T20:06:43Z Comparison Doesn't Frighten Me - New Music Track by Cem and I

        Here is the first track I'm putting out from my bunker jam project with Cem, ironically called "Comparison Doesn't Frighten Me", as this is blatantly untrue. 

        But it doesn't frighten Richard Feynman or Krishnamurti, who are both sampled in the track extolling their world views.

        Permit me my pretensions on making the track, which perhaps you can read whilst listening:

I often think music is like a thing evolving - it's a thing that can exist as one moment but necessarily takes time to realise.

French Horns are at the start, starting off the Universe like Tolkien's Ainur, before an electronic saw of hydrogen sears its way through the cosmos. 

        Complexity arises, with the organ: people start being born; losing their innocence after the serpent shows the Tree of Good and Evil; comparing their naked and non-naked selves; educating, killing and creating as lamented by Krishnamurti; then the guitar cuts through with its industry up to a drum explosion and climax; until finally Feynman himself brings it home to the heat-death Omega Point on the last piano note.

        I'll release about 5 more tracks over the next few weeks, which will be at http://soundcloud.com/m-edmondson
        ]]>
        Mark Edmondson
        tag:markedmondson.me,2013:Post/646004 2014-01-25T20:48:00Z 2014-01-28T17:00:05Z My First Adventures with the RaspberryPi

        My darling wife got me a RaspberryPi for Christmas, probably because I couldn't stop going on about it.
        Here's one:

The first reaction from most people since has been "What does it do?", to which I have struggled to give an answer... as it can do so much!

First off, it's a hands-on mini computer that can always be on due to low power usage (a 5V supply).  This means possibilities such as a webserver for either the web or your LAN.

Second, it's educational. It comes with basic Debian linux installed, so it's a good way to brush up on those skills, which I have found I needed more of recently, especially as Google Cloud Compute also runs off Debian. It also comes installed with Python and Mathematica (it's only a question of time before I put R on there too)

Third, it has OK graphics for its size, so it could function as a media server, serving up films and music.

Fourth, it has many sensors you can attach to it, such as IR sensors, Bluetooth iBeacons, cameras or voice activated systems. I was thinking of some kind of voice activated gadget system, accessible via a web interface, or hooking it up to Technic Lego and making it part of a robot brain :)

The units cost about £40 each with accessories, which means with some skills you can replicate more expensive gadgets and have fun trying, and many people, once they find one fixed function, buy another to look for more uses. Since I got it I have also bought a 7 Port USB Powered Hub, as the RaspberryPi can't power things such as external hard-drives on its own.

        So far I have put the RaspberryPi next to our Wifi Router so that I can now:

        1. Connect from the web through our building Firewall via reverse SSH tunnelling, by connecting to a Free Tier Amazon linux box and forwarding ports via the always on connection (nefarious applications talked about here)
        2. Setup remote desktop and SSH so I can control the RaspberryPi from my MacBook Air.
        3. Started up an internal LAN homepage, for use in our flat.  I'm hoping my talented web designer wife can make us a web-portal gateway for useful things we may need such as calendars.
        4. Hooked up an external hard drive to create a cheap alternative to Apple's TimeMachine for our MacBooks
        5. Mounted a 32GB USB stick to experiment with network storage using Samba.

        All of which I'd have had no clue about unless I had got the gadget, so this is all WIN for me at the moment :)

        Once these basics are done, I'll consider these next projects:

        • The aforementioned brain for a LegoRobot
        • A timelapse webcam
        • A home greeting system - wave your phone at a sensor, a screen lights up with a personalised homepage
        • A webcrawler gathering data for a specific project
        • Home automation, although will need more gadgets to control...

        Any other ideas?

          ]]>
          Mark Edmondson
          tag:markedmondson.me,2013:Post/645935 2014-01-25T17:45:20Z 2014-01-25T18:04:02Z My Recording Setup In The Bunker

          I have been fortunate to find a place in a studio bunker near Rigshospital in Copenhagen, which is one of the nicest spaces for music I've played in.  Website here, if you are also looking for a music practice room in Copenhagen: http://bedrockmusic.dk/

          A couple of pictures below of in and out:

          They were made during the Cold War, so feature radiation baths, 6 foot of concrete surrounding us and two thick steel doors keeping us in, or people out.  We had a JCB working above us last year, and didn't hear a thing.

With the excellent and fine drummer Cem, we have spent half a day a week-ish recording some original tracks, for our own sanity and amusement.  Our sound lies somewhere between naïve rock and prog indie, and we'll probably never pin it down. 

          Between us we play guitar, bass, drumz, 80s keyboards and laptop synths, and usually one of us drafts a song at home and we attempt to record live parts at the Bunker.

          We record using Ableton Live 9, which improves every iteration and is just easy for me, running on my MacBook Air plus an external HD, which copes fine up to about 20 tracks before I need to start freezing tracks.

          This connects to this sexy USB audio interface from Roland, an excellent Audio and MIDI interface

          We usually feed in with two Rode-NT1As


          ..and a ShureSM57 I found in my Dad's shed for the snare.

          I have no amp here in Denmark yet (shipping for my old one was a lot of money) so at the moment I'm DIing straight into the interface, for some good results.  An amp simulator comes with Ableton 9, and I think for live gigs (which we hope to one day) I may even keep that setup if I need to hide my out of practice playing behind space FX :)

          We have around 6 songs nearly done, and an aim for 2014 is to write one song a month.  I'll publish separate posts with some SoundCloud links (which Ableton9 auto uploads to) to some draft songs soon (scary), but feel free to follow me there too: https://soundcloud.com/m-edmondson

          ]]>
          Mark Edmondson
          tag:markedmondson.me,2013:Post/645901 2014-01-25T16:53:40Z 2016-05-26T12:33:41Z SEO Is So Boring

SEO is so boring, and you think so too, which is why you're reading this post.  Let me validate your feelings, with my personal reasons gathered from being involved in the industry for 8 years. 

          The main problem is that the SEO blogosphere talks about the same things every two years, with the same conclusions. These are:

          1. Paid Links are evil/good.  Actually, Google wouldn't care either way if its algo could surface content without paid links, but until then they use FUD to make SEOs eat each other.  The newish link disavow tool crowdsources this in a marvellous manner.
2. A website starting with M and ending with Z will publish a "revolutionary" SEO tactic that will "transform" the industry, to help justify its subscription to its users.  Those users and other vested interests will post things like "It's fucking amazing!!". Other SEOs will point out that it's crap. The publishers are happy just to be talked about whatever.  If they are lucky, Matt Cutts will comment pointing out that what they say is indeed, crap.
          3. A Big Brand will be penalised for some SEO tactic.  They will come back again in a fairly short time, much shorter than if it happened to your website, for example. This will be due to them spending lots on AdWords, despite Google public denials. Outrage.  Google penalties are political, deal with it.
          4. SEO is dead.  People confuse an SEO tactic with SEO.  Google discount one method due to spammers taking the piss - see guest blogging, infographics, directories etc. Those SEO's and non-SEO's who relied on that tactic, mostly link building to paper over unoptimised websites, find they have no more ideas, and decry SEO's death.
          5. Rebranding of SEO. Every so often, SEO will have its name changed by industry leaders, to try and disassociate with the above.  There will be discussion on why, how and what anyone cares other than the company trying to own the new keyword space.

Another major problem is that every SEO blogger/consultant/agency will at some point decide to run a content campaign as "content is good for SEO".  This means a proliferation of half-arsed reheating of SEO content, which ranges from paraphrasing Google help files to program manuals with "for SEO" tacked on the end - "Excel2012 for SEO", "Using Twitter for SEO" etc., or perhaps it's just the old standard X number of ways to do Y.  Bite-sized content designed for amateurs, written by the unqualified, since those who have time to maintain a heavy schedule of SEO publishing don't have enough time to do actual SEO.  The best SEOs I've met hardly had time to tweet once a week. 

Finally, for a lot of companies that need SEO help, even these days it's still the fundamentals that need looking at - title tags, duplicate content etc. - which for very large companies can be a nightmare to correct. A lot of SEO opinions on the web work fine if you're running a Wordpress blog, but once it gets to a certain level SEO is mainly about prioritisation - what things should you concentrate on to get the most impact on bottom line revenue? 99% of the time it's not going to be some secret SEO tactic, but getting an SEO fundamental correct, and it's very rare this prioritisation is talked about - there isn't much more to say.

          Don't be so negative

          Ok. 

There are some interesting developments fuelled by search engines, mainly Google, which the SEO industry feeds off for its food scraps - another source of resentment, it seems, for some SEO bloggers. 

          SEO for non-Google is interesting.  Yandex and Baidu have different models and philosophies, and optimising for the new searches in say AppStores, LinkedIn or Facebook offers new avenues.  

Google's move away from the top 10 results search page towards its mission to be the Star Trek computer is exciting, and services like Google Now, Google Glass and semantic technology combining into the Internet of Things sound like SEOs will become more like data curators than data manipulators.  

          Likewise the move towards treating SEO holistically as part of a user journey, rather than a last touch channel, holds interest from an analytics viewpoint.

          I don't mean to change anything with this post, and am probably contributing to the problem putting it out there, but at least I will have something to point to in the future when asked about the latest SEO fad.

          ]]>
          Mark Edmondson
          tag:markedmondson.me,2013:Post/645890 2014-01-25T14:05:25Z 2014-07-30T16:35:09Z Copenhagen Gable Murals

          When I first started to cycle around Copenhagen, I noticed a lot of the gable walls held spectacular murals as public art.  In the UK I only saw this occasionally, in the more art-prone cities such as Bristol or Brighton, and I can't recall one at all in Falmouth, probably due to conservation orders.  


          I started snapping away at these murals with the aim of one day doing a collection, which I'll start publishing here, hopefully with a little blurb and map link to where it is. I’ll also replace some of the pictures as I get better shots, but there are also loads I haven't yet got.



          ]]>
          Mark Edmondson
          tag:markedmondson.me,2013:Post/645882 2014-01-25T13:04:09Z 2017-03-05T18:57:43Z My New Blog Home

          I've decided to start up a new blog at markedmondson.me

          It will be about everything I am up to, and I think worth writing down. 

          I'll write it primarily for my friends, family and colleagues in the industry.

          I'm experimenting with merging my business and home-life personae on one platform.  Traditionally I've separated these out, such as my work (@MarkeD_NB) and home (@MarkedAtHome) twitter accounts, but I want a more rounded presentation of myself here.  

          So, my dear readers, some of these posts may not be very interesting, but perhaps some will.  It'll probably cover topics such as:

• Life in Denmark for an English ex-pat.  I think it best if I practice writing these in Danish, fordi det er god for øve mig (because it's good for me to practise).
• What's it like to work with digital analytics in a European agency?  It's pretty exciting, for me at least.
          • I will shyly share my music endeavours, when I get them to a state they are 90% finished.  (All my songs end up being 90% finished)
• I'll comment on SEO/Analytics trends, if it's not been repeated ad nauseam on other blogs.
• I'm doing a lot more programming recently, in particular R, Python and JavaScript.  Machine Learning is a keen interest at the moment, and I'm considering entering Kaggle competitions.
          • I'll curate links to web pages around the web that cover my interests, ranging from history to gaming.
          • I'll probably have a gadget section, as I've recently acknowledged my gadget addiction. Most recently this covers the Adventures of the Raspberry Pi.
          • I keep up to date with our boundaries of science, in particular Cosmology and meta-physics.  I'll probably embarrass myself with pseudo-science posts. 
          • I may write some personal for family posts, under password protection.

          I had a blog before running on Posterous, but it got bought by Twitter and then closed down.  This blogging platform, PostHaven, is run by the same creators and holds a lot of the same features I liked in Posterous, but promises to be permanent by charging $5 a month, so will never seek to be bought or hold advertising. 

          This appeals to me - perhaps the content that published here I will be able to read in many years time, and I can marvel at how much, or little, I have changed.  Or, maybe AI robots in the future will be created by distilling a person's social media activity to make us immortal, and this can be the source for FutureMarkAI 3014!

          Below is a picture of me in Cornish Kilt at my brother's wedding.  Just for the record. I don't often wear kilts, not even Cornish ones.






          ]]>
          Mark Edmondson