Automating Google Search Console search analytics data downloads with R and searchConsoleR

Yesterday I published version 0.1 of searchConsoleR, a package that interacts with Google Search Console (formerly Google Webmaster Tools) and in particular its search analytics.

I'm excited about the possibilities of this package, as this new, improved data can now interact with the thousands of other R packages.

If you'd like to see searchConsoleR's capabilities, I have the package running in an interactive demo here (very bare bones, but it should demo the data well enough).

The first application I'll talk about in this post is archiving data into a .csv file, but expect more guides to come, in particular combining this data with Google Analytics.

Automatic search analytics data downloads

The 90-day limit still applies to the search analytics data, so one of the first applications should be archiving that data so you can track year-on-year, month-on-month and the general development of your SEO rankings.

The below R script:

  1. Downloads and installs the searchConsoleR package if it isn't installed already.
  2. Lets you set the parameters for the data you want to download.
  3. Downloads the data via the search_analytics function.
  4. Writes it to a .csv file in the same folder the script is run in.
  5. The .csv file can be opened in Excel or similar.
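For reference, here is a minimal sketch of such a script. The site URL, dimensions and file name are placeholders to swap for your own, and the argument names are my best recollection of search_analytics(), so check ?search_analytics against your installed version:

```r
# Minimal sketch of a daily download script - swap in your own site URL,
# dimensions and output folder.

# 1. Install searchConsoleR if it isn't installed already
if (!require("searchConsoleR")) install.packages("searchConsoleR")
library(searchConsoleR)

# 2. Parameters for the download
website    <- "http://www.example.com"   # placeholder: your verified property
start_date <- Sys.Date() - 93            # roughly the 90-day window
end_date   <- Sys.Date() - 3             # the most recent days are incomplete
dims       <- c("date", "query")

# Authenticate (the first run opens a browser, afterwards the token auto-refreshes)
scr_auth()

# 3. Download the data
sc_data <- search_analytics(siteURL    = website,
                            startDate  = start_date,
                            endDate    = end_date,
                            dimensions = dims)

# 4. Write it to a dated .csv in the working directory
write.csv(sc_data, paste0("search_analytics_", Sys.Date(), ".csv"),
          row.names = FALSE)
```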

This should give you nice juicy data.

Considerations

The first time, you will need to run scr_auth() yourself so you can give the package access, but afterwards it will auto-refresh the authentication each time you run the script.

If you ever need a new user to be authenticated, run scr_auth(new_user = TRUE).
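In practice the authentication part of the script looks something like this sketch:

```r
library(searchConsoleR)

scr_auth()                  # first run: opens a browser so you can grant access;
                            # afterwards the cached token is refreshed automatically

scr_auth(new_user = TRUE)   # run this instead if you need to authenticate a new user
```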

You may want to modify the script so it appends to one file rather than producing a daily dump, although I prefer the daily dumps: I keep a folder of .csv's and import them all into one R dataframe (which you could export again as one big .csv).
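If you go the folder-of-.csv's route, a sketch like this will read them back into one data frame (assuming the daily dumps sit in an ./archive folder, which is a made-up path, and all share the same columns):

```r
# read every daily .csv dump and bind them into one data frame
csv_files <- list.files("./archive", pattern = "\\.csv$", full.names = TRUE)
all_data  <- do.call(rbind, lapply(csv_files, read.csv, stringsAsFactors = FALSE))

# ...which you could export again as one big .csv
write.csv(all_data, "search_analytics_archive.csv", row.names = FALSE)
```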

Automation

You can now take the download script and run it daily as an automated batch job.

In Windows, this can be done like this (adapted from Stack Overflow):

  • Open the Scheduler: START -> All Programs -> Accessories -> System Tools -> Scheduler
  • Create a new Task
  • Under the Actions tab, create a new action
  • Choose "Start a program"
  • Browse to Rscript.exe, which should be found somewhere like:
    "C:\Program Files\R\R-3.2.0\bin\x64\Rscript.exe"
  • Input the name of your R script in the parameters field
  • Input the path where the script is to be found in the "Start in" field
  • Go to the Triggers tab
  • Create a new trigger
  • Choose whether the task should run each day, month, etc., repeated several times, or whatever you like

In Linux, you can probably work it out yourself :)
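For example, a crontab entry along these lines (with your own paths substituted for these placeholders) would run the download script every morning at 7am:

```
0 7 * * * /usr/bin/Rscript /home/youruser/scripts/search_console_download.R
```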

Conclusion

Hopefully this shows how, with a few lines of R, you can get access to this data set. I'll be doing more posts in the future using this package, so if you have any feedback let me know and I may be able to post about it. If you find any bugs or have features you would like, please also report an issue on the searchConsoleR issues page on GitHub.

Responses
Thanks a lot for developing searchConsoleR! Maybe I'm missing something: if I run a query with all the dimensions available ('date','country','device','page','query') over the last 3 days it works fine and quickly. But if I extend the date range to 4 or more days I get this error: "Error in lookupCountryCode(dimensionCols$country) : country.code not in country.codes." Thanks in advance!
Dear Federico, thanks a lot for trying the library :) That sounds like a bug, but could you put it on GitHub as an issue with an example of your code (you can leave off the website URL)? Then I can try to reproduce and fix it if needed. The GitHub issue tracker is here: https://github.com/MarkEdmondson1234/searchCons...
Also, the error message should say something like: "country.code not in country.codes. Got: BLAH". If you could tell me what BLAH is (if anything) that would help.
Yes Mark, you're right! This is BLAH: Got: ITAITAITAITAGBRRUSITARUSITADEUITADEUITAITAITARUSITACHEITARUSITAUKRITARUSRUSUKRITAITAUKRRUSITARUSCXXUKRITADEUITA
Thanks Federico, as mentioned on GitHub this was a bug involving strange countries being returned by the API, but the GitHub version is now more tolerant and won't error: https://github.com/MarkEdmondson1234/searchCons... This will be in the next release to CRAN in a couple of weeks.
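If you want the fix before the CRAN release, the development version can be installed from GitHub, for example with a one-liner like this (a sketch, assuming you already have the devtools package installed):

```r
# install the development version of searchConsoleR from GitHub
devtools::install_github("MarkEdmondson1234/searchConsoleR")
```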
Hey Mark, awesome job creating this package. Thanks very much for this. I'll become a daily user of this package, as it really solves a need that I had. I had even planned to do some web scraping of the Google Search Console, but now it's solved. Thanks again!
Thanks Luis! An update will be coming out soon too, which vastly speeds up big downloads if you need it.
Hello, thanks for the script, but I got this error and I don't know how to fix it: JSON fetch error: User does not have sufficient permission for site 'https://www.domain.com/'. See also: https://support.google.com/webmasters/answer/24.... In addition: Warning messages: 1: In retryRequest(do.call(request_type, args = arg_list, envir = asNamespace("httr"))) : Request Status Code: 403 2: In retryRequest(do.call(request_type, args = arg_list, envir = asNamespace("httr"))) : JSON fetch error: User does not have sufficient permission for site 'https://www.domain.com/'. See also: https://support.google.com/webmasters/answer/24.... 3: In retryRequest(do.call(request_type, args = arg_list, envir = asNamespace("httr"))) : No retry attempted: User does not have sufficient permission for site 'https://www.domain.com/'. See also: https://support.google.com/webmasters/answer/24.... The user has full permission and I created the API as mentioned here: http://www.ryanpraski.com/google-search-console... Can you please tell me how I can fix it?
Hi Mike, I only get that error when I have a typo in my domain name. I would triple-check you have the right domain, including whether it's www or non-www and http vs https, and of course check that the account you are using to authenticate has access - it's the account you used to generate the JSON file or OAuth2 file.
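If in doubt, a quick way to see which properties the authenticated account can actually access is list_websites() from searchConsoleR, so you can copy the site URL exactly as the API reports it (a small sketch):

```r
library(searchConsoleR)
scr_auth()

# list every Search Console property the authenticated account can see
list_websites()
```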
Hello Mark, thanks for the fast reply. I fixed the problem by adding the client_email from the JSON file to the authorised users in Webmaster Tools with full permission; the email is google@xxxxx.iam.gserviceaccount.com. But the .csv file has another problem - the data isn't clear, for example:
"date","query","clicks","impressions","ctr","position"
2016-03-12," ",6,46,0.130434782608696,4.30434782608696
2016-03-12,"site name",2,3,0.666666666666667,1
2016-03-12," ",2,3,0.666666666666667,1
How can I fix this problem?
I forgot to mention that the search terms are in the Arabic language. Is it possible that Arabic is not supported, and if so, how could I fix the problem? And the last question: how could I edit the code to set the start and the end date? I want it, for example, to cover more than a week. Is it possible to do this? Thanks in advance.
Hi Mike, it's better to ask for help on GitHub as it's easier to keep track, at http://www.github.com/MarkEdmondson1234/searchC... I don't understand what the problem is in your first comment, but the API supports Arabic no problem, I use it myself for that. I suspect it may be a matter of what you are opening the .csv with - Excel, for example, needs to support that encoding.
Hello and thanks for your package. I have a question about its utility: if I have understood it, it lets you connect to your Google Search Console and fetch SEO information about your website, but we can do this directly by connecting to Google Search Console and downloading the data as a csv file. Please can you explain more about the utility of this package? Thanks in advance for sharing your knowledge.
Hi Seldata, well it lets you download the data programmatically without logging in manually, so I use it in an automated nightly script to keep an archive, for example. It's a massive time-saver, and lets you do things that would be too much hassle normally. It is also more reproducible, as manual exports may not have the exact same filter or date range etc. and you wouldn't know if the data is different, whereas in a script you can see exactly what you are getting.
Hi Mark, and thanks for your reply. I will explain more about what I need to know: I work as an analyst, and I am looking at how I can help the SEO group in their work by using your package. If I have understood your response, this package makes an automatic csv download and accumulates information with the same metrics. What do you think - should I create a Shiny application? Any proposition is welcome, and thank you very much for your time.
Hi Seldata, sounds like a good start, although the script can run on its own server and add the data to your own database (I use BigQuery); then perhaps a Shiny app so your end users can see the archived data easily, with a UI similar to the Search Console web interface.
Hello Mark, thanks for your reply, you are helpful. On the one hand, I have understood that I can run the R script daily with some metrics and store the cumulative data in my database. On the other hand, I'm new to Shiny; in your interactive demo here (consoleR), there is no user interface - is this difficult to program? Thank you very much for the effort and for sharing your knowledge. Best regards
Hi Seldata, I think Shiny is more for advanced R users; I would start with offline analysis first, then go through the tutorials at http://shiny.rstudio.com/ to build the UI once you are more comfortable with R in general. Perhaps RMarkdown would be a good step before Shiny - you can make a nice UI without interactivity: http://rmarkdown.rstudio.com/
Hi Mark, really appreciate this post. Always great to see R projects that are applicable to SEO. Storing the GSC data in CSVs completely makes sense. Piggy-backing off of what Seldata said above, just checking to see if at any point you might have published instructions for building the Shiny app above and how it connects to what I assume is the BigQuery database that you mentioned above. I know that's a lot to ask, but thought I'd check in order to get a head start! Thanks! Jeff
Hi Jeff, thanks for commenting :) While not specifically about search console data, this post covers some of the steps needed, including uploading data to BigQuery http://code.markedmondson.me/digital-analytics-...
Thanks Mark -- the Shiny theme in this example looks terrific. Appreciate the direction.