Run R, RStudio and OpenCPU on Google Compute Engine [free VM image]

File this under "what I wished was on the web whilst trying to do this myself."

edit 20th November, 2016 - now everything in this post is abstracted away and available in the googleComputeEngineR package - I would say it's a lot easier to use that. Here is a post on getting started with it: http://code.markedmondson.me/launch-rstudio-server-google-cloud-in-two-lines-r/

edit 30th April, 2016: I now have a new post up on how to install RStudio Server on Google Compute Engine using Docker, which is a better way to do it. 

edit 30th Nov, 2015: Oscar explains why some users couldn't use their username

edit 5th October: Added how to login, add users and migrated from gcutil to gcloud

Google Compute Engine is a very scalable and quick alternative to Amazon Web Services, but it offers fewer ready-made images for users.

If you would like to have a VM with R 3.01, RStudio Server 0.98 and OpenCPU installed, then you can click on the link below, and install a pre-configured version for you to build upon.

With this image, you have a cloud server with the most popular R / Cloud interfaces available, which you can use to apply statistics, machine learning or other R applications on web APIs.  It is a fundamental building block for a lot of my projects.

The VM image is here. [940.39MB]

To use, follow these steps:

Downloading the instance and uploading to your project

  1. Create your own Google Cloud Compute project if you haven't one already.
  2. Put in billing details.  Here are the prices you'll pay for running the machine. It's usually under $10 a month.
  3. Download the image from the link above (and here) and then upload it to your own project's Cloud Storage. Details here
  4. Add the uploaded image to your project with a name made up only of lowercase letters, numbers and hyphens (-).  Details here. You can do this using gcloud and typing:
$ gcloud compute images create IMAGE_NAME --source-uri URI
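Steps 3 and 4 can be sketched from the command line roughly as follows. This is an outline under assumptions rather than a tested recipe: the function names are made up for illustration, the bucket and file names you pass in are your own, and the name check follows the pattern gcloud reports when it rejects an image name (a lowercase letter first, then lowercase letters, digits or hyphens, with no trailing hyphen). The second function only prints the commands so you can review them before running anything.

```shell
# Check a candidate image name: lowercase letter first, then lowercase
# letters, digits or hyphens, and no trailing hyphen (63 characters at most).
valid_image_name() {
  printf '%s' "$1" | grep -Eq '^[a-z]([-a-z0-9]{0,61}[a-z0-9])?$'
}

# Print (not run) the upload and import commands for a downloaded image file.
# Arguments: image name, local image file, Cloud Storage bucket name.
image_import_cmds() {
  valid_image_name "$1" || { echo "invalid image name: $1" >&2; return 1; }
  echo "gsutil mb gs://$3"
  echo "gsutil cp $2 gs://$3/"
  echo "gcloud compute images create $1 --source-uri gs://$3/$(basename "$2")"
}
```

For example, `image_import_cmds r-studio-opencpu20140628 ./image.tar.gz my-r-images` prints three commands to copy and run, where `image.tar.gz` and `my-r-images` are placeholder names.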

Creating the new Instance

  1. Now go to Google Compute Engine, and select Create New Instance
  2. Select the zone and machine type you want (e.g. you can temporarily select a 50GB RAM machine if needed for big jobs)
  3. In the dropdown for images you should be able to see the image from step 4 above.  Here is a screenshot of how it should look; I called my image "r-studio-opencpu20140628"

Or, if you prefer using command line, you can do the steps above in one command with gcloud like this:

$ gcloud compute instances create INSTANCE [INSTANCE ...] --image IMAGE
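As a slightly fuller sketch of that one-liner, the helper below prints a gcloud command with the zone and machine type spelled out, so it can be checked before it is run. The helper name, the instance name `r-server`, the zone `europe-west1-b` and the machine type `n1-standard-1` are example values only, not anything the image requires.

```shell
# Print a `gcloud compute instances create` command for review.
# Arguments: instance name, image name, zone, machine type.
instance_create_cmd() {
  echo "gcloud compute instances create $1 --image $2 --zone $3 --machine-type $4"
}

# Example values only; substitute your own image name and preferred zone.
instance_create_cmd r-server r-studio-opencpu20140628 europe-west1-b n1-standard-1
```

Once the printed command looks right, paste it into your terminal to create the instance.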

Using your instance

You should now have RStudio running on http://your-ip-address/rstudio/ and openCPU running on http://your-ip-address/ocpu/test and a welcome homepage running at the root http://your-ip-address
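A quick way to confirm all three are up is to request each URL and look at the HTTP status. The sketch below assumes `curl` is installed on your own machine and that HTTP traffic to the instance is allowed by your project's firewall; the function names are made up for illustration, and the argument is the external IP address of your instance.

```shell
# Print the three URLs the image serves for a given external IP.
service_urls() {
  echo "http://$1/"
  echo "http://$1/rstudio/"
  echo "http://$1/ocpu/test"
}

# Request each URL and print its HTTP status code next to it.
check_services() {
  service_urls "$1" | while read -r url; do
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url") || true
    echo "$code $url"
  done
}
```

Run it as `check_services your-ip-address`. A 2xx or 3xx status next to a URL suggests that service is answering; if nothing responds at all, the firewall or Apache is the first place to look.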

To log in, use your Google username: as the creator of the Google Cloud project you are an admin. See here for adding users to Google Cloud projects

If you don't know your username, try this command using gcloud to see your user details:

$ gcloud auth login

Any user you add to Debian on the instance can also log in to RStudio. To log into Debian and add new users, see below:

$ ## ssh into the running instance
$ gcloud compute ssh <your-username>@new-instance-name
$ #### It should now tell you that you are logged into your instance #####
$ #### Once logged in, add a user: example with jsmith
$ sudo useradd jsmith
$ sudo passwd jsmith
$ ## give the new user a directory and change ownership to them
$ sudo mkdir /home/jsmith
$ sudo chown jsmith:users /home/jsmith
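If you prefer fewer steps, `useradd -m` creates the home directory at the same time as the account, and `-g users` sets the primary group so the directory ends up owned by the user and the `users` group. The helper below is only a sketch with a made-up name: it prints the commands for a given username (`jsmith` is the same example name as above), so nothing changes until you run them yourself on the instance.

```shell
# Print the commands to add a Debian user with a home directory.
# -m creates /home/<name>; -g users sets the primary group to "users".
add_user_cmds() {
  echo "sudo useradd -m -g users $1"
  echo "sudo passwd $1"
}

add_user_cmds jsmith
```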

Oscar in the comments below also explains why sometimes your username may not work:

Like other comments, my username did not work.

Rather than creating a new user, you may need to simply add a password to your user account:

$ sudo passwd <your-username>

Also, the username will be your email address with the '.' replaced with '_'. So xx.yy@gmail.com became xx_yy
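Going by that description, the Debian username can be guessed from the email address by dropping the domain and swapping dots for underscores. The function name below is made up for illustration, and the rule is only what was observed in the comment, not documented behaviour.

```shell
# Guess the instance username from a Google account email:
# keep the part before "@" and replace each "." with "_".
guess_username() {
  printf '%s\n' "${1%%@*}" | tr '.' '_'
}

guess_username xx.yy@gmail.com   # prints xx_yy
```

To see the actual name rather than a guess, run `whoami` once you have an SSH session on the instance.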

You may also want to remove my default user the image comes with:

$ sudo userdel markedmondson

...and remove my folder:

$ sudo rm -rf /home/markedmondson

The configuration used

If you would like to look before you leap, or prefer to install this yourself, a recipe is below. It largely cobbles together the instructions around the web supplied by these sources:

Many thanks to them.

It covers installation on the Debian Wheezy images available on GCE, with the necessary backports:








36 responses
Thanks for providing this VM image! I installed successfully but it asks for a user login, which I don't know. Do you mind sharing the login info as well?
Dear LittleBoat, that is a good point about the login. RStudio takes the username/password of whoever is a user on Debian, and as an owner of the Google project you should be able to add any users to Debian ( https://cloud.google.com/compute/docs/access ). This needs checking though, so I will add it to the post and if you could check it works for you it would be appreciated!
Hi Mark, Yes, it works! Or we can just ssh into it, and then just type in command "sudo adduser [type in your username]" Thanks!
Great, thanks for checking, I was just typing out a reply with the steps, which I'll add to the post now :)
Thanks for putting the work into the documentation. I'm sure I'll find it useful once I get it up and running. For the moment, I'm stuck on creating an image. I've had to make my bucket through the developers console, and am trying to make an image through the console as well (error otherwise, pasted below). Trying to make an image through the console, I get the message "Required 'read' permission for 'rawDisk.source'". Thoughts? Command line error message for image creation: NAME PROJECT ALIAS DEPRECATED STATUS ERROR: (gcloud.compute.images.create) Some requests did not succeed: - Invalid value 'RComputeTest'. Values must match the following regular expression: '(?:(?:[-a-z0-9]{1,63}\.)*(?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?):)?(?:[0-9]{1,19}|(?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?))'
Dear dp, I think it is because you need an image name that is only lowercase and with hyphens. "RComputeTest" fails that. It is a little fussy.
Thank you for the post! Very useful. Everything works fine except for accessing through http://ip-address Fortunately though Google Developers Console lets you access the instance through a browser window.
Update: looking some more into it, it all works fine, Rstudio is up and running through the ip-address link. Great post!
Dear PP glad you got it working. Just in case others have the same issue, the IP should be the external one for your instance as listed by the instructions here: https://cloud.google.com/compute/docs/instances... Did you need to configure the firewall to accept http requests?
Hello, great instruction, thanks a mill. I get to the Rstudio login page, but try as I might I cannot log in. I tried my google username (email) and password, and a couple of other things, but Rstudio just says invalid username / password. Any help advice would be welcome?
Hi AndyC, glad you are nearly there :) Have you tried the steps above using gcloud? Someone above in the comments had similar trouble so I added it to the post a couple of months ago. The username is the one that Debian uses, which defaults to the Google alias used to create the Google Cloud project. If all else fails, if you download the "gcloud" tool, you can type this in terminal: $ gcloud auth login ...and that will list the names you are authenticated under.
Thanks. Just followed instructions and wanted to let you know that it worked as promised. This was with no prior experience on GCE, so much appreciated.
Great Kripa, thanks for the lovely comment. Drop back and let us know what you do with it if you get the chance.
Thanks a lot Mark, extremely helpful, I have my rstudio up and running!
Rocker rocks! Thanks a million for writing up this post. :-)
Hi Mark! Thanks for the tutorial and the image. I am facing a similar problem to AndyC: basically, when I type my username and password on the RStudio login page I get an error "incorrect or invalid username/password". This is odd as the username should be the email address which I used to create my project, with its corresponding password. Furthermore I have typed gcloud auth login and it basically opens my browser and asks me to click on the Google account I want to log in with. In the terminal I get prompted "You are now logged in as [@gmail.com]". Any idea why RStudio is not recognizing my username and/or password?
Hi Mario, it will not be your email but the name you have associated with your Google account (perhaps "Mario")? Also try the gcloud command to see what your username is saved as, you should have that saved somewhere. ("gcloud auth list" ?). Let me know what works and I'll edit it into the post!
I have checked using gcloud auth list and whoami within the ssh session and in both cases it appears as mariohinojosa. I have tried that username and the password I use to ssh ( the password associated with my account), unfortunately I still get the error. What is very interesting is the fact that if I create an additional user (apart from the default mariohinojosa and markedmondson) in my instance I am able to log into Rstudio...
OK, that's odd. Perhaps I should add to the how-to that you should make a new user. But at least you are now able to log in?
yes, I was able to log in using the new user and its corresponding password
nicely done!!
Hi Mark, thanks a bunch for introducing me to the world of cloud computing! I've followed your instructions and got Rstudio running well, but when I try to upload a csv from my computer's hard drive I get this error: "Unexpected response from server" Is it not possible to upload files directly from one's computer? I've also uploaded the csv to a bucket in the cloud, but I couldn't figure out how to link it to Rstudio.
Dear Isak, it is possible to upload files, I do it all the time. But this is a new one for me, I would try looking through RStudio server help, which when I browsed came up with this thread: https://groups.google.com/forum/#!msg/shiny-dis... - before posting in there I'd check to see if it's not a very big file, or that perhaps it has non-standard UTF-8 characters in the filename. It sounds like a permissions problem, but out of the box the image should have that all set up.
Thanks Mark! I think it has to do with the size. The CSV is about 70 MB, and when I copy parts of the content to a smaller file it does work.
Any idea how I can upload large files? Right now I was able to upload a file of less than 1 mb, but not one that was 4.2 mb
I upload much larger files, so that is pretty strange. Are you uploading a .zip file? Perhaps it's a browser issue. But I would ask the question at the link above for the Google group, they will have more of an idea.
Thanks, I'll do that. The filetype is not zip but csv or txt, so yes, it's a bit strange!
Great tutorial. There are a few things that could make it easier: 1) Uploading the image was not clear to me. In the Console, navigate to Storage. Create a new 'bucket' and upload the image there. 2) Like other comments, my username did not work. Rather than creating a new user, you may need to simply add a password to your user account $ sudo passwd <your-username> For me, it was my email address with the '.' replaced with '_'. So xx.yy@gmail.com became xx_yy 3) You may also want to remove users :) $ sudo userdel markedmondson and remove their folder $ sudo rm -rf /home/markedmondson 4) RStudio upload was not working for large files. As far as I can tell, FTP is the only practical option. For me, the SSH key was generated using the username from my computer, which was different from the one I had previously created. This meant I didn't have permission to FTP data into another user account. The easiest fix was to add a new password (as above) for this new (FTP-created) user, so both the FTP connection and RStudio could have read/write file permissions. 5) The whole point of GCE RStudio for me was so I could use more than the 32GB of RAM my PC had. This is expensive to run and it can take some time to upload data and get R scripts ready to run. To save cost, I suggest making a small instance (e.g. 1 vCPU and 2GB RAM, but with sufficient hard drive space), uploading all the data and getting it ready. Then take a snapshot. This snapshot can then be used to create a VM with more resources, e.g. 208GB.
Great comments, thanks Oscar - the username behaviour explains why it was a problem for some but not others. This guide is getting a little out of date and needs a refresh: the UI has been refreshed in the Google Developer Console; and it's a bit simpler to use a docker GCE image instead. I can upload quite large files, are you talking about GBs when you talk about uploads? I hope you are not having the same issue as Isak above, where he can't upload 4.2MB - I don't know why that is happening.
No worries. I think the guide is still very helpful. I couldn't find any easier ones to follow (and I tried, after my initial frustration with 'buckets' and finding it hard to upload the image you provided). It is clear that the UI has changed because many of Google's help files are now quite difficult to follow. Following your comment I did have a quick look at docker GCE, but couldn't find an easy to follow guide. Not entirely sure about uploads. It certainly worked for small (
Hello Mark, Thanks so much for your image. I have created a Google Compute Engine instance with it and am finding that I am unable to access R-studio. The only difference is, I used an East-Asian zone. I have checked to ensure port 8787 is enabled and R is installed. Still, when I try to access R-studio using the Google provided link ("Open browser on custom port" and give 8787), I get the following error message: "We are unable to connect to the VM on port 8787. Learn more about possible causes of this issue.". Can you please help? Thanks much again, Vedha.
Hi Vedha, RStudio should be running at http://your.ip.number/rstudio/ so you shouldn't need the 8787 port. What can you see if you go to the home address? http://your.ip.number ? If nothing then perhaps Apache needs restarting. Or if you do want it on port 8787 without Apache then you can open the 8787 port in the Google Compute engine interface under "Firewalls" Let me know if it works! Mark
Mark, It works now. Thanks. I was trying to use the custom port link to access it. My bad. Thanks for your immediate response. Vedha.
Dear Mark, thank you so much! You have literally saved my day with this post!!! I also tried the Docker article, but the procedure there was quite a lot involved. Would it be possible to make that one simpler? With kind thanks, Jaan
Hi Jaan, glad it helped. I guess this one is simpler as it's all been pre-configured but I would recommend the docker one once you're comfortable as you'll have more control :) I'm working from the Docker one myself and yes will try to simplify the guide as I go. But I guess I should keep this alternative up given your comment :)
Thank you very much!