Running R and Data Science in the Cloud Part III
The previous two posts focused on getting a server up and running and setting it up with user access. In this post, we finally install R and RStudio Server on our instance and configure the application. Pay special attention to the note about packages at the end of this post; it will save you a good number of headaches. Let’s go!
Installing R on the server
Linux already has many predefined repositories, and we could install R directly from the repositories that DigitalOcean kindly sets up for us. However, these repositories are often out of date, which would leave you with an outdated version of R. To get around that, we’ll modify one of the config files on our server and add a repository listed here. To do that, we run the following command in our console (assuming we are already connected to our server).
sudo nano /etc/apt/sources.list
The command uses the “nano” editor to edit a text file called sources.list, which lists all of the repositories used by our Linux server. The sudo command ensures that we run it as a super user (sources.list cannot be modified by regular users). In the sources.list file, use the up/down arrow keys to navigate to the end of the file, and add a line similar to the one below, then save with Ctrl+O and exit with Ctrl+X. (I’m using the UCLA CRAN mirror because it’s closest to my physical location. Ideally, you’d want to use the mirror closest to your server’s location.)
deb http://cran.stat.ucla.edu/bin/linux/ubuntu trusty/
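If you would rather not edit the file by hand, the same change can be scripted. The following is a minimal sketch and not part of the original steps: add_repo_line is a hypothetical helper name, and the SOURCES_FILE variable is an assumption that defaults to a scratch file so the demo does not touch the system. It appends the line only when it is not already present, so running it twice does no harm.

```shell
#!/bin/sh
# add_repo_line is a hypothetical helper (not part of apt).
# It appends a line to a file only if that exact line is not already in it.
add_repo_line() {
    file="$1"
    line="$2"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Assumption: default to a temporary scratch file so this demo is harmless.
SOURCES_FILE="${SOURCES_FILE:-$(mktemp)}"

# Called twice on purpose: the second call finds the line and adds nothing.
add_repo_line "$SOURCES_FILE" "deb http://cran.stat.ucla.edu/bin/linux/ubuntu trusty/"
add_repo_line "$SOURCES_FILE" "deb http://cran.stat.ucla.edu/bin/linux/ubuntu trusty/"

cat "$SOURCES_FILE"
```

On the real server, you would run the script as root with SOURCES_FILE set to /etc/apt/sources.list, and swap in the mirror closest to your server.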
Next up, we run a quick update on our system to ensure that we are up to date:
sudo apt-get update
Lastly, we install the r-base package:
sudo apt-get install r-base
To test whether R is up and running on our system, just type R at the command line and hit enter. R should start and show its own command prompt.
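For a check that does not open an interactive session, something like the sketch below could be used. The helper name report_command is hypothetical; it relies only on the shell built-in command -v, which reports where a program lives on the PATH.

```shell
#!/bin/sh
# report_command is a hypothetical helper: it reports whether a program
# can be found on the PATH, without starting it.
report_command() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found on PATH"
    fi
}

report_command R
report_command Rscript
```

If R is reported as found, running R --version prints the installed version, which is a quick way to confirm that you did not end up with an outdated release.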
Installing RStudio Server
To download and install RStudio Server on our instance, we’ll first download the file using wget, then install the software using gdebi. Run the following commands sequentially, pressing the return key after each one (the last command expects the downloaded .deb file to be in the current directory):
sudo apt-get install gdebi-core
sudo apt-get install libapparmor1
sudo gdebi rstudio-server-0.98.1103-amd64.deb
RStudio Server should now be installed on your server. To access the web interface, simply point your browser to http://IPADDRESS:8787. In my case, it would be http://22.214.171.124:8787. You should be greeted with a shiny login screen. Your credentials will be the same username/password that you used to log into your server.
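If the login screen does not appear, it can help to check from a terminal whether anything is answering on that port. The sketch below makes some assumptions: rstudio_url and probe_rstudio are hypothetical helper names, curl must be installed on the machine you run it from, and IPADDRESS is a placeholder for your own server’s address.

```shell
#!/bin/sh
# rstudio_url is a hypothetical helper: it builds the address of the web
# interface from a host and an optional port.
rstudio_url() {
    host="$1"
    port="${2:-8787}"   # 8787 is the port used in this post
    printf 'http://%s:%s\n' "$host" "$port"
}

# probe_rstudio is a hypothetical helper: it asks curl to print only the
# HTTP status code of the response, giving up after 5 seconds.
probe_rstudio() {
    curl --silent --output /dev/null --max-time 5 \
         --write-out '%{http_code}\n' "$(rstudio_url "$@")"
}

# Print the address to open in the browser.
rstudio_url IPADDRESS

# Uncomment after replacing IPADDRESS with your server's address:
# probe_rstudio IPADDRESS
```

Any HTTP status code in the output means something is listening on the port; if nothing answers, curl typically reports 000, which usually points at the server not running or a firewall blocking the port.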
A Note About Packages
When installing packages on RStudio Server, you have two options: install them as yourself (in my case, when logged in as the user bogdan), or install them as root. The main difference has to do with who can access which package:
- Packages installed using your personal username are ONLY ACCESSIBLE by you. They are installed in a user-specific library.
- Packages installed using the root user are accessible by EVERYONE on the system. This means that all existing and future accounts will have access to these packages.
To install packages as root, DO NOT use RStudio. Instead, log into your server from a terminal, start R as root, and install any packages you might need from the R prompt.
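As a rough sketch of what a root install from the terminal might look like, the script below builds an install.packages() call and prints the command you would run, without running it. Several things here are assumptions: install_expr is a hypothetical helper name, ggplot2 is only an example package, and https://cloud.r-project.org is just one CRAN mirror you could pick.

```shell
#!/bin/sh
# install_expr is a hypothetical helper: it builds the R expression that
# installs one package from a given CRAN mirror.
install_expr() {
    pkg="$1"
    repo="${2:-https://cloud.r-project.org}"   # assumption: any CRAN mirror works here
    printf "install.packages('%s', repos = '%s')\n" "$pkg" "$repo"
}

# Dry run: print the command instead of executing it.
EXPR="$(install_expr ggplot2)"
echo "sudo Rscript -e \"$EXPR\""
```

Running the printed command on the server installs the package as root. Whether root’s default library is the one shared by all users depends on how R was set up, so it is worth checking the output of .libPaths() in a root R session before relying on it.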