To use "R" effectively on the Hyak cluster, a bit of setup is recommended first.
First, connect to a Hyak login node:
Then get a build node (this example will auto-logout after 2 hrs):
You should see a prompt that looks something like this:
You are now on a build node. Build nodes are special: they are just like "worker nodes" except that they have access to the internet, so they can be used to download and install "R" packages interactively into the personal "R" library area in your home directory. This is how you install the R packages that your simulations/jobs need.
Once installed in your personal "R" library, all subsequent runs on the cluster have these R packages in their R library path.
You can RUN "R" on a worker node and use libraries that you installed from the build node, but a worker node cannot reach the internet, so you cannot install any new R packages from a worker node. For an overview of what Hyak nodes do, see HERE (Add link later)
OK - so we are on a "vanilla" build node - there are no software modules activated at this point. If you try to run "R" now you will get an error like this:
uwnetid@n#### ~$ R
-bash: R: command not found
This is expected. You haven't loaded any software modules yet. Take a look at what software you can load using the "module avail" command:
Here's an abbreviated list of what you will see (as of 6/1/2016):
Let's load the "r_3.2.4" (latest available) module now:
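The load step can be sketched as follows. The module name r_3.2.4 is the one named above; have_cmd and ensure_r are hypothetical helper names introduced only for this illustration, and the sketch assumes the "module" command is defined in your shell, as it is on Hyak nodes.

```shell
# Return 0 if the named command (binary or shell function) is available.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Load the R module only when R is not already on the PATH.
# The default module name is taken from this page; adjust it to whatever
# "module avail" lists on your node.
ensure_r() {
    mod="${1:-r_3.2.4}"
    if have_cmd R; then
        echo "R already available: $(command -v R)"
    elif have_cmd module; then
        module load "$mod" && echo "loaded module $mod"
    else
        echo "no 'module' command here; are you on a Hyak node?" >&2
        return 1
    fi
}

ensure_r r_3.2.4 || echo "R is not available in this shell"
```

Off the cluster this only prints a message, so it is safe to try anywhere.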
Now run "R":
Now install the packages you want.
At this point, Hyak misconfiguration issues have in the past led to weird "unsupported URL scheme" errors. If this happens, send email to "firstname.lastname@example.org" with the subject "HYAK: R package install errors on build node" and paste the session history into your support email.
If you don't see such errors, you should instead see a list of repos; choose the closest (usually the USA (WA) https repo at Fred Hutch is best):
HTTPS CRAN mirror
(This selects the "http" version of the WA state CRAN mirror at Fred Hutch.)
At this point, the "Matrix" package should download and install.
If it does not, you probably have a problem with curl/https on CentOS 6.x.
and verify that it contains a call to the /sw/local/etc/Rprofile.site file there:
and then look at the contents of /sw/local/etc/Rprofile.site :
The deprecated wget option should be set on CentOS 6.x, where the version of libcurl (which R tries to use by default) is too old to work with any https sites.
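A quick way to check for that setting is sketched below. It assumes the site profile selects wget through R's download.file.method option; the options() line written to the temporary file is an assumed example, not a copy of the real /sw/local/etc/Rprofile.site, and profile_uses_wget is a hypothetical helper name.

```shell
# Report whether an R profile file sets download.file.method to "wget".
profile_uses_wget() {
    grep -Eq 'download\.file\.method[[:space:]]*=[[:space:]]*"wget"' "$1" 2>/dev/null
}

# Demonstrate on a throwaway file (assumed content, see the note above).
demo="$(mktemp)"
printf '%s\n' 'options(download.file.method = "wget")' > "$demo"
if profile_uses_wget "$demo"; then
    echo "wget method is set"
fi
rm -f "$demo"

# On Hyak you would point it at the real file instead:
# profile_uses_wget /sw/local/etc/Rprofile.site && echo "wget method is set"
```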
An older section follows below; it covers how to install the "Rmpi" (OpenMPI) package for R on Hyak.
If you want to use "R" with the "Rmpi" package on the Hyak cluster, there are a few steps you will need to take.
- Log in to Hyak (ssh email@example.com) - you'll need your Entrust Token (AKA "PRN")
- Request a "build node" (the example below gives you 2 hrs to use a node for a build)
When this completes, you will get a prompt something like this:
- on the build node, load the modules you need with the commands:
module load icc_14.0.3-ompi_1.8.1
This gets you:
the OpenMPI 1.8.1 package
the Intel "C" compiler version 14.0.3
the vanilla, open source "R" package version 3.1.1
- Now we need to get the source package for the Rmpi R library direct from CRAN:
Note: this will fail if you are not on a node with internet access (build node or login node)
- Last step, run the compile manually at the command line:
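As a rough sketch of what such a manual compile can look like, the helper below only prints an "R CMD INSTALL" command line; it does not run it. It uses the Rmpi configure arguments --with-Rmpi-type, --with-Rmpi-include and --with-Rmpi-libpath. The tarball name, the MPI install prefix, and the helper name rmpi_install_cmd are placeholders and assumptions, not the exact command that was used on Hyak.

```shell
# Compose an "R CMD INSTALL" command line for Rmpi built against OpenMPI.
# Arguments: 1 = Rmpi source tarball, 2 = OpenMPI install prefix (assumed
# to contain include/ and lib/ subdirectories).
rmpi_install_cmd() {
    tarball="$1"
    mpi_root="$2"
    printf 'R CMD INSTALL %s --configure-args="--with-Rmpi-type=OPENMPI --with-Rmpi-include=%s/include --with-Rmpi-libpath=%s/lib"\n' \
        "$tarball" "$mpi_root" "$mpi_root"
}

# Guess the MPI prefix from mpicc when it is on the PATH (it should be after
# the module load above); otherwise fall back to an obvious placeholder.
if command -v mpicc >/dev/null 2>&1; then
    mpi_root="$(dirname "$(dirname "$(command -v mpicc)")")"
else
    mpi_root="/path/to/openmpi"
fi

# Placeholder tarball name: substitute the file you downloaded from CRAN.
rmpi_install_cmd "Rmpi_x.y-z.tar.gz" "$mpi_root"
```

Printing the command first makes it easy to check the include and library paths before starting a build on the node.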
OLD VERSION / STUFF TO CLEAN UP LATER:
here's what worked for CSDE in terms of compiling Rmpi:
We'll use the OpenMPI 1.8.1 package, with the Intel "C" compiler version 14.0.3
And the vanilla, open source "R" package version 3.1.1
All Hyak cluster nodes have ZERO software set up in the PATH by default; you have to load
the software you want to use in the form of modules. There are lots of modules available
for all kinds of software, tools, compilers, etc. This is how Hyak keeps things reproducible: new
versions of software get added as new modules without breaking the old modules that your
scripts may be relying on. So as new modules and versions of tools appear, your old code and scripts
should still be usable and the results reproducible. See "module avail" at the command line, or
other sections of this wiki, for the list of software available to you.
Notes: How to Get/Configure/Compile the "Rmpi" OpenMPI library for R on Hyak
module load icc_14.0.3-ompi_1.8.1
Currently Loaded Modulefiles:
Use "module avail" to see all the modules available to load. You can email
firstname.lastname@example.org with "HYAK" in the subject line to request new modules for new versions
of R and other packages; the admins don't automatically make these for every new version.
- Get the source package for the Rmpi R library direct from CRAN:
##### This is the magical syntax to be able to compile it:
Now you have a compiled Rmpi package in your personal "R" library on Hyak.
For me, this is in: /home/mbw/R/x86_64-unknown-linux-gnu-library/3.1
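To predict where that directory will be for a given R version, here is a small sketch. It assumes R's default per-user library layout on Linux, ~/R/<platform>-library/<major.minor>, and takes the platform string from the path quoted above; your R build may report a different platform. r_user_lib is a hypothetical helper name.

```shell
# Build the default per-user R library path for a full x.y.z R version.
# Arguments: 1 = R version (e.g. 3.1.1), 2 = platform string (optional;
# the default is copied from the example path on this page).
r_user_lib() {
    version="$1"
    platform="${2:-x86_64-unknown-linux-gnu}"
    major_minor="${version%.*}"          # 3.1.1 -> 3.1
    echo "$HOME/R/${platform}-library/${major_minor}"
}

lib_dir="$(r_user_lib 3.1.1)"
echo "expected library: $lib_dir"

# List the installed packages if the directory exists on this machine.
if [ -d "$lib_dir" ]; then
    ls "$lib_dir"
fi
```

Within R itself, .libPaths() reports the library directories actually in use.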
Some MPI and R links:
.... please add your fav's here
Some general, getting started Notes about Hyak:
If you have never submitted a job to Hyak before, I found it EXTREMELY
useful to go through the example program here:
(yes, this is pure MPI, not "R" MPI, but it is a good getting started exercise)
Create a directory in the gscratch area for Hyak, put the code in there,
Compile the simple MPI program, make the submit script from the example, then
You will learn a lot of things, including, but not limited to:
- the batch submission may take more than 60 seconds to show up in the queue
if you remove "#" symbols from the PBS lines (thinking it is a script with examples to un-comment)
- If you ask for more nodes than your group has, your job will never run
with OpenMPI - if you ask for 32 CPU's, your job may not run, because it needs
Your files - managing files and file locations on Hyak: