
To use "R" effectively on the Hyak cluster, a bit of setup is recommended first.

First, get yourself connected to a Hyak Login node:

   ssh uwnetid@hyak.washington.edu

Then request an interactive session on a build node (this example will auto-logout after 2 hrs):

   qsub -q build -I -l walltime=2:00:00

You should see a prompt that looks something like this:

 [uwnetid@n#### ~]$

You are now on a build node. Build nodes are special: they are just like "worker nodes", except that they have access to the internet and can be used to download and install "R" packages interactively into the personal "R" library in your home directory. This is how you install the R packages that your simulations/jobs need.
Once a package is installed in your personal "R" library, all subsequent runs on the cluster have it in their R library path.

You can RUN "R" on a worker node and use libraries that you installed from the build node, but a worker node cannot reach the internet, so you cannot install any new R packages there. For an overview of what Hyak nodes do, see HERE (Add link later)

OK - so we are on a "vanilla" build node - there are no software modules activated at this point. If you try to run "R" now you will get an error like this:

    uwnetid@n#### ~$ R
    -bash: R: command not found

This is expected. You haven't loaded any software modules yet. Take a look at what software you can load using the command:

   module avail

Here's an abbreviated list of what you will see (as of 6/1/2016):

    gnuplot_5.0.1                                paraview_4.3.1
    hdf5_1.8.12-icc_14.0.2                       r_3.2.2
    hdf5_1.8.13-gcc_4.4.7                        r_3.2.4
    hdf5_1.8.16-icc_16.0.2                       revolutionr_7.4.1

Let's load the "r_3.2.4" module (the latest available) now:

    module load r_3.2.4
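
For scripted use, the newest plain R module can be picked out of the "module avail" listing instead of being typed by hand. The sketch below is only an illustration: the helper name latest_r_module is made up for this page, and it assumes module names keep the r_<major>.<minor>.<patch> form seen in the listing above.

```shell
# Print the highest-versioned "r_X.Y.Z" module name found on stdin.
# Assumption: plain R modules are named r_<major>.<minor>.<patch>.
latest_r_module() {
    # Split the listing into one name per line, keep only plain R modules
    # (this skips revolutionr_*), then sort the versions numerically.
    tr -s ' \t' '\n\n' |
        grep -E '^r_[0-9]' |
        sed 's/^r_//' |
        sort -t . -k1,1n -k2,2n -k3,3n |
        tail -n 1 |
        sed 's/^/r_/'
}

# Example using the abbreviated listing from this page:
printf '%s\n' \
    'gnuplot_5.0.1            paraview_4.3.1' \
    'hdf5_1.8.12-icc_14.0.2   r_3.2.2' \
    'hdf5_1.8.13-gcc_4.4.7    r_3.2.4' \
    'hdf5_1.8.16-icc_16.0.2   revolutionr_7.4.1' | latest_r_module
# prints: r_3.2.4
```

On a build node the real listing could be piped in with "module avail 2>&1 | latest_r_module". Many versions of the module command print the listing on stderr, hence the 2>&1, but treat that as something to verify on Hyak.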

Now run "R":

    uwnetid@n#### ~$ R
    R version 3.2.4 (2016-03-10) -- "Very Secure Dishes"
    Copyright (C) 2016 The R Foundation for Statistical Computing
    Platform: x86_64-pc-linux-gnu (64-bit)
    
    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.
    
      Natural language support but running in an English locale
    
    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.
    
    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.
    
    > 

Now install the packages you want:

    install.packages("Matrix")

At this point, Hyak misconfiguration issues have in the past led to weird "unsupported URL scheme" errors. If this happens, send email to help@uw.edu with the subject "HYAK: R package install errors on build node" and paste the session history into your support email.

If you don't see such errors, you should instead see a list of CRAN mirrors. Choose the closest one (usually the USA (WA) https mirror at Fred Hutch is best):

HTTPS CRAN mirror

  1: 0-Cloud [https]                2: Austria [https]             
  3: Chile [https]                  4: China (Beijing 4) [https]   
  5: Colombia (Cali) [https]        6: France (Lyon 2) [https]     
  7: France (Paris 2) [https]       8: Germany (Münster) [https]   
  9: Iceland [https]               10: Mexico (Mexico City) [https]
 11: Russia (Moscow) [https]       12: Spain (A Coruña) [https]    
 13: Switzerland [https]           14: UK (Bristol) [https]        
 15: UK (Cambridge) [https]        16: USA (CA 1) [https]          
 17: USA (KS) [https]              18: USA (MI 1) [https]          
 19: USA (TN) [https]              20: USA (TX) [https]            
 21: USA (WA) [https]              22: (HTTP mirrors)              
 Selection: 21

(This selects the "https" version of the WA state CRAN mirror, at Fred Hutch.)
URL: 'https://cran.fhcrc.org/src/contrib'
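
The mirror menu can be skipped by naming the repository explicitly, which is handy when installs are scripted. A minimal sketch, assuming the Fred Hutch mirror named above is the one you want; the helper name r_install_expr is hypothetical, while the repos argument of install.packages() is standard R.

```shell
# Build the R expression for a non-interactive package install.
# $1 = package name, $2 = CRAN mirror URL (defaults to the mirror named above).
r_install_expr() {
    pkg=$1
    repo=${2:-https://cran.fhcrc.org/}
    printf 'install.packages("%s", repos = "%s")\n' "$pkg" "$repo"
}

# Show the expression that would be handed to R:
r_install_expr Matrix
# prints: install.packages("Matrix", repos = "https://cran.fhcrc.org/")
```

On a build node, with an R module loaded, the expression can be run with: Rscript -e "$(r_install_expr Matrix)"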

At this point, the "Matrix" package should download and install.
If it does not, you probably have a problem with curl/https on CentOS 6.x.

Look at:

     module show r_3.2.4

and verify that it sets R_PROFILE to the /sw/local/etc/Rprofile.site file:

  [mbw@n0026 ~]$ module show r_3.2.4
  -------------------------------------------------------------------
  /sw/Modules/modulefiles/r_3.2.4:   
  
  module-whatis    Adds R to the PATH. 
  module           load icc_14.0.3 
  prepend-path     PATH /sw/r-3.2.4/bin 
  setenv           R_PROFILE /sw/local/etc/Rprofile.site 
  -------------------------------------------------------------------

Then look at the contents of /sw/local/etc/Rprofile.site:

     [uwnetid@n#### ~]$ more /sw/local/etc/Rprofile.site 
     options(repos=structure(c(CRAN="https://cran.fhcrc.org/")))
     options(download.file.method = "wget")
     [uwnetid@n#### ~]$ 

The deprecated wget option should be set on CentOS 6.x, where the version of libcurl (which R tries to use by default) is too old to work with any https sites.
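
The two checks above can be scripted roughly as follows. This is a sketch, not an official Hyak tool: both function names are invented here, and the parsing assumes the "module show" output and the Rprofile.site contents look like the samples on this page.

```shell
# Print the R_PROFILE path set by a modulefile.
# Reads "module show" output on stdin.
rprofile_from_module_show() {
    awk '$1 == "setenv" && $2 == "R_PROFILE" { print $3 }'
}

# Succeed if the given profile file selects the wget download method.
uses_wget_method() {
    grep -Eq 'download\.file\.method[[:space:]]*=[[:space:]]*"wget"' "$1"
}

# Example using two lines of the "module show" sample from this page:
printf '%s\n' \
    'prepend-path     PATH /sw/r-3.2.4/bin' \
    'setenv           R_PROFILE /sw/local/etc/Rprofile.site' |
    rprofile_from_module_show
# prints: /sw/local/etc/Rprofile.site
```

On a build node the real output could be piped in with "module show r_3.2.4 2>&1 | rprofile_from_module_show", and the resulting path passed to uses_wget_method.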

--Mbw (talk) 09:40, 1 June 2016 (PDT)


An older section follows below. It covers how to install the Rmpi (Open MPI) package for R on Hyak.


If you want to use "R" with the "Rmpi" package on the Hyak cluster, there are a few steps you will need to take.

  • Log in to Hyak (ssh uwnetid@hyak.washington.edu) - you'll need your Entrust Token (AKA "PRN")
  • Request a "build node" (the example below gives you 2 hrs to use a node for a build)
        qsub -q build -I -l walltime=2:00:00

    When this completes, you will get a prompt something like this:

        [mbw@n0026 ~]$ 
  • On the build node, load the modules you need with the commands:
        module load r_3.1.1
        module load icc_14.0.3-ompi_1.8.1

    This gets you:
      the OpenMPI 1.8.1 package
      the Intel "C" compiler version 14.0.3
      the vanilla, open source "R" package version 3.1.1

  • Now we need to get the source package for the Rmpi R library direct from CRAN:
        wget http://cran.r-project.org/src/contrib/Rmpi_0.6-5.tar.gz

Note: this will fail if you are not on a node with internet access (build node or login node)

  • Last step, run the compile manually at the command line:
         R CMD INSTALL Rmpi_0.6-5.tar.gz --configure-args="--with-Rmpi-include=/sw/openmpi-1.8.1_icc-14.0.3/include \
               --with-Rmpi-libpath=/sw/openmpi-1.8.1_icc-14.0.3/lib --with-Rmpi-type=OPENMPI" --no-test-load
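
Since the OpenMPI path appears twice in that command, it can help to build the configure arguments from a single prefix. A small sketch: the function name rmpi_configure_args and the variable OMPI_PREFIX are made up here, and the prefix is the one used in the command above.

```shell
# Build the --configure-args value for Rmpi from an OpenMPI install prefix.
rmpi_configure_args() {
    prefix=$1
    printf '%s %s %s\n' \
        "--with-Rmpi-include=$prefix/include" \
        "--with-Rmpi-libpath=$prefix/lib" \
        "--with-Rmpi-type=OPENMPI"
}

# Prefix used on this page; adjust it if you load a different OpenMPI module.
OMPI_PREFIX=/sw/openmpi-1.8.1_icc-14.0.3

# Print the full install command instead of running it, so it can be reviewed first.
echo "R CMD INSTALL Rmpi_0.6-5.tar.gz --configure-args=\"$(rmpi_configure_args "$OMPI_PREFIX")\" --no-test-load"
```

The printed line should match the manual command above, and can be pasted at the prompt on a build node once the modules are loaded.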

 

OLD VERSION / STUFF TO CLEAN UP LATER:

Here's what worked for CSDE in terms of compiling Rmpi:

We'll use the OpenMPI 1.8.1 package with the Intel "C" compiler version 14.0.3,
and the vanilla, open source "R" package version 3.1.1.

All Hyak cluster nodes have ZERO software set up in the PATH by default - you've got to load
in the software you want to use in the form of modules... there are lots of modules available
for all kinds of software/tools/compilers/etc - this is how Hyak keeps things reproducible - new
versions of software get added as new modules - without breaking the old modules that your
scripts may be relying on. So as new modules/versions of tools appear, your old code and scripts
should still be usable and results reproducible. See "module avail" at the cmd line, or
other sections of this wiki for the list of software available to you.
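
One way to lean on this in your own job scripts is to pin the exact module versions in one place. A minimal sketch, assuming the module names used on this page; PINNED_MODULES and load_pinned_modules are names invented for the example.

```shell
# Modules this job depends on, pinned to exact versions (names from this page).
PINNED_MODULES="r_3.1.1 icc_14.0.3-ompi_1.8.1"

# Load every pinned module, stopping at the first failure.
load_pinned_modules() {
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not available on this host" >&2
        return 1
    fi
    for m in $PINNED_MODULES; do
        module load "$m" || return 1
    done
}

# On a host without the module command this just reports and carries on.
load_pinned_modules || echo "skipping module setup"
```

Keeping the list in one variable makes it easy to see, months later, which versions produced a given set of results.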

Notes: How to Get/Configure/Compile the "Rmpi" OpenMPI library for R on Hyak

  module load r_3.1.1
  module load icc_14.0.3-ompi_1.8.1

  [mbw@login3 ~]$ module list
  Currently Loaded Modulefiles:
    1) modules                 2) icc_14.0.3              3) r_3.1.1                 4) icc_14.0.3-ompi_1.8.1
  [mbw@login3 ~]$

Note:
Use "module avail" to see all the modules available to load. You can email
help@uw.edu with "HYAK" in the subject line to request new modules for new versions
of R and other packages; the admins don't automatically make these for every new version.

[mbw@login3 ~]$
[mbw@login3 ~]$ pwd
/usr/lusers/mbw
1. Get the source package for the Rmpi R library direct from CRAN:
 [mbw@login3 ~]$ wget http://cran.r-project.org/src/contrib/Rmpi_0.6-5.tar.gz
 --2014-09-18 15:09:25--  http://cran.r-project.org/src/contrib/Rmpi_0.6-5.tar.gz
 Resolving cran.r-project.org... 137.208.57.37
 Connecting to cran.r-project.org|137.208.57.37|:80... connected.
 HTTP request sent, awaiting response... 200 OK
 Length: 102182 (100K) [application/x-gzip]
 Saving to: “Rmpi_0.6-5.tar.gz”

 100%[=========================================================================================================================================>] 102,182      146K/s   in 0.7s

 2014-09-18 15:09:26 (146 KB/s) - “Rmpi_0.6-5.tar.gz” saved [102182/102182]

#####
##### This is the magical syntax to be able to compile it:
######

[mbw@login3 ~]$ R CMD INSTALL Rmpi_0.6-5.tar.gz --configure-args="--with-Rmpi-include=/sw/openmpi-1.8.1_icc-14.0.3/include --with-Rmpi-libpath=/sw/openmpi-1.8.1_icc-14.0.3/lib --with-Rmpi-type=OPENMPI" --no-test-load
 installing to library ‘/home/mbw/R/x86_64-unknown-linux-gnu-library/3.1’
 installing *source* package ‘Rmpi’ ...
 package ‘Rmpi’ successfully unpacked and MD5 sums checked
checking for openpty in -lutil... yes
checking for main in -lpthread... yes
configure: creating ./config.status
config.status: creating src/Makevars
 libs
icc -std=gnu99 -I/sw/r-3.1.1/lib64/R/include -DNDEBUG -DPACKAGE_NAME=\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -I/sw/openmpi-1.8.1_icc-14.0.3/include  -DMPI2 -DOPENMPI -I/usr/local/include    -fpic  -O2 -xHost  -c Rmpi.c -o Rmpi.o 
icc -std=gnu99 -I/sw/r-3.1.1/lib64/R/include -DNDEBUG -DPACKAGE_NAME=\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -I/sw/openmpi-1.8.1_icc-14.0.3/include  -DMPI2 -DOPENMPI -I/usr/local/include    -fpic  -O2 -xHost  -c conversion.c -o conversion.o
icc -std=gnu99 -I/sw/r-3.1.1/lib64/R/include -DNDEBUG -DPACKAGE_NAME=\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -I/sw/openmpi-1.8.1_icc-14.0.3/include  -DMPI2 -DOPENMPI -I/usr/local/include    -fpic  -O2 -xHost  -c internal.c -o internal.o
icc -std=gnu99 -shared -L/usr/local/lib64 -o Rmpi.so Rmpi.o conversion.o internal.o -L/sw/openmpi-1.8.1_icc-14.0.3/lib -lmpi -lutil -lpthread
installing to /home/mbw/R/x86_64-unknown-linux-gnu-library/3.1/Rmpi/libs
** R
** demo
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
* DONE (Rmpi)
[mbw@login3 ~]$

Now you have a compiled Rmpi package in your personal "R" library on Hyak.

For me, this is in: /home/mbw/R/x86_64-unknown-linux-gnu-library/3.1
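
A quick way to confirm that a package really landed in the personal library, without starting R, is to look for its DESCRIPTION file (every installed R package has one). A sketch only: r_pkg_installed is a hypothetical helper, and the library path is the one from the install log above, so yours will differ by user and R version.

```shell
# Succeed if package $2 is installed under R library directory $1.
r_pkg_installed() {
    [ -f "$1/$2/DESCRIPTION" ]
}

# Library path taken from the install log above; adjust for your user and R version.
LIB="$HOME/R/x86_64-unknown-linux-gnu-library/3.1"

if r_pkg_installed "$LIB" Rmpi; then
    echo "Rmpi is installed in $LIB"
else
    echo "Rmpi not found in $LIB"
fi
```

This only checks that the files are present. Whether Rmpi actually loads still depends on having the matching OpenMPI module loaded.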

Some MPI and R links:

    http://math.acadiau.ca/ACMMaC/Rmpi/sample.html

.... please add your fav's here

Some general, getting started Notes about Hyak:

If you have never submitted a job to Hyak before, I found it EXTREMELY
useful to go through the example program here:

     Hyak_Open_MPI

(yes, this is pure MPI, not "R" MPI, but it is a good getting started exercise)

Create a directory in the gscratch area for Hyak, put the code in there,
compile the simple MPI program, make the submit script from the example, then
submit it.

You will learn a lot of things, including, but not limited to:

  • the batch submission may take more than 60 seconds to show up in the queue
  • if you remove "#" symbols from the PBS lines (thinking it is a script with examples to un-comment),
    you will get weird errors that are hard to understand and your job submit will fail
  • if you ask for more nodes than your group has, your job will never run
  • with OpenMPI - if you ask for 32 CPUs, your job may not run, because it needs
    33 CPUs when you include the head/master process
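
Two of these lessons are easy to check before submitting. The helpers below are a rough sketch with made-up names (find_uncommented_pbs, slots_needed). The first looks for PBS directive lines that have lost their leading "#", and the second just does the workers-plus-one-master arithmetic from the last bullet.

```shell
# List lines that look like PBS directives whose leading "#" was removed.
# Prints "<line number>:<line>" for each suspect line in submit script $1.
find_uncommented_pbs() {
    grep -n '^[[:space:]]*PBS[[:space:]]' "$1"
}

# CPUs to request for $1 OpenMPI workers plus one head/master process.
slots_needed() {
    echo $(( $1 + 1 ))
}

slots_needed 32
# prints: 33
```

Running find_uncommented_pbs on your submit script before qsub should print nothing. Any line it does print is one where "#PBS" was probably shortened to "PBS" by mistake.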
    

Your files - managing files and file locations on Hyak:

Managing your Files