This article is for both mox.hyak (hyak nextgen) and for ikt.hyak (hyak classic).

(For historical reasons, the title of this page is Mox_scheduler.)


Mox and ikt use a scheduler called slurm.

Below, xyz is your hyak group name and abc is your UW netid.

Find out from your group members whether you should log on to mox or to ikt. Some groups have nodes on both mox and ikt; some groups have nodes only on ikt or only on mox.

To log on to mox.hyak:

ssh abc@mox.hyak.uw.edu

To log on to ikt.hyak:

ssh abc@ikt.hyak.uw.edu

The above command gives you access to the login node of mox.hyak or ikt.hyak. The login node is only for logging in and submitting jobs. The computational work is done on a compute node. As shown below, you can either get an interactive compute node or submit a batch job. The build node is a special compute node that can connect to the internet.

...

The build node can connect to hosts outside mox. It is useful for using git, transferring files to or from outside mox, installing packages in R or Python, etc.

To get an interactive build node for 2 hours:

For mox.hyak:

srun -p build --time=2:00:00 --mem=20G --pty /bin/bash

For ikt.hyak:

srun -p build --time=2:00:00 --mem=10G --pty /bin/bash


(Note: (a) --pty /bin/bash must be the last option in the above command.

           (b) If you do not get a build node with the above values of --mem, then try smaller values.

           (c) You can connect to the internet from the build node.)
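Once you are on the build node, a typical session might look like the following sketch. The repository URL, hostname, paths, and package name are placeholders, not values from this page:

git clone https://github.com/example/myproject.git            # get code from the internet
scp abc@workstation.example.edu:input.dat /gscratch/xyz/abc/  # copy a file from outside mox
pip install --user numpy                                      # install a Python package under your home directory (you may need to load a Python module first)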


To get an interactive node with 4 cores in your own group for 2 hours on mox:

srun -p xyz -A xyz --nodes=1 --ntasks-per-node=4 --time=2:00:00 --mem=100G --pty /bin/bash

Note that if you are using an interactive node to run a parallel application such as Python multiprocessing, MPI, OpenMP, etc., then the number in the --ntasks-per-node option must match the number of processes or threads used by your application. For example, for the above srun command (see also the sketch after this list):

(a) For Python multiprocessing, use code like "p=multiprocessing.Pool(4)".

(b) For MPI, use "mpirun -np 4 myprogram".

(c) For OpenMP, use "export OMP_NUM_THREADS=4".

(d) For GNU parallel, use the "-j 4" option.
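As a sketch of how these counts can be kept in sync automatically, the environment variables that slurm sets inside the job can be used instead of hard-coding the number. This is an illustration, not taken from this page; the command file name is a placeholder:

# Inside the interactive session started with --ntasks-per-node=4:
export OMP_NUM_THREADS=$SLURM_NTASKS_PER_NODE        # OpenMP thread count matches the allocation
parallel -j "$SLURM_NTASKS_PER_NODE" < commands.txt  # GNU parallel job slots match the allocation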

When you are using the build node with "-p build", you do not need to give the -A option.

Also, if your group has an interactive node, then use xyz-int for the -p option. For example:

srun -p stf-int -A stf --time=1:00:00 --mem=10G --pty /bin/bash

Note that an interactive node in your own group cannot connect to outside mox or ikt.

Specifying memory:

           (1) It is important to use the --mem option to specify the required memory. If the memory is not specified, then the SLURM scheduler limits the usage of memory to

...

              For the knl nodes, use --mem=200G


To get an interactive node in your own group for 2 hours on mox:

srun -p xyz -A xyz --time=2:00:00 --mem=100G --pty /bin/bash

For ikt, use --mem=58G.

On mox and ikt, the -p and -A options should be the same. (Except when you are using the build node with "-p build", in which case you do not need to give the -A option.)


SLURM environment variables:

Issue the command below at an interactive node prompt to find the list of SLURM environment variables:

...
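The command itself is elided above; one common way to list them (an assumption, not necessarily the exact command this page uses) is:

env | grep SLURM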

If you are setting up an application that uses multiple nodes (e.g. Apache Spark), you will need interactive access to multiple nodes.

To get 2 nodes with 28 cores per node for interactive use:

srun --nodes=2 --ntasks-per-node=28 -p xyz -A xyz --time=2:00:00 --mem=100G --pty /bin/bash

...

See the link below for using the mox and ikt ckpt queue:

Mox_checkpoint

...

Submit a batch job from the mox login node:

sbatch -p xyz -A xyz myscript.slurm

The script myscript.slurm is similar to myscript.pbs used in hyak classic. Below is an example slurm script. The --mem option is for memory per node. Replace the last line "myprogram" with the commands to run your program (e.g. load modules, copy input files, run your program, copy output files, etc.); a sketch of such a section follows the script.

#!/bin/bash

## Job Name

#SBATCH --job-name=myjob

## Allocation Definition

## On mox and ikt, the account and partition options should be the same except in a few cases (e.g. the ckpt queue and the genpool queue).

#SBATCH --account=xyz
#SBATCH --partition=xyz

## Resources

## Total number of Nodes

#SBATCH --nodes=1   

## Number of cores per node

#SBATCH --ntasks-per-node=28

## Walltime (3 hours). Do not specify a walltime substantially longer than your job needs.

...

## Specify the working directory for this job

#SBATCH --chdir=/gscratch/xyz/abc/myjobdir

...
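For example, the part of the script represented by "myprogram" above might look like the following sketch. The module name, program name, and paths are placeholders, not values from this page:

## Load any modules your program needs (the module name here is only a placeholder)
module load gcc
## Copy input files into the job's working directory
cp /gscratch/xyz/abc/input.dat .
## Run your program
./myprogram input.dat > output.dat
## Copy the output to a results directory
cp output.dat /gscratch/xyz/abc/results/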

Batch usage with multiple nodes:

If you use multiple nodes in one batch job, then your program should know how to use all the nodes. For example, your program should be an MPI program.

...

The value of the --ntasks-per-node option should be no greater than the number of cores per node: 16 for ikt and 28 or 40 for mox, depending on the node type. Do not request more than 40, since no node type has more than 40 cores; also, the nodes you are able to access may not have 40 cores, so requesting 40 could leave your job pending indefinitely. You can decrease the values if your program is running out of memory on a node. Please check which resources are available to you and their limits when deciding what to request.

For ikt:

#SBATCH --nodes=4
#SBATCH --ntasks-per-node=16

For mox:

#SBATCH --nodes=4
#SBATCH --ntasks-per-node=28
# OR
#SBATCH --ntasks-per-node=40
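As a sketch of how the program itself is then launched across the allocated nodes inside such a script (the program name is a placeholder, and the launch line assumes your MPI module provides mpirun):

## With --nodes=4 and --ntasks-per-node=28, slurm sets SLURM_NTASKS to 112
mpirun -np $SLURM_NTASKS ./myprogram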



Self-limiting your number of running jobs:

...
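One common way to do this (an assumption; not necessarily the method described in the elided text above) is to submit the work as a slurm job array with a concurrency limit. The %10 below allows at most 10 array tasks to run at the same time:

sbatch -p xyz -A xyz --array=0-99%10 myscript.slurm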

Below are some common error messages from slurm. You may see other errors in the slurm .out and .err files.

(1) If you use "--mem 120G" and your program uses more memory than 120G, then your slurm job will end and you will get the below error in the slurm output file. Slurm output files have names like slurm-348658.out.
slurmstepd: error: Exceeded job memory limit
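If this happens, one way to see how much memory a completed job actually used is the sacct command (a sketch; the job ID is the example one from the file name above):

sacct -j 348658 --format=JobID,JobName,MaxRSS,Elapsed,State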

...

When the above command runs, you will have been allocated 2 nodes but will still be on the mox login node.

If you issue a command like the one below, then srun will run the command hostname on each node:

...
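A minimal sketch of such a command (an assumption based on the description above, since the exact command is elided) is:

srun --nodes=2 --ntasks-per-node=1 hostname   # prints the hostname of each of the 2 allocated nodes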