
High-Level Differences from ikt

  1. You only get what you ask for, regardless of the resources available on the node. If you ask for 1 CPU, you'll only get one. If you ask for 1GB of RAM, you'll only get 1GB.
  2. An allocation won't get the same set of nodes every time, only access to the number of nodes to which it is entitled.
  3. For the moment, there is no occasional preemption in ckpt (formerly the bf queue).
  4. Preempted jobs get 30 seconds to do something smart before being killed and requeued.
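
One way to use the 30-second window from item 4 is to trap the warning signal and record progress so the requeued job can resume. The following is a minimal sketch, not a tested mox recipe: it assumes the warning reaches the batch shell as SIGTERM (signal delivery depends on the Slurm configuration, so verify this on the cluster), and STATE_FILE, save_state, on_term, and run_steps are hypothetical names with a placeholder work loop.

```shell
#!/bin/bash
# Hypothetical state file; by default it lands in the job's working directory.
STATE_FILE="${STATE_FILE:-myrun.state}"
step=0
preempted=0

save_state() {
    # Record the last completed step so a requeued job can resume from it.
    printf '%s\n' "$step" > "$STATE_FILE"
}

on_term() {
    # Assumption: Slurm delivers SIGTERM to this shell before killing the job.
    save_state
    preempted=1
}
trap on_term TERM

# Resume from an earlier run if a state file is present.
if [ -f "$STATE_FILE" ]; then
    step=$(cat "$STATE_FILE")
fi

run_steps() {
    # $1 = number of the last step to run.
    # bash runs a trap only between commands, so each unit of work
    # must finish well inside the grace period.
    while [ "$step" -lt "$1" ] && [ "$preempted" -eq 0 ]; do
        step=$((step + 1))   # placeholder for one unit of real work
    done
    save_state
}

# In a real job script, start the work with something like: run_steps 1000
```

Splitting the work into short steps is a design choice: a single long-running foreground command would delay the trap until that command returns.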

Connecting

SSH = mox.hyak.uw.edu

BBCP = mox1.hyak.uw.edu

Slurm Primer

Show Queue

All Jobs

squeue

Jobs in Allocation

squeue -p <my short group>

All Jobs in ckpt (was bf)

squeue -p ckpt

Jobs in ckpt from Allocation

squeue -A <my short group>-ckpt
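
For a quick summary rather than a full listing, the queue output can be reduced to a count per job state. This is a sketch under the assumption that squeue is run with -h (no header) and -o %T (print only the job state); count_states is a hypothetical helper name, not a Slurm command.

```shell
#!/bin/bash
# Count how many lines of input carry each job state (one state per line).
count_states() {
    awk 'NF { n[$1]++ } END { for (s in n) print s, n[s] }' | sort
}

# Usage on a login node (not run here):
#   squeue -h -p ckpt -o %T | count_states
```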

Submit

Own Allocation

sbatch -p <my short group> -A <my short group> test-job.sh

Checkpoint Allocation (formerly bf queue)

sbatch -p ckpt -A <my short group>-ckpt test-job.sh
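
The two submit forms above differ only in the partition and account names: the checkpoint account is the group name with -ckpt appended. A small sketch that prints the matching command is shown below; submit_cmd is an illustrative helper and mygroup is a hypothetical short group name.

```shell
#!/bin/bash
# Print (do not run) the sbatch command for a group, queue type, and script.
submit_cmd() {
    group=$1
    queue=$2
    script=$3
    case "$queue" in
        own)  printf 'sbatch -p %s -A %s %s\n' "$group" "$group" "$script" ;;
        ckpt) printf 'sbatch -p ckpt -A %s-ckpt %s\n' "$group" "$script" ;;
        *)    echo "unknown queue: $queue" >&2; return 1 ;;
    esac
}

submit_cmd mygroup ckpt test-job.sh
# prints: sbatch -p ckpt -A mygroup-ckpt test-job.sh
```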

Interactive Session

Build Allocation

srun -p build --pty /bin/bash

Own Allocation

srun -p <my short group> --pty /bin/bash
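
Because a job only gets what it asks for, it can help to state the time and memory limits explicitly for interactive sessions as well. The sketch below only prints the command; interactive_cmd is a hypothetical helper, and the default values of one hour and 4G are illustrative assumptions rather than site recommendations.

```shell
#!/bin/bash
# Print (do not run) an interactive srun command with explicit limits.
# $1 = partition, $2 = walltime (optional), $3 = memory per node (optional)
interactive_cmd() {
    partition=$1
    walltime=${2:-1:00:00}   # assumed default, adjust as needed
    mem=${3:-4G}             # assumed default, adjust as needed
    printf 'srun -p %s --time=%s --mem=%s --pty /bin/bash\n' \
        "$partition" "$walltime" "$mem"
}

interactive_cmd build
# prints: srun -p build --time=1:00:00 --mem=4G --pty /bin/bash
```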

See Node Counts per Allocation

sacctmgr show qos format=name,grptres

Show Job Info

scontrol show job <jobid>

Show Node Info

scontrol show node <node>
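
Both scontrol commands print many KEY=value pairs. To pull out a single field in a script, a filter along these lines may help. It is a sketch: scontrol_field is a hypothetical helper, it assumes the value contains no spaces, and field names such as JobState should be checked against the actual output on your system.

```shell
#!/bin/bash
# Read scontrol-style output on stdin and print the value of field $1.
scontrol_field() {
    # Put each KEY=value pair on its own line, then strip the matching key.
    tr -s ' \t' '\n\n' | sed -n "s/^$1=//p" | head -n 1
}

# Usage on a login node (not run here):
#   scontrol show job <jobid> | scontrol_field JobState
#   scontrol show node <node> | scontrol_field State
```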

Sample Job Script

Change the placeholder items, such as MYGROUP, MYUSER, MYRUN, MYMODEL-BIN, and the <version> strings, to match your own job.

#!/bin/bash
## Job Name
#SBATCH --job-name=test-job
## Resources
## Nodes
#SBATCH --nodes=2
## Walltime (ten minutes, in MM:SS format)
#SBATCH --time=10:00
## Memory per node
#SBATCH --mem=2G
## Specify the working directory for this job
## (newer Slurm releases name this option --chdir)
#SBATCH --workdir=/gscratch/MYGROUP/MYUSER/MYRUN

## Load the compiler and MPI environment, then launch the model through Slurm
module load icc_<version>-impi_<VERSION>
mpirun -bootstrap slurm /gscratch/MYGROUP/MYMODEL/MYMODEL-BIN
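
Before submitting, it is easy to overlook a placeholder left in the sample script. The check below is a minimal sketch: check_placeholders is a hypothetical helper, and its pattern covers only the placeholder names that appear in the sample above.

```shell
#!/bin/bash
# Return 0 if the script in $1 has no leftover template placeholders,
# otherwise list the offending lines and return 1.
check_placeholders() {
    if grep -nE 'MYGROUP|MYUSER|MYRUN|MYMODEL|<version>|<VERSION>' "$1"; then
        echo "placeholders still present in $1" >&2
        return 1
    fi
    return 0
}

# Usage (not run here):
#   check_placeholders test-job.sh && sbatch -p <my short group> -A <my short group> test-job.sh
```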
