
Warning
Ikt Hyak Cluster Decommissioning

The older Ikt cluster has been decommissioned.

Table of Contents

News

Mailing List

Sign up for the Hyak-users mailing list to stay up to date with the latest news: https://mailman1.u.washington.edu/mailman/listinfo/hyak-users

This list is used for ALL announcements to Hyak users, so please sign up. Important Hyak announcements are ONLY sent to this list, not to the individual Hyak accounts.

Announcements

September 1, 2016

The Hyak User Wiki has moved to a new wiki service. Some formatting may not have carried over perfectly, but we hope to dedicate resources to reorganizing the wiki, both content and formatting, in the coming months.

January 14, 2016

You can now get to this top level page using the address: http://wiki.hyak.uw.edu


System Status

Hyak consists of two independent clusters: ikt.hyak (hyak classic) and mox.hyak (hyak next-gen).

Current Status

Tip

mox.hyak is online.*

*A maintenance reservation is in place for scheduled maintenance on September 8th, 2020.

*When a maintenance reservation is in place, the scheduler will not start any job whose time limit (sbatch --time parameter) could allow it to extend into the maintenance period. The squeue command will show the reason as 'ReqNodeNotAvail'.
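
For example, you can check why a pending job has not started and, if needed, resubmit it with a time limit short enough to finish before the maintenance window begins. This is a minimal sketch; the job script name is hypothetical.

  # Show your jobs with state and the scheduler's reason for any pending job;
  # the last column will read ReqNodeNotAvail when a maintenance reservation blocks it
  squeue -u $USER -o "%.10i %.9P %.8T %.20R"

  # Resubmit with a wall-clock limit short enough to finish before maintenance begins
  sbatch --time=4:00:00 myjob.slurm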

Monthly Scheduled Maintenance

Hyak will be offline from 9:00 am to 5:00 pm for scheduled maintenance on the second Tuesday of every month. Every third month (Feb, May, Aug, Nov), the maintenance window lasts from 9:00 am until 9:00 am the following morning. More on the Maintenance schedule.

Other maintenance performed on Hyak is relatively rare, short (30 - 60 minutes), and does not impact running jobs. This type of maintenance is announced to the hyak-users list.

Cluster Node Status

We run the Ganglia cluster status tool, where you can find details about the CPU, memory, and network usage of the nodes in your allocation.

http://status.hyak.uw.edu/



Hyak Overview

Hyak is a service of UW-IT. You can read more about it at https://itconnect.uw.edu/service/shared-scalable-compute-cluster-for-research-hyak/

This link describes how to get a Hyak account: Requesting an Account

This link has information about some of the research groups which use Hyak: Hyak_Research

Here is a list of some of the research papers which have used Hyak: Hyak Publications

Most of the nodes are devoted to computing, but the cluster also includes a few "login nodes" dedicated to logins, file transfers, and similar communication tasks. There are also nodes that belong to the sponsoring institutions, and nodes for compiling and testing programs.

Any UW student can get access to Hyak by joining the RC club: https://depts.washington.edu/uwrcc/getting-started-2/getting-started/

RC Club office hours: https://depts.washington.edu/uwrcc/calendar/

Hyak consists of two independent clusters: ikt.hyak (hyak classic) and mox.hyak (hyak next-gen). All new nodes are added to mox.hyak.

ikt

ikt.hyak has hundreds of nodes, each comparable to a high-end server. A typical node on ikt.hyak has 16 processor cores and at least 64GB of memory. All the ikt.hyak nodes run CentOS 6 Linux, and they are tied together by the Slurm cluster software. User tasks are submitted through the Slurm scheduler. ikt.hyak is made up of several generations of hardware with different levels of performance; see Hyak Node Hardware for more details.

mox

mox.hyak contains hundreds of nodes, each comparable to a high-end server. A typical node on mox.hyak has at least 28 processor cores and at least 128GB of memory. All the mox.hyak nodes run CentOS 7 Linux, and they are tied together by the Slurm cluster software. User tasks are submitted through the Slurm scheduler.

For an overview of mox please see Hyak mox Overview.

Purchasing Hyak Capacity

CPU and storage options are available for purchase by Hyak users associated with sponsored campus units. New nodes will be part of mox.hyak.

This information is provided for preliminary planning purposes only. Please use your UW e-mail to contact help@uw.edu for assistance in preparing actual Hyak hardware configurations.

https://itconnect.uw.edu/service/shared-scalable-compute-cluster-for-research-hyak/

Any faculty member or PI can send an e-mail to help@uw.edu and request a "Welcome to Hyak Tutorial" to be held at their lab or department.

Getting Started

Please review everything in this section before contacting Hyak support for help with new accounts.
This section also contains useful information on life with two-factor authentication, including setting up SSH tunnels to reduce hassles.
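
One common approach is OpenSSH connection multiplexing, which lets you authenticate (including two-factor) once and reuse that connection for later logins and file transfers. This is a minimal sketch for your local ~/.ssh/config; the login host name mox.hyak.uw.edu is an assumption, so substitute the host you actually use.

  # ~/.ssh/config on your local machine (minimal sketch)
  Host mox
      HostName mox.hyak.uw.edu      # assumed login host; adjust as needed
      User your_uw_netid            # placeholder NetID
      ControlMaster auto            # share one authenticated connection
      ControlPath ~/.ssh/%r@%h:%p   # socket file for the shared connection
      ControlPersist yes            # keep the master connection open

With this in place, the first connection (ssh mox) prompts for two-factor authentication; subsequent ssh, scp, and rsync sessions reuse the open connection without new prompts.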

...

http://software-carpentry.org/workshops/

You should be familiar with Hyak basics:

Hyak_101

Code Prerequisites:

Before you start running your code on Hyak, ensure it works on your own computer or, even better, on your lab network. Be sure you know how to keep all the cores busy, for example by using GNU Parallel for weak scaling or MPI for strong scaling.
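
As a minimal sketch of keeping all cores busy with GNU Parallel, where process.sh and input_list.txt are hypothetical names:

  # Run one copy of the (hypothetical) process.sh per core, one input line per task
  cat input_list.txt | parallel -j "$(nproc)" ./process.sh {}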

A user connects to Hyak using ssh. Mac and Linux machines have ssh built in; Windows users will have to install appropriate ssh software.
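
For example, assuming the mox login host name and substituting your own UW NetID:

  ssh your_uw_netid@mox.hyak.uw.edu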

Software and Development Tools

...

Hyak offers a variety of filesystems, each best suited for a different set of circumstances.
Along with selecting the best filesystem for the job, this section includes instructions for moving your data to and from Hyak.
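
For example, data can be copied from your local machine with scp or rsync. This is a minimal sketch; the host name and destination path are assumptions, so substitute your own group's directory:

  # Copy a local directory to your (hypothetical) gscratch space on mox
  rsync -avP mydata/ your_uw_netid@mox.hyak.uw.edu:/gscratch/mygroup/your_uw_netid/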

...

All Hyak use is mediated by the system scheduler.
This section provides details on using the scheduler to run interactive, batch, and parallel jobs.
It also includes instructions for using the scheduler to monitor your jobs and the cluster status.


...

Both ikt.hyak and mox.hyak use the Slurm scheduler. Jobs must be submitted via the login nodes; submissions from other nodes will fail with an error message saying that you do not have the right to submit jobs from that node.

The page below applies to both ikt.hyak and mox.hyak.

Mox_scheduler
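
As a minimal sketch of a Slurm batch job, the account, partition, and program names below are placeholders; the actual values depend on your group's allocation:

  #!/bin/bash
  #SBATCH --job-name=example
  #SBATCH --account=mygroup        # placeholder allocation account
  #SBATCH --partition=mygroup      # placeholder partition
  #SBATCH --nodes=1
  #SBATCH --ntasks-per-node=28     # a typical mox node has at least 28 cores
  #SBATCH --time=4:00:00           # wall-clock limit; see the maintenance note above
  #SBATCH --mem=100G

  ./my_program                     # placeholder executable

Save this as, say, example.slurm, then submit it with sbatch example.slurm, watch it with squeue -u $USER, and cancel it with scancel followed by the jobid.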


Getting Help

If you use the STF (Student Technology Fee) allocation, please join the mailing list for the HPC RC club (https://depts.washington.edu/uwrcc/) and direct all questions there. Do not contact UW-IT for support.

The principal means of support we provide is this Wiki, a comprehensive set of documentation covering all basic Hyak functions. If you encounter problems with one of Hyak's basic functions, use your UW e-mail to send an e-mail to help@uw.edu with 'hyak' as the first word in the subject and a brief description of your issue. If you do not use your UW-provided e-mail account, please include your UW NetID in your e-mail. If you are reporting a problem with a job or the job scheduler, please include at least one affected jobid as well as paths to the job script and job stdout (e.g., slurm-<jobid>.out).

RC Club Office Hours: http://depts.washington.edu/uwrcc/calendar/

We operate Hyak as a campus resource for research computing at scale. It provides a capable, stable platform upon which users can build complex, domain-specific software environments and workflows. Our end-user support resources are limited, covering the system hardware, operating system, filesystems, networks, job scheduler, and a defined set of development tools and applications. Users are responsible for building, installing, maintaining, and running any other applications.

To have a good experience using Hyak, or any other HPC/supercomputer system, we offer these general recommendations: WIKI for Hyak users - Guidelines

For users building complex code from source, please read this.

Other Help

Hyak Tutorials

Here are a few tutorials written for different tasks. While not directly applicable to all users' work, they are worth a look to get an idea of how some common tasks are accomplished using Hyak.

...

Hyak How-tos

  • Hyak HOWTO is a list of articles for different tasks, including how to use R, Python, and Matlab, and how to run Windows on hyak.

Hyak wiki pages Index

Hyak_wiki_Index


Below is a list of Hyak wiki pages which do not start with the word Hyak.

...

Hyak User Contributions

Hyak users can contribute and find tips from other users on the Hyak User Contributions site.

Feature Requests

...

  • This work was facilitated through the use of advanced computational, storage, and networking infrastructure provided by the Hyak supercomputer system at the University of Washington

When you cite Hyak, please let us know! Please use your UW e-mail to send mail to help@uw.edu with Hyak as the first word in the subject, along with a citation we can use in the body of the message. Likewise, please let us know of successful funding proposals and research collaborations which relied, at least in part, on Hyak for their success.

...