
In the instructions below, abc is your user ID and xyz is your group.


(1) Run the command below on mox to get the jobID of the job whose memory and CPU usage you want to monitor.

squeue -u abc
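
If you have several jobs queued, the default squeue output can be wide. The sketch below wraps the same command in a small helper that prints only the job ID, job name, state and elapsed time. The function name list_my_jobs is made up for illustration; -h (no header) and -o (output format) are standard squeue options.

```shell
# Hypothetical helper: list job ID, name, state and elapsed time for one user.
#   -h  suppresses the header line
#   -o  selects columns: %i = job ID, %j = job name, %T = state, %M = time used
list_my_jobs() {
    squeue -u "$1" -h -o "%i %j %T %M"
}

# Usage (abc is the placeholder user ID used on this page):
#   list_my_jobs abc
```

The first column of each output line is the jobID to enter in step (3).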


(2) On your desktop or laptop on campus, log in to the website below:
https://job-profiling.hyak.uw.edu/

If you are accessing the above website from an off-campus location, first log in to Husky OnNet.


(3) Enter the jobID from step (1) in the job text box in the upper left corner of the job profiling web-page and then click on some other part of the web-page.


(4) Select "All" in each of the drop-down menus for host, step and task in the upper left corner of the job profiling web-page.


(5) The graphs will refresh automatically at the default refresh rate. You can change the refresh rate by clicking on the "Refresh every" button in the upper right corner. A sub-window will pop up. Go down to the bottom of the sub-window. Click on the drop-down menu next to "Apply". Select the refresh rate (e.g. 10s) and click "Apply". You will see the new refresh rate in the upper right corner.


(6) The graphs initially show the default time range. You can change the time range by clicking on the "Refresh every" button in the upper right corner. A sub-window will pop up. There are various choices under the "Quick Range" section of the sub-window. Click on a time range (e.g. Last 3 hours) and click "Apply". You will see the new time range in the upper right corner.


(7) There are many graphs on the job profiling web-page. The most important ones for most users are the CPU Utilization and the memory usage graphs. If your program does a lot of reading from and writing to disk then you may also be interested in the data written/read graphs.

For the CPU utilization graph, the y-axis is in units of one core. Hence, if your program is using all the cores, the plot line should be near 28 on the 28-core mox nodes, near 32 on the 32-core mox nodes, and near 40 on the 40-core mox nodes. (Your program may not be able to use all the cores, because it may reach the node's maximum memory limit while using fewer cores.)
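
To judge how much of a node your job is using, you can convert the value read off the CPU utilization graph into a percentage of the node's cores. The snippet below is a minimal sketch of that arithmetic; the function name core_utilization_pct is made up for illustration, and the two arguments are the plot value and the core count of your node type.

```shell
# Hypothetical helper: percentage of a node's cores in use.
#   $1 = value read from the CPU utilization graph (in cores)
#   $2 = number of cores on the node (28, 32 or 40 on mox)
core_utilization_pct() {
    awk -v used="$1" -v total="$2" 'BEGIN { printf "%.0f\n", 100 * used / total }'
}

core_utilization_pct 14 28   # prints 50
```

A plot line near 14 on a 28-core node therefore means the job is keeping about half of the cores busy.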

Limitations:

(a) Note that the graphs are not plotted in real time. There may be a time lag of a few minutes between an event happening on a node (e.g. an increase in memory use) and the event showing up on the graph.

(b) For certain types of jobs (especially multi-node jobs), the graphs may not contain all the relevant data. There may be missing nodes/steps/tasks.
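
When the graphs are incomplete, the Slurm accounting command sacct may offer a rough cross-check. The sketch below assumes that job accounting is enabled on the cluster, which this page does not confirm. The function name job_usage_summary is made up for illustration; the field names are standard sacct format fields.

```shell
# Hypothetical helper: per-step summary from Slurm accounting.
# Assumes sacct is available and accounting is enabled (not confirmed here).
#   $1 = jobID from step (1)
# MaxRSS is the peak memory of a step; TotalCPU is the CPU time consumed.
job_usage_summary() {
    sacct -j "$1" --format=JobID,JobName,NNodes,Elapsed,TotalCPU,MaxRSS
}

# Usage (1234567 is a made-up jobID):
#   job_usage_summary 1234567
```

These figures are summaries rather than time series, so they complement the graphs instead of replacing them.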


Further Information:

For more details, please see the link below:

https://wiki.cac.washington.edu/display/hyakusers/Mox+Job+Profiling

