Using High Performance Compute Nodes

October 12, 2023

Users on the CCL system run desktop sessions on the four login servers: gaia, mars, mercury and venus. There are 12 identical servers specifically designated for resource-intensive jobs, named cc01, cc02, cc03 ... cc12. The idea is that jobs requiring a lot of RAM, cores, or compute time should not slow down the desktop sessions on the four login servers. This tutorial describes how to use these compute nodes from your desktop session.

1. Set up your account to allow login to other CC machines without being prompted for a password.

While not required for using compute nodes, it is usually easier to log in over ssh using key-based authentication rather than a password.

`ssh-keygen -t rsa`	Create a public/private key pair. These files will be written to your $HOME/.ssh directory as id_rsa.pub and id_rsa, respectively. If you already have id_rsa and id_rsa.pub in your .ssh directory, you can skip this step. When prompted for a passphrase, it's okay to simply press ENTER and continue. THIS STEP MAY TAKE A FEW MINUTES!
`cd $HOME/.ssh` `cat id_rsa.pub >> authorized_keys` `cd`	Append the public key to your authorized_keys file, then return to your $HOME directory.
This works because we have the same userid and $HOME directory on all CCL machines. If you were trying to set up passwordless login to servers not on the cc.umanitoba.ca domain, the process would need a few other steps.

At this point you should be able to log in to any CCL host without a password, e.g.
ssh cc07

should log you into cc07.cc.umanitoba.ca without prompting for a password.
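
The key setup steps above can be wrapped in a small function so that running them twice does not add the same key twice. This is a minimal sketch, not part of the CCL tooling: the function name `install_own_key` is hypothetical, and the permission settings reflect common OpenSSH expectations rather than anything stated for CCL hosts.

```shell
# Hypothetical helper: append your public key to authorized_keys only if it
# is not already there.
# Usage: install_own_key [ssh_dir]   (defaults to $HOME/.ssh)
install_own_key() {
    ssh_dir="${1:-$HOME/.ssh}"
    pub="$ssh_dir/id_rsa.pub"
    auth="$ssh_dir/authorized_keys"

    # The key pair must already exist (created with: ssh-keygen -t rsa).
    if [ ! -f "$pub" ]; then
        echo "no public key at $pub; run ssh-keygen -t rsa first" >&2
        return 1
    fi

    touch "$auth"
    # -F fixed string, -x whole line, -q quiet: append only when missing.
    if ! grep -qxF -- "$(cat "$pub")" "$auth"; then
        cat "$pub" >> "$auth"
    fi

    # Assumption: sshd is configured to reject keys kept in a directory or
    # file that other users can write to, as is common.
    chmod 700 "$ssh_dir"
    chmod 600 "$auth"
}
```

Calling `install_own_key` with no argument operates on `$HOME/.ssh`; passing a directory is mainly useful for trying the function out on a scratch copy first.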

Using Compute Nodes from off-campus

You can't log in directly to the ccxx compute nodes from off-campus, because they are only accessible through the campus network. Any of the following will work:

first connect to a login host (e.g. mars) using a command line tool such as PuTTY, and then ssh to the compute node
first connect to a login host using ThinLinc, and then ssh to the compute node
connect to the backbone using a VPN, then ssh to the compute node

As well, logging into CCL hosts from systems outside the cc.umanitoba.ca domain will still require a password.
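
For the first option, a reasonably recent OpenSSH command line client can make both hops in one command with its `-J` (jump host) option. The sketch below only builds and prints that command. The function name and the `someuser` account are hypothetical, and it assumes the login host allows itself to be used as a jump host, which this tutorial does not confirm.

```shell
# Hypothetical helper: print an ssh command that reaches a compute node
# through a login host, using OpenSSH's -J (jump host) option.
# Usage: jump_command USER LOGIN_HOST COMPUTE_NODE
jump_command() {
    user="$1"; login_host="$2"; node="$3"
    domain="cc.umanitoba.ca"
    printf 'ssh -J %s@%s.%s %s@%s.%s\n' \
        "$user" "$login_host" "$domain" "$user" "$node" "$domain"
}

# Print the command for review instead of running it.
# "someuser" is a placeholder account name.
jump_command someuser venus cc07
```

Because the connection starts outside the cc.umanitoba.ca domain, expect to be prompted for a password.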

2. Find a compute node that is not currently under heavy use.

The load average gives an estimate of how busy a machine is: it is the average number of jobs that are running or waiting to run. As a rule of thumb, if the load average is less than the number of cores, the system should run efficiently. Performance will degrade as the load average rises above that. If you want to run a resource-intensive job, it is best to log in to a server with a low load average.
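
The rule of thumb above can be checked on whichever Linux machine you are logged in to. This is a minimal sketch: the function name `load_below_cores` is hypothetical, and the script relies on `/proc/loadavg` and `nproc`, which are standard on Linux but are not mentioned in this tutorial.

```shell
# Hypothetical helper: succeed (exit status 0) when LOAD is below CORES.
# Usage: load_below_cores LOAD CORES
load_below_cores() {
    # awk handles the decimal comparison that plain sh arithmetic cannot.
    awk -v load="$1" -v cores="$2" 'BEGIN { exit !(load + 0 < cores + 0) }'
}

# On Linux, /proc/loadavg starts with the 1-minute load average and
# nproc prints the number of available cores.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    load=$(cut -d' ' -f1 /proc/loadavg)
    cores=$(nproc)
    if load_below_cores "$load" "$cores"; then
        echo "load $load is below $cores cores"
    else
        echo "load $load is at or above $cores cores"
    fi
fi
```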

The rupcc command is a local command that lists the most recent load averages on cc.umanitoba.ca machines. There are quite a large number of machines, so to limit the output to just the 12 compute nodes, run the command as

supcc

cc01 up 8 days, 2:19, load average: 15.75 15.91 16.33
cc02 up 17 days, 5:44, load average: 1.00 1.03 1.06
cc03 up 17 days, 5:44, load average: 1.05 1.08 1.12
cc04 up 17 days, 5:43, load average: 0.04 0.07 0.10
cc05 up 17 days, 5:43, load average: 11.07 9.49 8.82
cc06 up 17 days, 5:43, load average: 0.02 0.09 0.13
cc08 up 17 days, 5:43, load average: 20.58 20.83 20.95
cc10 up 17 days, 5:42, load average: 0.00 0.05 0.09
cc11 up 17 days, 5:08, load average: 0.00 0.08 0.09
cc12 up 17 days, 5:20, load average: 0.00 0.02 0.05

The output lists the compute nodes, with the three most recent load averages in the three rightmost columns. The highest loads are on cc01, cc05 and cc08, so we will avoid those servers. Even so, since each server has 80 cores, those load averages are still well within the range for efficient performance. The other servers have very small load averages, so any of those would be preferred.
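
Picking the least busy node from a listing like this can also be done with a short filter. The following is a sketch only: the function name `least_loaded_node` is hypothetical, and it assumes one host per line in the form `cc04 up 17 days, 5:43, load average: 0.04 0.07 0.10`, with the first of the three load averages used for the comparison.

```shell
# Hypothetical helper: read load listings on stdin and print the host with
# the lowest first load average (the first host wins on ties).
# Assumed line format: "cc04 up 17 days, 5:43, load average: 0.04 0.07 0.10"
least_loaded_node() {
    awk -F'load average: ' '
        NF == 2 {
            split($1, left, " ")     # left[1] is the host name
            gsub(",", "", $2)        # tolerate comma-separated averages
            split($2, avg, " ")      # avg[1] is the first load average
            if (best == "" || avg[1] + 0 < min) {
                min = avg[1] + 0
                best = left[1]
            }
        }
        END { if (best != "") print best }
    '
}
```

If the load listing command prints in that format, its output could be piped straight into the function, for example `rupcc | grep '^cc[0-9]' | least_loaded_node`. That pipeline is untested against the real command and is shown only to illustrate the idea.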

There is now a shortcut command called sshcc that automatically chooses a machine that isn't busy and logs you in.

{venus:/home/psgendb} sshcc
Last login: Sat Jul 25 17:09:18 2020 from venus.cc.umanitoba.ca
## University of Manitoba IST Linux - Red Hat Enterprise Linux 7
## IST Compute platform node
Welcome to CCL Linux! Please send comments/suggestions to: servicedesk@umanitoba.ca
{cc04:/home/psgendb}

3. Log in to the compute node, and run your program.

Once you have chosen a compute node, simply log in to the node using ssh.

In the following example, the user logs in to cc03 and launches bldna.
Alternatively, use the sshcc shortcut described above.

`{venus:/home/psgendb}ssh -X cc03` `Last login: Wed Sep 26 15:01:30 2018 from venus.cc.umanitoba.ca` `CNS Linux - Red Hat Enterprise Linux 6` `IST Compute platform node.` `Welcome to CCL Linux! Please send comments/suggestions to:` `servicedesk@umanitoba.ca` `{cc03:/home/psgendb}`	The user's terminal window is currently on venus. ssh -X enables X11 forwarding, which permits programs with graphical interfaces. Now any job launched within this terminal window will run on cc03, but windows will display on the user's venus desktop. (If you plan to run ONLY command line programs, you don't need -X in the ssh command.)
`{cc03:/home/psgendb}cd tutorials/bioLegato/database.sim`	Optional: Go to the directory where you want to work. When you log into the compute node, you will be back in your home directory. If you want to work in a different directory, you must cd into that directory. An example is shown at left.
`{cc03:/home/psgendb/tutorials/bioLegato/database.sim}bldna`	Launch bldna
	Note that the header bar indicates that this BioLegato window is running on cc03.
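
The login, cd and launch steps above could in principle be combined into a single ssh command. The sketch below only builds and prints such a command. The function name `remote_launch_command` is hypothetical, and it is an assumption that bldna can be found when ssh runs it non-interactively; if it cannot, use the interactive steps in the table instead.

```shell
# Hypothetical helper: print an ssh command that starts a program on a
# compute node from a given directory, with X11 forwarding enabled.
# Usage: remote_launch_command NODE DIR PROGRAM
# Assumes DIR and PROGRAM contain no single quotes.
remote_launch_command() {
    printf "ssh -X %s 'cd %s && %s'\n" "$1" "$2" "$3"
}

# Print the command for review instead of running it.
remote_launch_command cc03 tutorials/bioLegato/database.sim bldna
```

With this form the ssh connection closes when the program exits, so there is no separate shell on the compute node to log out of.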

4. Don't forget to logout when you're done!

It is critical not to waste machine resources on compute nodes. For that reason, when you are finished, end your BioLegato session and log out.

`{cc03:/home/psgendb}logout` `Connection to cc03 closed.` `{venus:/home/psgendb}`	The session returns to the original login host, e.g. venus, as indicated in the prompt.