BIRCH

High Performance Compute Nodes 
September 15, 2021



Reference:  CC compute cluster

Users on the CCL system run desktop sessions on the four login servers: jupiter, mars, venus and neptune. There are 14 identical servers specifically designated for resource-intensive jobs, named cc01, cc02, cc03 ... cc14. The idea is that jobs requiring a lot of RAM, cores, or compute time should not slow down the desktop sessions on the four login servers. This tutorial describes how to use these compute nodes from your desktop session.

1. Set up your account to allow you to log in to other CC machines without being prompted for a password.

While not required for using compute nodes, it is usually easier to log in by ssh using key-based authentication rather than a password.

ssh-keygen -t rsa

Create a public/private key pair. These files will be written to your $HOME/.ssh directory as id_rsa.pub (public key) and id_rsa (private key). If you already have id_rsa and id_rsa.pub in your .ssh directory, you can skip this step.

When prompted for a passphrase, it's okay to simply press ENTER and continue.

THIS STEP MAY TAKE A FEW MINUTES!

cd $HOME/.ssh
cat id_rsa.pub >> authorized_keys
cd

The cat command appends the public key to your authorized_keys file. The final cd returns you to your $HOME directory.
This works because we have the same userid and $HOME directory on all CCL machines. If you were trying to set up passwordless login to servers not on the cc.umanitoba.ca domain, the process would need a few more steps.
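Repeating the append step above adds a duplicate line to authorized_keys each time. As a minimal sketch (the function name authorize_own_key is hypothetical and not part of the CCL setup), the following shell function appends the key only if it is missing. It also tightens the file permissions, on the assumption that sshd runs with its default StrictModes setting, under which key files writable by other users are ignored.

```shell
# Append the public key to authorized_keys only if it is not already there.
# $1 = directory holding the key files (defaults to $HOME/.ssh)
authorize_own_key() {
    sshdir=${1:-$HOME/.ssh}
    pub=$sshdir/id_rsa.pub
    auth=$sshdir/authorized_keys

    # Nothing to do without a public key
    [ -f "$pub" ] || { echo "no public key at $pub; run ssh-keygen first" >&2; return 1; }

    touch "$auth"
    # -x = whole-line match, -F = fixed string; skip the append if the key is present
    grep -qxF -- "$(cat "$pub")" "$auth" || cat "$pub" >> "$auth"

    # Assumption: sshd uses StrictModes, so keep these private to the owner
    chmod 700 "$sshdir"
    chmod 600 "$auth"
}
```

Called with no argument, authorize_own_key would act on $HOME/.ssh.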

At this point you should be able to log in to any CCL host without a password. For example,

ssh cc07

should log you in to cc07.cc.umanitoba.ca without prompting for a password.

Using Compute Nodes from off-campus

You can't log in directly to the ccxx compute nodes from off-campus. They are only accessible through the campus network. Any of the following will work:
  • first connect to a login host (e.g. mars) using a command line tool such as PuTTY, and then ssh to the compute node
  • first connect to a login host using ThinLinc, and then ssh to the compute node
  • connect to the backbone using a VPN, then ssh to the compute node.
As well, logging into CCL hosts from systems outside the cc.umanitoba.ca domain will still require a password.
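Where the off-campus machine has an OpenSSH client of version 7.3 or later, the first option can be collapsed into a single command using the -J (ProxyJump) option of ssh. The sketch below only prints that command. The function name jump_command is hypothetical, psgendb is the example userid used in this tutorial, and the full host name mars.cc.umanitoba.ca is assumed by analogy with the other cc.umanitoba.ca hosts.

```shell
# Print the ssh command that reaches a compute node through a login host.
# $1 = userid, $2 = login host, $3 = compute node
# Assumption: all hosts live under cc.umanitoba.ca
jump_command() {
    printf 'ssh -J %s@%s.cc.umanitoba.ca %s@%s.cc.umanitoba.ca\n' "$1" "$2" "$1" "$3"
}

jump_command psgendb mars cc07
```

Because such a connection originates outside the cc.umanitoba.ca domain, expect to be prompted for a password.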

2. Find a compute node that is not currently under heavy use.

The load average gives an estimate of how busy a machine is: it is the average number of jobs that are running or waiting to run. As a rule of thumb, if the load average is less than the number of cores, the system should run efficiently. Performance will degrade as the load average rises beyond that. If you want to run a resource-intensive job, it is best to log in to a server with a low load average.
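The rule of thumb above can be expressed as a small test. This is only a sketch: the function name load_below_cores is hypothetical, and awk is used because the shell itself cannot compare decimal numbers.

```shell
# Succeed (exit status 0) when the load average is below the core count.
# $1 = load average, $2 = number of cores
load_below_cores() {
    awk -v load="$1" -v cores="$2" 'BEGIN { exit !(load + 0 < cores + 0) }'
}

# Assumption: on a Linux node, the inputs could come from /proc/loadavg and nproc:
#   load_below_cores "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)" && echo "not saturated"
```

For example, load_below_cores 15.75 64 succeeds, while load_below_cores 70.2 64 fails.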

The rupcc command is a local command that lists the most recent load averages on cc.umanitoba.ca machines. There are quite a large number of machines, so to limit the output to just the 14 compute nodes, run the command as

rupcc

cc01                     up   8 days,  2:19,    load average: 15.75 15.91 16.33
cc02                     up  17 days,  5:44,    load average: 1.00 1.03 1.06
cc03                     up  17 days,  5:44,    load average: 1.05 1.08 1.12
cc04                     up  17 days,  5:43,    load average: 0.04 0.07 0.10
cc05                     up  17 days,  5:43,    load average: 11.07 9.49 8.82
cc06                     up  17 days,  5:43,    load average: 0.02 0.09 0.13
cc08                     up  17 days,  5:43,    load average: 20.58 20.83 20.95
cc10                     up  17 days,  5:42,    load average: 0.00 0.05 0.09
cc11                     up  17 days,  5:08,    load average: 0.00 0.08 0.09
cc12                     up  17 days,  5:20,    load average: 0.00 0.02 0.05
cc13                     up  17 days,  5:48,    load average: 0.06 0.03 0.05
cc14                     up  17 days,  5:50,    load average: 0.05 0.03 0.05


The output lists the compute nodes, with the three most recent load averages (conventionally taken over the last 1, 5 and 15 minutes) in the three rightmost columns. The highest loads are on cc01, cc05 and cc08, so we will avoid those servers, although with 64 cores per server even those load averages are still well within the range for efficient performance. The other servers have very small load averages, so any of those would be preferred.
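Choosing a node from this listing can also be done mechanically. The following is a minimal sketch, assuming output in exactly the format shown above. The function name least_loaded is hypothetical, and the first of the three load columns is assumed to be the most recent one. It reads the listing on standard input and prints the node with the lowest value in that column, taking the first node listed when there is a tie.

```shell
# Print the host with the smallest value in the first load-average column.
# Reads rupcc-style lines on standard input.
least_loaded() {
    awk '/load average:/ {
        # The three load averages are the last three fields; use the first of them
        load = $(NF - 2) + 0
        if (best == "" || load < min) { min = load; best = $1 }
    }
    END { if (best != "") print best }'
}

# Typical use on a CCL login host:
#   rupcc | least_loaded
```

Applied to the listing above, this would print cc10.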

There is now a shortcut command called sshcc that automatically chooses a machine that isn't busy and logs you in.

{venus:/home/psgendb} sshcc
Last login: Sat Jul 25 17:09:18 2020 from venus.cc.umanitoba.ca
##      University of Manitoba IST Linux - Red Hat Enterprise Linux 7
## IST Compute platform node
Welcome to CCL Linux!  Please send comments/suggestions to:
       servicedesk@umanitoba.ca

{cc04:/home/psgendb}

3. Log in to the compute node and run your program.

Once you have chosen a compute node, log in to that node using ssh (as an alternative to sshcc).

In the following example, the user logs in to cc03 and launches bldna.

{venus:/home/psgendb}ssh -X cc03
Last login: Wed Sep 26 15:01:30 2018 from venus.cc.umanitoba.ca
                CNS Linux - Red Hat Enterprise Linux 6
IST Compute platform node.
Welcome to CCL Linux!  Please send comments/suggestions to:
       servicedesk@umanitoba.ca
{cc03:/home/psgendb}







The user's terminal window starts out on venus. The -X option of ssh enables X11 forwarding, which permits programs with graphical interfaces. After logging in, any job launched within this terminal window will run on cc03, but its windows will display on the user's venus desktop. (If you plan to run ONLY command line programs, you don't need -X in the ssh command.)
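When X11 forwarding is active, ssh sets the DISPLAY environment variable in the remote session. A small check like the following (the function name check_display is hypothetical) can confirm this before launching a graphical program.

```shell
# Report whether a graphical display is available in this session.
check_display() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "X11 display available: $DISPLAY"
    else
        echo "DISPLAY is not set; log out and reconnect with ssh -X" >&2
        return 1
    fi
}
```

For example, check_display && bldna would launch bldna only when a display is available.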



{cc03:/home/psgendb}cd tutorials/bioLegato/database.sim






Optional: Go to the directory where you want to work. When you log into the compute node, you will be back in your home directory. If you want to work in a different directory, you must cd into that directory, as in the example above.

{cc03:/home/psgendb/tutorials/bioLegato/database.sim}bldna

Launch bldna. Note that the header bar of the BioLegato window indicates that it is running on cc03.

4. Don't forget to log out when you're done!

It is critical not to waste machine resources on compute nodes. For that reason, when you are finished, end your BioLegato session and log out.

{cc03:/home/psgendb}logout
Connection to cc03 closed.
{venus:/home/psgendb}

The session returns to the original login host (e.g. venus), as indicated in the prompt.