BIRCH

High Performance Compute Nodes 
Nov. 21, 2018



Reference:  CC compute cluster

Users on the CCL system run desktop sessions on the three login servers, jupiter, mars and venus. There are 12 identical servers specifically designated for resource-intensive jobs, named cc01, cc02, cc03 ... cc12. The idea is that jobs requiring a lot of RAM, cores, or compute time should not slow down the desktop sessions on the three login servers. This tutorial describes how to use these compute nodes from your desktop session.

1. Set up your account to allow login to other CC machines without being prompted for a password.

While not required for using compute nodes, it is usually more convenient to log in by ssh with authentication by a pre-shared key, rather than a password.

ssh-keygen -t rsa

Create a public/private key pair. These files will be written to your $HOME/.ssh directory as id_rsa.pub and id_rsa, respectively.
If you already have id_rsa and id_rsa.pub in your .ssh directory, you can skip this step.
THIS STEP MAY TAKE A FEW MINUTES!

cd $HOME/.ssh
cat id_rsa.pub >> authorized_keys
cd

Copy the public key to your authorized_keys file, then return to your $HOME directory.

This works because you have the same userid and $HOME directory on all CCL machines. If you were trying to set up passwordless login to servers not on the cc.umanitoba.ca domain, the process would need a few additional steps.
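
The append step above adds another copy of the key each time it is repeated. As a minimal sketch, the step could be wrapped so that it is safe to run more than once. The function name authorize_key is hypothetical and is not part of BIRCH or OpenSSH.

```shell
# Hypothetical helper (not part of BIRCH or OpenSSH).
# Append a public key to an authorized_keys file only if it is not already there.
# Usage: authorize_key PUBKEY_FILE AUTHORIZED_KEYS_FILE
authorize_key() {
    pub=$1
    auth=$2
    [ -r "$pub" ] || { echo "no readable public key at $pub" >&2; return 1; }
    touch "$auth"
    # -q: quiet, -x: match whole lines, -F: fixed strings, -f: read patterns from file
    if ! grep -qxF -f "$pub" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    # With the default StrictModes setting, sshd rejects key files
    # that other users can write to, so keep the file private.
    chmod 600 "$auth"
}
```

It would be called as authorize_key $HOME/.ssh/id_rsa.pub $HOME/.ssh/authorized_keys. Running it a second time leaves the file unchanged.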

At this point you should be able to log in to any CCL host without a password, e.g.,

ssh cc07

should log you into cc07.cc.umanitoba.ca without prompting for a password. Note that logging into CCL hosts from systems outside the cc.umanitoba.ca domain will still require a password.

2. Find a compute node that is not currently under heavy use.

The load average gives an estimate of how busy a machine is. It is the average number of jobs that are running or waiting to run, taken over a recent time interval. As a rule of thumb, if the load average is less than the number of cores, the system should run efficiently. Performance will degrade as the load average increases beyond that. If you want to run a resource-intensive job, it is best to log in to a server with a low load average.
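
The rule of thumb can be written as a one-line check. This is only a sketch: the function name load_below_cores is hypothetical, and the caller supplies both the load average and the core count.

```shell
# Hypothetical helper: succeeds (exit status 0) when LOAD is less than CORES.
# Usage: load_below_cores LOAD CORES
load_below_cores() {
    # awk does the floating-point comparison; "+ 0" forces numeric context
    awk -v load="$1" -v cores="$2" 'BEGIN { exit !(load + 0 < cores + 0) }'
}

# Example for the machine you are logged into (Linux with GNU coreutils):
#   load_below_cores "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)" && echo "ok to run"
```

For example, load_below_cores 15.76 64 succeeds, while load_below_cores 70.2 64 fails.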

The rupcc command is a local command that lists the most recent load averages on cc.umanitoba.ca machines. There are quite a large number of machines, so to limit the output to just the 12 compute nodes, run the command as

rupcc

cc01                     up  43 days, 14:48,    load average: 0.07 0.06 0.03
cc02                     up  39 days, 21 mins,  load average: 0.17 0.11 0.09
cc03                     up  43 days, 14:30,    load average: 0.08 0.10 0.11
cc04                     up  38 days, 23:05,    load average: 0.09 0.10 0.09
cc05                     up  43 days, 14:47,    load average: 6.56 6.44 6.17
cc06                     up  39 days, 28 mins,  load average: 0.15 0.12 0.09
cc07                     up  43 days, 14:48,    load average: 0.18 0.10 0.22
cc08                     up  39 days, 21 mins,  load average: 0.01 0.06 0.07
cc09                     up  43 days, 14:48,    load average: 0.03 0.07 0.08
cc10                     up  39 days, 21 mins,  load average: 15.76 15.38 15.60
cc11                     up  43 days, 14:47,    load average: 0.00 0.03 0.06
cc12                     up  39 days, 21 mins,  load average: 0.07 0.12 0.09



The output lists the compute nodes, with the three most recent load averages in the three rightmost columns. The highest loads are on cc05 and cc10, so we will avoid those servers, although with 64 cores per server even those load averages are still well within the range for efficient performance. The other servers have very small load averages, so any of them would be preferred.

Because usage on the compute nodes changes over time, you should always run rupcc before you choose a host to log in to.
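
Choosing a host from the listing can be automated. The sketch below assumes that rupcc writes lines in exactly the format shown above to standard output, and that the first number after "load average:" is the most recent average. The function name least_loaded is hypothetical.

```shell
# Hypothetical helper: read rupcc-style lines on standard input and print
# the host whose first load average (assumed to be the most recent) is lowest.
least_loaded() {
    awk -F'load average: ' '
        NF == 2 {
            split($1, left, " ")      # left[1] is the host name
            split($2, loads, " ")     # loads[1] is the first load average
            if (best == "" || loads[1] + 0 < min) {
                min = loads[1] + 0
                best = left[1]
            }
        }
        END { if (best != "") print best }
    '
}
```

Under those assumptions, rupcc | least_loaded would print one host name. Fed the listing above, it prints cc11.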

3. Log in to the compute node, and run your program.

Once you have chosen a compute node, simply log in to the node using ssh.

In the following example, the user will log in to cc03 and launch bldna.

{venus:/home/psgendb}ssh -X cc03
Last login: Wed Sep 26 15:01:30 2018 from venus.cc.umanitoba.ca
                CNS Linux - Red Hat Enterprise Linux 6
IST Compute platform node.
Welcome to CCL Linux!  Please send comments/suggestions to:
       servicedesk@umanitoba.ca
{cc03:/home/psgendb}

The user's terminal window is currently on venus. The -X option in ssh -X enables X11 forwarding, which permits programs with graphical interfaces. Now any job launched within this terminal window will run on cc03, but windows will display on the user's venus desktop. (If you plan to run ONLY command line programs, you don't need -X in the ssh command.)
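
When X11 forwarding is active, ssh sets the DISPLAY environment variable in the remote session. A quick way to confirm this before launching a graphical program is sketched below. The function name have_display is hypothetical.

```shell
# Hypothetical helper: succeeds when DISPLAY is set and non-empty,
# which is the case in a session started with ssh -X.
have_display() {
    [ -n "${DISPLAY:-}" ]
}

# Example, typed on the compute node:
#   have_display || echo "No X display: log out and reconnect with ssh -X"
```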



{cc03:/home/psgendb}cd tutorials/bioLegato/database.sim

Optional: Go to the directory where you want to work. When you log into the compute node, you will be back in your home directory. If you want to work in a different directory, you must cd into that directory, as in the example above.

{cc03:/home/psgendb/tutorials/bioLegato/database.sim}bldna

Launch bldna.
Note that the header bar indicates that this BioLegato window is running on cc03.
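
The login, cd, and launch steps can also be given to ssh as a single remote command, in which case the connection closes by itself when the program exits. The sketch below is an assumption-laden alternative, not the procedure this tutorial describes: the function name run_on_node and the DRY_RUN variable are hypothetical, the directory must not contain spaces, and the program (here bldna) must be on the PATH of a non-interactive shell on the compute node, which may not be true for your account.

```shell
# Hypothetical helper: run one command on a compute node from a given directory.
# With DRY_RUN=1 it only prints the ssh command instead of running it.
# Usage: run_on_node NODE DIRECTORY COMMAND [ARGS...]
run_on_node() {
    node=$1
    dir=$2
    shift 2
    # The remote shell changes directory first, then starts the program
    remote="cd $dir && $*"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf "ssh -X %s '%s'\n" "$node" "$remote"
    else
        ssh -X "$node" "$remote"
    fi
}
```

With DRY_RUN=1 set, run_on_node cc03 tutorials/bioLegato/database.sim bldna prints the ssh command it would run, so you can check it before using it.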

4. Don't forget to log out when you're done!

It is critical not to waste machine resources on compute nodes. For that reason, when you are finished, end your BioLegato session and log out.


{cc03:/home/psgendb}logout
Connection to cc03 closed.
{venus:/home/psgendb}

The session returns to the original login host, e.g., venus, as indicated in the prompt.