RUNNING LARGE JOBS
1. What kinds of jobs tend to be CPU intensive?
Distance matrix methods (eg. Neighbor Joining) - require negligible
time; time increases roughly linearly with the number of sequences
Parsimony (eg. DNAPARS, PROTPARS) - moderately CPU intensive;
time increases exponentially with the number of sequences
Maximum likelihood (DNAML, PROTML, fastDNAML) - CPU intensive;
time increases according to a FACTORIAL function of the number of
sequences
Sequence database searches - time is proportional to the
product of sequence length and database size; use high k values
to speed up search; protein searches are faster than DNA.
Multiple sequence alignments - for cluster type methods, time
increases roughly linearly in proportion to the number of sequences
Retrievals of large numbers of sequences - time is proportional
to number of sequences
Any sorting operation with a large number of items
- efficiency depends on sort algorithm
SAS - some SAS jobs can take a long time
2. What kinds of jobs should never be CPU intensive?
If the following applications are eating up large
percentages of CPU time, they are not functioning normally.
bioLegato - bioLegato by itself does almost
nothing. One exception is when
reading in enormous sequence files eg. large numbers of sequences or
long sequences. This can take a few minutes. (Note that bioLegato will
always appear as 'java' in the output from 'ps' and from 'top'. This is
because bioLegato is a Java program, and runs in a Java Virtual Machine.)
Most user apps (eg. word processors, mailers)
Desktop tools (eg. GNOME desktop tools such as the gedit text editor)
Most Unix commands
Web browsers - For short bursts Firefox and Mozilla can be very
CPU intensive, but this should not persist more than a minute or two.
3. Some bad habits to avoid
Clicking repeatedly when an application hangs - All
this does is to make things worse. Most often applications slow down
because of network slowdowns, which you can't do anything about.
Logging out with a screenful of windows - Close
each window before logging out. Sometimes, apps don't terminate and
keep running even after you log out.
Running numerous CPU-intensive jobs at once
- Running 5 fasta jobs at once will simply cause all jobs to take 5
times as long to run. If the jobs use a lot of memory, they will run much
more slowly than that because they will be repeatedly swapping in and
out of memory. Run big jobs sequentially.
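One way to run big jobs sequentially is a small shell loop. The sketch below is only an illustration: run_sequentially is a hypothetical helper name, not a BIRCH command, and the echo commands are stand-ins for real job command lines.

```shell
# run_sequentially is a hypothetical helper, not part of BIRCH.
# Each argument is one complete command line. The next command starts
# only after the previous one has finished, so the jobs never compete
# with each other for CPU or memory.
run_sequentially() {
    for job in "$@"; do
        sh -c "$job" || echo "job failed: $job" >&2
    done
}

# Stand-in commands; replace these with your real job command lines.
run_sequentially "echo first job done" "echo second job done"
```

A failed job is reported on standard error and the loop carries on with the next one, which is usually what you want for an unattended batch.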
3. Start small and work your way up.
If you are working with a very CPU-intensive program
or a large number of sequences (eg. greater than 20) or both, you
should try to get an idea of how long your job will take. The BIRCH
version of bioLegato records the time used by most of the CPU-intensive
programs and appends it to the output (eg. .outfile, .fasta). For
example, a series of runs of DNAML4 with increasing numbers of genes
for PAL (phenylalanine ammonia lyase) gave the following times:
5 sequences - Execution times on goad: 4.0u 0.0s 0:05 79% 0+0k 0+0io 0pf+0w
10 sequences - Execution times on goad: 47.0u 0.0s 0:49 95% 0+0k 0+0io 0pf+0w
15 sequences - Execution times on goad: 144.0u 0.0s 2:31 95% 0+0k 0+0io 0pf+0w
The times listed left to right are:
User time - CPU time in seconds
used by the program eg. DNAML4
System time - CPU time in seconds used
by the system to run the program, usually negligible
Elapsed time - real time between
start and end of job.
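If you collect several timing lines, it can help to pull out just the three fields described above. The following is a minimal sketch, assuming the timing line has the layout shown in the DNAML4 examples; parse_times is a hypothetical helper name, not a BIRCH command.

```shell
# parse_times is a hypothetical helper, not part of BIRCH.
# It reads timing lines on standard input, finds the first field that
# looks like user time (a number followed by 'u'), and prints that
# field plus the two fields after it (system time, elapsed time).
parse_times() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^[0-9.]+u$/) {
                u = $i; s = $(i + 1); e = $(i + 2)
                sub(/u$/, "", u)    # strip the trailing "u"
                sub(/s$/, "", s)    # strip the trailing "s"
                print "user=" u, "system=" s, "elapsed=" e
                break
            }
        }
    }'
}

# The 10-sequence line from the example above.
echo "Execution times on goad: 47.0u 0.0s 0:49 95% 0+0k 0+0io 0pf+0w" | parse_times
```

For the line shown, this prints user=47.0 system=0.0 elapsed=0:49.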
4. Monitoring your jobs
Which jobs are eating up the most CPU
time on the machine I am currently logged into?
The top command gives you a real time picture of
the most CPU intensive jobs currently running on the server you are
logged into. Type 'top' at the command line. The display is updated
every few seconds:
load averages: 0.68, 0.39, 0.32 12:56:44
229 processes: 227 sleeping, 2 on cpu
CPU states: 71.2% idle, 26.2% user, 2.6% kernel, 0.0% iowait, 0.0% swap
Memory: 16G real, 12G free, 1578M swap in use, 21G swap free
  PID USERNAME THR PRI NICE  SIZE   RES STATE   TIME    CPU COMMAND
18950 umchan94   1   1    8   10M 9600K cpu/3   0:27 18.10% dnaml
18870 umchan94  21   1    0  394M  142M sleep   0:23  0.71% java
22575 kdc        3   1    0  158M  104M sleep  69:16  0.66% mozilla-bin
 9079 umchan94   1   1    0   93M   92M sleep 114:36  0.60% Xvnc
  786 root       2  59    0 4784K 3664K sleep 654:59  0.46% automountd
28905 umchan94   3  59    0  182M   94M sleep  46:17  0.16% mozilla-bin
18952 umchan94   1  59    0 5032K 2232K cpu/2   0:00  0.14% top
13432 groff      1   1    0   68M   32M sleep 508:21  0.13% mixer_applet2
14583 operac2    1  40    0   69M   33M sleep 467:07  0.13% mixer_applet2
 9170 umchan94   1   1    0   68M   33M sleep 209:16  0.13% mixer_applet2
16400 umchan94   2  44    0   80M   27M sleep   0:02  0.12% gnome-terminal
 9129 umchan94   1  27    0   63M   31M sleep   6:57  0.05% metacity
 9133 umchan94   1   1    0   97M   62M sleep  50:29  0.05% gnome-panel
20832 umchan94   2  41    0   89M   27M sleep   7:53  0.05% gnome-terminal
13430 groff      1  57    0   68M   11M sleep 131:59  0.04% gnome-netstatus
To quit, type 'q'.
load averages - CPU load averaged over three
time intervals (typically the last 1, 5 and 15 minutes). Even with lots
of users doing normal tasks, this is seldom greater than 1.0. CPU
intensive jobs like DNAML can push it much higher. Above a load average
of 4, system performance begins to suffer.
PID - process ID. This is the number you
need to know to kill a job.
USERNAME - who owns the job
NICE - governs the share of CPU time
a job can get when the machine is busy. Low NICE values are needed by
user apps such as Web browsers or word processors because things like
cursors and scrollbars need to respond instantly. Number crunching
programs should run at higher NICE values so that they don't impede the
overall performance of the system. bioLegato runs most CPU-intensive
programs with a suitably high nice value. (See man pages
for 'nice' and 'renice' commands).
SIZE - memory used by an application.
TIME - CPU time the job has used since it started. Most apps use
up far less than 1 minute. A Netscape session can use up several
minutes.
CPU - percentage of CPU time being
used. Note that DNAML eats up a lot of CPU time because it does a very
exhaustive set of calculations in constructing a phylogeny. The 'java'
job also shown above is actually bioLegato.
COMMAND - the command being run
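If you start a number crunching program yourself from the command line rather than through bioLegato, you can give it a higher NICE value with the standard 'nice' command. The sketch below is illustrative only: run_nicely is a hypothetical wrapper name, the increment of 10 is an arbitrary choice, and the echo command stands in for a real job.

```shell
# run_nicely is a hypothetical wrapper, not part of BIRCH.
# It runs whatever command it is given with the nice value raised by 10
# (an illustrative choice), so interactive programs keep priority.
run_nicely() {
    nice -n 10 "$@"
}

# Stand-in command; replace it with your real job command line.
run_nicely echo "long job finished"
```

A job started this way should show a higher value in the NICE column of top than the interactive programs around it.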
The top command has a lot of great options. For
example, you can sort jobs by memory used, or list only jobs
belonging to a given userid. You can even kill jobs directly in top.
Type 'man top' to read about them.
Which jobs are currently running under
my userid on the machine I am currently logged into?
The ps command with no arguments tells which
jobs are running in the current shell (the current window):
ps
PID TTY TIME CMD
28018 pts/19 0:00 csh
25266 pts/19 0:00 csh
28022 pts/19 0:02 gde
ps -u userid tells which jobs are running under a
given userid on the host you are logged into:
ps -u frist
PID TTY TIME CMD
25225 pts/16 0:00 dsdm
25252 ? 0:01 dtprinti
27938 ? 0:01 xman
25251 ? 0:01 clock
24552 pts/11 0:15 Xvnc
27919 pts/18 0:00 sh
25250 ?? 0:01 dtterm
24555 pts/11 0:00 Xsession
24643 ? 0:05 dtwm
25241 pts/18 0:00 dtsessio
28018 pts/19 0:00 csh
25249 ? 0:02 dtmail
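When the listing is long, a small filter can pick out the PID of the job you are looking for. The following is a minimal sketch, assuming 'ps' output with the four columns shown above; pids_for_command is a hypothetical helper name, and the sample rows are copied from the listing above.

```shell
# pids_for_command is a hypothetical helper, not a standard command.
# It reads 'ps' output on standard input, skips the header line, and
# prints the PID (first column) of every row whose last column (CMD)
# is exactly the name given as the first argument.
pids_for_command() {
    awk -v cmd="$1" 'NR > 1 && $NF == cmd { print $1 }'
}

# Sample rows from the listing above. On a live system you would pipe
# in the output of:  ps -u userid
printf '%s\n' \
    'PID TTY TIME CMD' \
    '25252 ? 0:01 dtprinti' \
    '27938 ? 0:01 xman' \
    '25249 ? 0:02 dtmail' | pids_for_command dtmail
```

For the sample rows, this prints 25249, the PID of dtmail.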
5. Killing unwanted jobs
You can only kill jobs belonging to you.
You can only kill jobs on the host machine to which you are
currently logged in.
To kill a single job
To kill a job, just type 'kill -9 PID'.
For example, to kill the dtmail mailer:
kill -9 25249
To kill multiple jobs
A list of jobs can be included in a single kill command:
kill -9 25249 25252 27938
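Signal 9 stops a job immediately, with no chance for it to save files or clean up. A gentler variation, sketched below, is to send the default signal first and use -9 only if the job is still there a few seconds later. This is an illustrative alternative, not the procedure described above: kill_gently is a hypothetical helper name, the 5 second wait is an arbitrary choice, and the sleep command stands in for an unwanted job.

```shell
# kill_gently is a hypothetical helper, not a standard command.
kill_gently() {
    target=$1
    # Send the default signal (TERM). If that fails, the job is already
    # gone or does not belong to us, so there is nothing more to do.
    kill "$target" 2>/dev/null || return 0
    tries=0
    while [ "$tries" -lt 5 ]; do
        # 'kill -0' sends no signal; it only checks the job still exists.
        kill -0 "$target" 2>/dev/null || return 0
        sleep 1
        tries=$((tries + 1))
    done
    # Still there after about 5 seconds: fall back to signal 9.
    kill -9 "$target" 2>/dev/null
    return 0
}

sleep 300 &        # stand-in for an unwanted job
kill_gently $!
```

On a real system you would call it with a PID taken from 'ps' or 'top', for example kill_gently 25249.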
If your terminal screen is frozen, kill jobs from another terminal
If your screen locks up, you can log
into the same host machine from another terminal and kill jobs from
there. If you see one job eating up a lot of CPU time, kill that first.
It is probably the one that caused the screen to freeze up in the first
place, and killing that job will usually free up the screen.