TUTORIAL:
Oct. 21, 2014
This tutorial walks through simple sequence tasks using the
command-line program NUMSEQ. The next tutorial will show you how to
run NUMSEQ from the BioLegato graphical interface.
NUMSEQ documentation: $doc/fsap/numseq.txt
BACHREST documentation: $doc/fsap/rest.txt
Mac Users: To open a terminal window, go to
Applications --> Utilities --> Terminal.
{brassica:/home/plants/frist}cd
Do the next step only if $HOME/tutorials doesn't exist:
{brassica:/home/plants/frist}mkdir tutorials
Create a directory for this tutorial:
{brassica:/home/plants/frist}mkdir tutorials/sequence
{brassica:/home/plants/frist}cd $birch/tutorials/bioLegato/sequence
The location of BIRCH ($birch) is /home/psgendb in this example.
{brassica:/home/psgendb/tutorials/bioLegato/sequence}cp *.gen $HOME/tutorials/sequence
This copies the GenBank files to the new directory.
Return to the $HOME directory and verify that the new files and
directories are present:
{brassica:/home/psgendb/tutorials/bioLegato/sequence}cd
{brassica:/home/plants/frist}ls -l
drwx------ 1 frist drr 512 Oct 31 10:11 tutorials/
{brassica:/home/plants/frist}cd tutorials
{brassica:/home/plants/frist/tutorials}ls -l
drwx------ 3 frist drr 512 Oct 31 10:11 sequence/
{brassica:/home/plants/frist/tutorials}cd sequence
{brassica:/home/plants/frist/tutorials/sequence}ls -l
-rw------- 1 frist frist  5404 Oct 31 10:13 X52331.gen
-rw------- 1 frist frist 10739 Oct 31 10:13 PBI101TD.gen
-rw------- 1 frist frist  8278 Oct 31 10:13 pBSGUS.gen
-rw------- 1 frist frist  3674 Oct 31 10:13 PEACAB15.gen
Files with the .gen extension are in GenBank format. Since these
are ASCII text files, you can view them in any text editor. Double
clicking on a file in the file manager will bring up the file in
the default text editor for your bioLegato installation.
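Because GenBank files are plain text, the sequence can also be pulled out with a few lines of code. The following is a minimal Python sketch, not part of NUMSEQ or BIRCH. It assumes only the standard GenBank layout, in which the sequence follows a line beginning with ORIGIN and the entry ends with a line beginning with //. The function name read_genbank_sequence and the miniature entry are made up for illustration.

```python
import io


def read_genbank_sequence(handle):
    """Return the nucleotide sequence of the first entry in a GenBank flat file."""
    in_origin = False
    chunks = []
    for line in handle:
        if line.startswith("ORIGIN"):
            in_origin = True          # sequence lines start after this line
        elif line.startswith("//"):
            break                     # end of the entry
        elif in_origin:
            # Sequence lines look like "       61 gatcctccat ...";
            # keep only the letters, dropping position numbers and blanks.
            chunks.append("".join(ch for ch in line if ch.isalpha()))
    return "".join(chunks).upper()


# Made-up miniature entry for illustration (not one of the tutorial files).
EXAMPLE = """LOCUS       TOY                       24 bp    DNA     linear
ORIGIN
        1 atggcctaag gatccttaag ctga
//
"""

sequence = read_genbank_sequence(io.StringIO(EXAMPLE))
print(len(sequence), sequence)
```

To try it on one of the tutorial files, pass an open file instead of the StringIO object, for example open("PEACAB15.gen").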
NUMSEQ is a program for printing out, translating, and subcloning sequences. It runs at the command line. The main menu handles file input and output; output can go either to the screen or to a file. In this example, the output file is called PEACAB15.numseq to indicate that it contains output from NUMSEQ.
Example: Reading and printing PEACAB15.gen with NUMSEQ
The parameter menu controls how the sequence is printed. Type '4' in the main menu to bring up the Parameters menu.
Name: PEACAB15      Topology: LINEAR      Length: 822 nt
________________________________________________________________________________
Parameter     Description/Response                    Value
________________________________________________________________________________
 1)START      first nucleotide printed                1
 2)FINISH     last nucleotide printed                 822
 3)NUCCASE    U:(A,G,C,T...), l:(a,g,c,t...)          U
 4)STARTNO    number of starting nucleotide           1
 5)GROUP      number every GROUP nucleotides          10
 6)GPL        number of GROUPs printed per line       7
 7)WHICH      I: input strand O: opposite strand      I
 8)STRANDS    1: one strand, 2:both strands           1
 9)KIND       R:RNA D:DNA                             D
10)NUMBERS    Number the sequence (Y or N)            Y
11)NUCS       Print nucleotide seq. (Y or N)          Y
12)PEPTIDES   Print amino acid seq. (Y or N)          N
13)FRAMES     1 for this frame, 3 for 3 frames        1
14)FORM       L:3 letter amino acid, S: 1 letter      L
________________________________________________________________________________
Type number of parameter you wish to change (0 to continue)
By default, NUMSEQ will print out the entire sequence (from START
to FINISH) as a single strand (STRANDS) in 7 groups (GPL) of 10
nucleotides (GROUP) per line. To change parameters, type the
number of a parameter, and you will be prompted for a new value.
When you're ready to view the sequence with the new parameters,
type '0' at the prompt and '5' in the main menu to print the
sequence to the screen.
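To make the roles of GROUP, GPL and STARTNO concrete, here is a rough Python sketch of the grouping idea. It is not NUMSEQ's code, and the layout (one position number at the start of each line) is an assumption chosen for brevity; NUMSEQ's actual output may be laid out differently. The function name format_groups and the demo sequence are hypothetical.

```python
def format_groups(seq, group=10, gpl=7, startno=1):
    """Split seq into lines of `gpl` groups, each `group` nucleotides long.

    Each line is prefixed with the number of its first nucleotide.
    The layout is illustrative only and may differ from NUMSEQ's output.
    """
    per_line = group * gpl  # nucleotides per line (70 with the defaults)
    lines = []
    for offset in range(0, len(seq), per_line):
        chunk = seq[offset:offset + per_line]
        groups = [chunk[i:i + group] for i in range(0, len(chunk), group)]
        lines.append(f"{startno + offset:>6} " + " ".join(groups))
    return lines


demo = "ACGT" * 25  # made-up 100 nt sequence
lines = format_groups(demo)
for line in lines:
    print(line)
```

With the default values, the 100 nt demo sequence gives one full line of 7 groups and a second line, starting at position 71, holding the remaining 3 groups.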
To view both strands:
8) STRANDS: 2
To translate in 3 reading frames:
12) PEPTIDES: y
13) FRAMES: 3
5) GROUP: 15
6) GPL: 4
NUMSEQ breaks up the sequence into groups of nucleotides, numbering each group. For translation, GROUP must be divisible by 3, because translation is done in discrete codons of 3 bases each. GPL is set to 4 so that the output line will fit on a typical 80-character line.
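The idea of translating in three reading frames can be sketched in a few lines of Python. This is an illustration only, not how NUMSEQ is implemented. It assumes the standard genetic code, prints one-letter amino acid codes (like FORM set to S), and uses made-up names (translate, CODON_TABLE) and a made-up test sequence.

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG ordering:
# first base varies slowest, third base varies fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate seq starting at offset `frame` (0, 1 or 2).

    Stop codons are shown as '*'; codons with unknown bases as 'X'.
    Trailing bases that do not fill a codon are ignored.
    """
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


demo = "ATGGCCTAAGGATCC"  # made-up 15 nt sequence, i.e. one GROUP of 15
for frame in range(3):
    print(frame + 1, translate(demo, frame))
```

Because 15 is divisible by 3, a group of 15 nucleotides holds exactly 5 codons in frame 1, so no codon straddles a group boundary in that frame.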
To limit printing to only part of the sequence, e.g. bases 200 - 400:
1) START: 200
2) FINISH: 400
To view the opposite strand of the same region:
7) WHICH: o
1) START: 400
2) FINISH: 200
This example illustrates that creating an opposite strand requires two steps. First, we have to specify the strand as 'o' (opposite) rather than 'i' (input strand). This causes the bases to be complemented. However, if all we do is complement the input strand, then the opposite strand would be printed 3' to 5', because we would be starting at 200 and ending at 400. Therefore, START must be set to 400, and FINISH to 200.
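The two steps, complementing and reversing, can be sketched in Python as follows. This is an illustration under stated assumptions, not NUMSEQ's code: coordinates are 1-based and inclusive on the input strand, START is the larger number as in the example above, and the function name opposite_strand and the test sequence are made up.

```python
# Translation table that swaps each base for its complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def opposite_strand(seq, start, finish):
    """Return the opposite strand of a region, written 5' to 3'.

    start and finish are 1-based, inclusive positions on the input
    strand, with start >= finish (e.g. start=400, finish=200).
    """
    region = seq[finish - 1:start]             # bases finish..start of the input strand
    complemented = region.translate(COMPLEMENT)  # step 1: complement (reads 3' to 5')
    return complemented[::-1]                  # step 2: reverse, so it reads 5' to 3'


demo = "AAACCCGGGTTT"  # made-up 12 nt sequence
print(opposite_strand(demo, 6, 1))
```

For the region AAACCC, complementing alone gives TTTGGG, which is the opposite strand read 3' to 5'. Reversing it gives GGGTTT, the same strand read 5' to 3'.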