BIRCH

TUTORIAL:

Annotating a sequence using NCBI Sequin


Oct. 21, 2014

NCBI Sequin - Getting Started with Sequin http://www.ncbi.nlm.nih.gov/projects/Sequin/gettingstarted.html
NCBI Sequin - Quick Guide http://www.ncbi.nlm.nih.gov/projects/Sequin/QuickGuide/sequin.htm


This tutorial walks through an example of using the NCBI Sequin program to annotate a sequence and submit it to GenBank.

In the previous tutorial, we created a synthetic plasmid construct containing a GUS gene from pBI101 cloned between the SacI and BamHI sites of pBluescriptKSm13+. This sequence is in a Pearson/FASTA file called 'pBSGUS.fsa'.
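Before importing the file into Sequin, it can be worth confirming that it holds a single FASTA record containing only nucleotide symbols. The sketch below is one way to do that in Python. The function names and the toy input are hypothetical and are not part of BIRCH or Sequin; for the real file, read 'pBSGUS.fsa' in place of the example string.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def non_iupac_symbols(sequence):
    """Return the set of characters that are not IUPAC nucleotide codes."""
    return set(sequence) - set("ACGTURYSWKMBDHVN")


# Hypothetical toy input; for the real file use open("pBSGUS.fsa").read().
example = ">demo toy sequence\nGGATCCaagctt\nGAGCTC\n"
records = read_fasta(example)
for name, seq in records:
    print(name, len(seq), sorted(non_iupac_symbols(seq)))
```

A file that is ready for import should yield exactly one record and an empty list of unexpected symbols.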

This tutorial annotates the two components of the synthetic sequence, along with the locations of the coding regions for the GUS gene and the ampicillin resistance gene (beta-lactamase) from the Bluescript vector. Annotation is critical because it documents precisely what you have done. The ability to reproduce results is as important in computational work as it is in the lab. GenBank format is the richest and most versatile sequence file format, and it is read by most sequence programs. Sequin automates the process of creating GenBank format files.

The menus in Sequin walk you step by step through the minimal information needed for a GenBank entry. Without going into every step, the overall series of events is as follows:

1. Start Sequin by typing 'sequin' at the command line. Alternatively, start Sequin through BIRCH by choosing DNA --> Sequin - send annotated sequences to genbank.

2. Choose "Start new submission"

3. Fill in the Submission, Contact, Authors and Affiliation forms (all required). On the Submission form, make sure to enter a title under "tentative title for the manuscript" before proceeding. Then continue to the next form.

4. A window entitled 'Sequence Format' will appear. Continue with the default settings.

5. On the page entitled Nucleotide, click on "Import Nucleotide FASTA".

A file selection window will appear. Select your .fsa file from the folder in which you saved it.

6. Once the .fsa file has been imported, you will be able to click on "Specify Topology". Set the topology to 'Circular'.

7. On the Organism page, click on "Add Organisms, Locations, and Genetic Codes".

In the new window that appears, type 'synthetic construct' under the heading Organism.

8. This is the minimum information needed to create a GenBank entry that can be used as a model for a restriction digest. Once the minimal information has been entered, follow the 'Next Page' links until a window pops up with the GenBank entry in it. (For other purposes, you may wish to annotate the locations of coding sequences and other features of interest. In a laboratory setting, if you were planning to submit the sequence to GenBank, the most critical things to annotate for a construct such as this are the precise sources of the component sequences, in the FEATURES table, along with a simple explanation in words in the DEFINITION line.)

9. To export your sequence to a GenBank file, choose File --> Export GenBank. Save your sequence as pBSGUS.gen.

A good introduction to Sequin, including screenshots, can be found at http://www.ncbi.nlm.nih.gov/Sequin/.
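As a quick check that the topology set in step 6 made it into the exported file, the LOCUS line of a GenBank flat file can be inspected for the word 'circular' or 'linear'. The following is a minimal sketch in Python. The function name is hypothetical, and the LOCUS line shown is a made-up toy example, not the contents of pBSGUS.gen. Some GenBank files omit the topology token entirely, in which case the function returns None.

```python
def locus_topology(genbank_text):
    """Return 'circular' or 'linear' from the LOCUS line, or None if absent."""
    for line in genbank_text.splitlines():
        if line.startswith("LOCUS"):
            tokens = line.split()
            for topology in ("circular", "linear"):
                if topology in tokens:
                    return topology
            # LOCUS line found, but it carries no topology token.
            return None
    return None


# Hypothetical toy record for illustration only; for the real file use
# open("pBSGUS.gen").read().
example = (
    "LOCUS       demo                      18 bp    DNA     circular SYN 01-JAN-2000\n"
    "DEFINITION  toy record.\n"
)
print(locus_topology(example))
```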

Test your GenBank file by reading it into bldna and running BACHREST. The BACHREST output should show that pBSGUS is circular, and that the BamHI and SacI sites are at positions 1 and 1892, respectively (pBSGUS.bachrest).
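To illustrate why the circular topology matters for a restriction search, the sketch below looks for the BamHI (GGATCC) and SacI (GAGCTC) recognition sequences in a sequence, optionally treating it as circular so that a site spanning the origin is still found. This is an independent Python illustration, not how BACHREST works internally. The function name and the toy sequence are hypothetical. The positions reported are the first base of the recognition sequence, which may differ from the numbering convention BACHREST uses. Both recognition sequences are palindromic, so searching one strand is sufficient.

```python
SITES = {"BamHI": "GGATCC", "SacI": "GAGCTC"}


def find_sites(sequence, site, circular=True):
    """Return 1-based start positions of `site` in `sequence`."""
    seq = sequence.upper()
    site = site.upper()
    # For a circular molecule, append the first len(site) - 1 bases so that
    # a site spanning the origin is also found.
    search = seq + seq[: len(site) - 1] if circular else seq
    positions = []
    start = search.find(site)
    while start != -1:
        positions.append(start + 1)
        start = search.find(site, start + 1)
    return positions


# Hypothetical toy sequence: the BamHI site wraps around the origin.
toy = "TCCAAGAGCTCAAGGA"
for name, site in SITES.items():
    print(name, "circular:", find_sites(toy, site, circular=True),
          "linear:", find_sites(toy, site, circular=False))
```

In the toy sequence the BamHI site is only detected when the sequence is treated as circular, which is the kind of difference the topology setting in the GenBank entry is meant to capture.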