GENE IDENTIFICATION - Find the open reading frame


  1. Make sure your DNA sequence is open in a text editor window (e.g. Notepad or WordPad).
  2. Create a new document in another text editor window, to hold the results of your sequence translation.
  3. Select your DNA sequence by dragging over the entire sequence with the mouse. Choose Edit --> Copy to copy to the clipboard.
  4. Click on the link below to go to the translation tool at ExPASy.
  5. Choose Edit --> Paste to paste the sequence into the translation input window.
  6. Choose Output format: Compact ("M", "-", no spaces).
  7. Press the 'submit' button to translate the sequence. The program will translate in all three reading frames, on both strands, for a total of six translations. Copy the results into the new text editor window.
  8. Save your file by choosing File --> Save Page as. Save the file in TEXT format (.txt), NOT HTML (.html). Call the file 'translations.txt'. (see sample.translation).

TRANSLATE at EXPASY  http://www.expasy.org/tools/dna.html

Other translation programs:

TRANSLATE at the NIH http://bimas.dcrt.nih.gov:80/molbio/translate/
TRANSLATE at EBI  http://www2.ebi.ac.uk/translate/  
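
To make it clearer what these tools do, here is a minimal Python sketch of a six-frame translation. It is not the code used by any of the programs listed above. It assumes the standard genetic code and a plain DNA sequence (A, C, G, T), and it writes STOP codons as "-" as in the compact output format. The function names and the frame labels (+1 to +3 for the given strand, -1 to -3 for the reverse complement) are made up for this illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
# "-" marks a STOP codon, mirroring the compact output format.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY--CC-W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(dna):
    """Uppercase the sequence and drop whitespace, digits and other non-letters."""
    return "".join(ch for ch in dna.upper() if ch.isalpha())


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def translate_frame(dna, offset):
    """Translate starting at offset 0, 1 or 2; unrecognised codons become 'X'."""
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(dna):
    """Return a dict mapping a frame label to its translation."""
    forward = clean(dna)
    reverse = reverse_complement(forward)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate_frame(forward, offset)
        frames[f"-{offset + 1}"] = translate_frame(reverse, offset)
    return frames


# Tiny made-up sequence, just to show the shape of the output.
for label, protein in six_frame_translation("ATGGCCTAA").items():
    print(label, protein)
```

Any incomplete codon at the end of a frame is simply ignored in this sketch.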

What the results mean

We now have six different amino acid sequences, but only one can be right. Which one? An "open reading frame" (ORF) is a stretch of DNA that, read in a given frame, contains no STOP codons (TAA, TAG or TGA). In the compact output, each STOP codon appears as "-", so an ORF shows up as an uninterrupted run of amino acid letters. The correct ORF, the one used in nature, is usually the longest ORF. In the example, frames 1 and 2 are both fairly long. We also need to take into account that sequencing errors could introduce frameshifts, which would split one true ORF across two frames. Therefore, it would probably be best to do the database search (next page) with both amino acid sequences, to see which is the correct reading frame.
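
Comparing ORF lengths by eye can be tedious, so here is a short Python sketch that ranks frames by their longest stop-free stretch. It assumes the translations are in the compact format, with "-" for each STOP codon. The frame labels and the short example sequences are invented for illustration and are not taken from the sample file. Following the definition above, it does not require the ORF to begin with "M".

```python
def longest_open_stretch(protein):
    """Return the longest run of amino acids not interrupted by a STOP ('-')."""
    return max(protein.split("-"), key=len)


def rank_frames(translations):
    """Return (frame label, longest ORF length) pairs, longest first."""
    scored = [
        (label, len(longest_open_stretch(protein)))
        for label, protein in translations.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical compact translations for three frames.
example = {
    "frame 1": "MKT-AL",
    "frame 2": "GH-MAAAPLK-R",
    "frame 3": "-S-",
}
for label, length in rank_frames(example):
    print(label, length)
```

If the top two frames come out similar in length, as in the example discussed above, that is a hint to carry both sequences forward to the database search.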
