Mapping of GDE file formats to GDE Sequence Wrapper

| Field | Value | GDE flat | GDE | GenBank |
|---|---|---|---|---|
| File extension | | .flat | .gde | .gen, .gb, .gp |
| Name | non-blank string | first line | name | LOCUS |
| Accession | non-blank string | | sequence-ID | ACCESSION |
| GI | integer | | | VERSION |
| Description | text | | descrip | DEFINITION |
| Type | DNA, RNA, protein, text, or mask | read from the first character of each name line | type | LOCUS |
| Topology | linear or circular | defaults to linear | circular (0 or 1) | LOCUS |
| Annotation | text | | comment | all lines from LOCUS to ORIGIN |
| Sequence | text | lines read after a name line, terminating with the next name line | sequence | lines read after the ORIGIN line, terminating with a "//" line |
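
The GDE flat column of the mapping can be sketched in Python as follows. This is only an illustration, not the wrapper's actual design: `SequenceRecord`, `read_gde_flat`, and `write_gde_flat` are hypothetical names, and the type characters (`#`, `%`, `"`, `@`) are an assumption about the flat format that should be checked against the GDE manual. Because `#` is assumed to cover both DNA and RNA, the sketch maps it to DNA.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed type characters for GDE flat name lines; '#' is taken to cover both
# DNA and RNA, so it is mapped to "DNA" here purely for illustration.
FLAT_TYPE_CHARS = {"#": "DNA", "%": "protein", '"': "text", "@": "mask"}


@dataclass
class SequenceRecord:
    """Hypothetical container with one attribute per field in the mapping."""
    name: str
    sequence: str = ""
    accession: Optional[str] = None
    gi: Optional[int] = None
    description: str = ""
    type: str = "DNA"
    topology: str = "linear"  # GDE flat has no topology field, so default
    annotation: str = ""


def read_gde_flat(text: str) -> List[SequenceRecord]:
    """Parse GDE flat text into records, one per name line."""
    records: List[SequenceRecord] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line[0] in FLAT_TYPE_CHARS:
            # A name line starts a new record and ends the previous one.
            records.append(SequenceRecord(name=line[1:].strip(),
                                          type=FLAT_TYPE_CHARS[line[0]]))
        elif records:
            # Sequence lines are appended to the most recent record.
            records[-1].sequence += line
    return records


def write_gde_flat(records: List[SequenceRecord]) -> str:
    """Serialize records back to flat text (DNA and RNA both use '#')."""
    char_for = {"DNA": "#", "RNA": "#", "protein": "%",
                "text": '"', "mask": "@"}
    return "".join(f"{char_for[r.type]}{r.name}\n{r.sequence}\n"
                   for r in records)
```

Fields that the flat format does not carry (accession, GI, description, annotation) simply keep their defaults, which mirrors the empty cells in the GDE flat column.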

Notes:

1. Each file may contain one or more sequences.
2. We need to be able to read, write, and programmatically manipulate each of these fields. The latter means
 that we need to be able to
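
As a rough illustration of reading a multi-sequence file, the GenBank side of the mapping might be handled as sketched below. The function names `read_genbank` and `_build` are hypothetical, plain dictionaries stand in for whatever record type the wrapper ends up using, and only the name, accession, topology, annotation, and sequence fields are extracted. The sketch assumes well-formed input in which each record starts with a LOCUS line and ends with a "//" line.

```python
import re
from typing import Dict, List


def _build(header: List[str], seq: List[str]) -> Dict[str, str]:
    """Assemble one record from its header lines and sequence lines."""
    rec = {
        "name": "",
        "accession": "",
        "topology": "linear",
        "annotation": "\n".join(header),  # all lines from LOCUS to ORIGIN
        "sequence": "".join(seq),
    }
    for line in header:
        parts = line.split()
        if line.startswith("LOCUS") and len(parts) > 1:
            rec["name"] = parts[1]
            if "circular" in parts:
                rec["topology"] = "circular"
        elif line.startswith("ACCESSION") and len(parts) > 1:
            rec["accession"] = parts[1]
    return rec


def read_genbank(text: str) -> List[Dict[str, str]]:
    """Split GenBank text on '//' lines and extract the mapped fields."""
    records: List[Dict[str, str]] = []
    header: List[str] = []  # lines from LOCUS up to (not including) ORIGIN
    seq: List[str] = []     # sequence lines after ORIGIN
    in_seq = False
    for line in text.splitlines():
        if line.startswith("//"):
            # '//' terminates the current record.
            if header:
                records.append(_build(header, seq))
            header, seq, in_seq = [], [], False
        elif line.startswith("ORIGIN"):
            in_seq = True
        elif in_seq:
            # Drop the position numbers and spacing, keep only residues.
            seq.append(re.sub(r"[\d\s]", "", line))
        elif header or line.startswith("LOCUS"):
            header.append(line)
    return records
```

Writing and field-level manipulation would need matching serializers; they are left out here to keep the sketch short.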

GDE2.2 Manual
GenBank Release Notes
Sample GenPept protein entry: P13240