birchdb - Database for documentation distributed with BIRCH

estano.csh reads all FASTA-formatted nucleotide sequence files in the current working directory that have the file extension .fsa. Each file must contain a single sequence. For each file, blastx is run at NCBI using the blastcl3 client. The GI number of the best blastx hit (the one with the lowest E-value) is used to obtain additional information from the SeqHound server at SLRI [http://www.blueprint.org/seqhound/]. Output is written to a .csv file, which can be imported directly by most spreadsheet programs.
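
The input requirements above (files ending in .fsa, one sequence per file) can be checked before a run. The following is a minimal sketch in Python, not part of estano.csh; the function names count_fasta_records and check_fsa_files are hypothetical, and the check assumes that every FASTA record begins with a header line starting with ">".

```python
from pathlib import Path


def count_fasta_records(lines):
    """Count FASTA records by counting header lines, which start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


def check_fsa_files(directory="."):
    """Map each .fsa file name in `directory` to True if it holds exactly one record."""
    results = {}
    for path in sorted(Path(directory).glob("*.fsa")):
        with path.open() as handle:
            results[path.name] = count_fasta_records(handle) == 1
    return results
```

Calling check_fsa_files() in the working directory returns a dictionary keyed by file name; any file mapped to False contains zero or several sequences and would need to be split or corrected first.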

Each line of outfile.csv consists of comma-separated fields. The fields, in order, are:

(1) EST name:          The name of the EST in the .fsa input file.
(2) GI number:         The NCBI GI number of the top blastx hit.
(3) Taxonomy name:     The NCBI Taxonomy name, giving the genus and species, corresponding to the GI number.
(4) Protein name:      The NCBI protein name corresponding to the GI number.
(5) 3D Structure IDs:  The semicolon-separated list of 3D structure IDs retrieved through the SeqHound API,
                       corresponding to the GI number, with an E-value of 10¹¹ or higher.
(6) E-value:           The E-value of the top blastx hit.
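
For downstream scripting, the six fields listed above can be read back with a few lines of Python. This is only a sketch under stated assumptions: the file is assumed to have no header row, the dictionary keys in FIELDS are hypothetical names chosen for illustration, and the function parse_estano_rows is not part of BIRCH.

```python
import csv

# Hypothetical key names for the six documented fields, in file order.
FIELDS = ("est_name", "gi_number", "taxonomy_name",
          "protein_name", "structure_ids", "e_value")


def parse_estano_rows(lines):
    """Turn lines of estano output into a list of dictionaries.

    `lines` is any iterable of text lines, such as an open file.
    Lines that do not have exactly six fields are skipped.
    """
    records = []
    for row in csv.reader(lines):
        if len(row) != len(FIELDS):
            continue  # blank or malformed line
        record = dict(zip(FIELDS, (field.strip() for field in row)))
        # Field 5 is a semicolon-separated list; an empty field gives [].
        record["structure_ids"] = [
            item.strip()
            for item in record["structure_ids"].split(";")
            if item.strip()
        ]
        # Field 6 is numeric; keep None if it cannot be parsed.
        try:
            record["e_value"] = float(record["e_value"])
        except ValueError:
            record["e_value"] = None
        records.append(record)
    return records
```

A typical call would be parse_estano_rows(open("outfile.csv", newline="")). Using the csv module rather than splitting on commas keeps quoted fields intact, which matters if a protein name itself contains a comma.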