BIRCH
Local BLAST Databases
  Adding BLAST Databases

April 22, 2017


Databases can be added to your local BLAST database directory from the BIRCH Administration Tool (birchadmin), which can be launched from the BIRCH launcher using File --> birchadmin, or from the command line by typing 'birchadmin'.

In birchadmin, choose UpdateAddInstall --> Add files to Blast Database.

1) Choose the FTP site from which to download databases using the pull-down menu labeled "Download from". It is always best to choose the FTP site that is geographically closest to your location.

2) Choose databases to add
This menu lets you choose which databases to add. A [+] or [-] indicates, respectively, whether or not a database is already installed. By default, all databases are de-selected, indicated by "Do nothing." To add a database to the list of databases to install, choose Add.
If you select a database that is already listed as being installed, that database will not be downloaded.

Note that the databases are organized into four tabs: Nucleotide, Protein, High-throughput and Other. Most of the "High-throughput" databases are nucleic acid sequences from high-throughput projects, such as ESTs and sequence-tagged sites. The one exception at this writing is Environmental - Protein. The Other tab lists non-sequence databases, such as the taxonomy database, TaxDB. This is a small database that includes the taxonomy information BLAST needs to identify species in BLAST reports, so it is probably useful to install it.

3) Launch the job
For all but the smallest databases, it will take anywhere from a few minutes to several hours to download a database. To begin the download, click on Run. The job will run in the background, meaning that you can quit BioLegato and log out during the download if you wish. If your system is configured to send email, you will receive a log of download progress by email once all downloads are complete. The email will be sent to the BIRCH Administrator's email address, which was set when BIRCH was installed and which also appears on the BIRCH home page of your local copy of the documentation.
 
Note: The [+]/[-] indicators in the Add menu will not be updated to reflect the addition of new databases until the next time you launch birchadmin after the download is completed.


At the command line: suppose you want to add the Swissprot/Uniprot, vector and Taxonomy databases:

blastdbkit.py --add --dblist swissprot,vector,taxdb


blastdbkit.py --add --dblist all
Adds ALL databases from the remote FTP site. At this writing, that corresponds to about 850 GB!
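
If you script these downloads, the command line can be assembled from a list of database names. The following is only a minimal sketch: the function name build_add_command is hypothetical and is not part of BIRCH, and it assumes that blastdbkit.py is on your PATH and accepts --add and --dblist with a comma-separated list, as in the examples above.

```python
def build_add_command(databases, script="blastdbkit.py"):
    """Return the argument list for adding the named databases.

    Hypothetical helper, not part of BIRCH. It only builds the command;
    it does not run it.
    """
    # Drop empty entries and stray whitespace around names.
    names = [name.strip() for name in databases if name.strip()]
    if not names:
        raise ValueError("at least one database name is required")
    # --dblist takes one comma-separated argument with no spaces.
    return [script, "--add", "--dblist", ",".join(names)]


if __name__ == "__main__":
    argv = build_add_command(["swissprot", "vector", "taxdb"])
    print(" ".join(argv))
```

The returned list could be passed to subprocess.run(argv, check=True) to start the download. Building a list rather than a single string avoids shell quoting problems.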


Important notes on est and pdbnt databases

Reference: BLAST FTP Site

The est and pdbnt databases do not themselves contain sequences. Rather, they contain IDs of sequences in other databases.

est - Contains ACCESSION numbers for sequences in the est_human, est_mouse and est_others databases. In practice, the est database is a virtual database containing the union of those three databases.

pdbnt - Contains ACCESSION numbers for only those sequences in the nt database for which a 3D structure is found in PDB. Example: 1C2W_B - Chain B, 23s Rrna Structure Fitted To A Cryo-Electron Microscopic

When BLAST searches est or pdbnt, the search is redirected to the component databases. A search of est searches est_human, est_mouse and est_others. A search of pdbnt searches only those sequences in the nt database that also have a structure in PDB.

>>> If you install est, you MUST also install est_human, est_mouse and est_others.
>>> If you install pdbnt, you MUST also install nt.
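
The two rules above can be applied automatically before building a database list. This is a rough sketch only: the REQUIRED table and the with_required function are hypothetical names, not part of BIRCH, and they encode nothing beyond the two dependencies stated above.

```python
# Component databases that must accompany each virtual database,
# as stated in the notes above. Hypothetical table, not part of BIRCH.
REQUIRED = {
    "est": ["est_human", "est_mouse", "est_others"],
    "pdbnt": ["nt"],
}


def with_required(databases):
    """Return the list with any missing component databases added.

    Order is preserved and duplicates are not repeated.
    """
    result = []
    for name in databases:
        # Each database is followed by the components it depends on.
        for db in [name] + REQUIRED.get(name, []):
            if db not in result:
                result.append(db)
    return result


if __name__ == "__main__":
    print(",".join(with_required(["est", "pdbnt"])))
```

The resulting comma-separated string has the form that --dblist takes in the command-line examples.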





Please send suggestions or comments regarding this page to psgendb@cc.umanitoba.ca.