Installing and Searching
Local BLAST Databases

May  8, 2016

BIRCH 3.20 introduces automated tools for installing and maintaining copies of NCBI BLAST databases on your local machine. BIRCH includes a comprehensive set of tools for sequence analysis, which are run through the BioLegato family of graphical interfaces. As databases are installed, they are automatically integrated into BioLegato. As in previous versions of BIRCH, BLAST searches can also be run remotely at NCBI. BioLegato lets users move from one task to the next without saving and editing intermediate files.

Installing BLAST Databases

The BIRCH Administration Tool (birchadmin) makes it easy to install, update and delete local copies of BLAST databases. As illustrated at right, installation is as simple as choosing the FTP site from which you wish to download databases, and choosing the databases you wish to install, update or delete. Since downloads of large databases may take several hours, you can be notified by email when the install is complete.

To carry out these database management tasks, birchadmin calls a Python script, which can also be run at the command line.

To help you determine which of your databases need to be updated, and how much disk space you need, the BIRCH Administration Tool can generate reports showing important information such as sizes of currently installed databases, sizes of databases available for download, local disk space usage, and modification dates for both local and remote database files. To make it easier to calculate the amount of disk space needed or available, these reports pop up as spreadsheets.

Local Database report
Report on databases available for download
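The disk space check that these reports support can also be sketched in a few lines of Python. This is only an illustration, not part of BIRCH: the function names, the database names and the sizes are hypothetical, and the sketch conservatively assumes that a full new copy of each selected database must fit on disk before any old copy is removed.

```python
import shutil


def bytes_needed(remote_sizes, selected):
    """Sum the download sizes (in bytes) of the selected databases.

    remote_sizes: dict mapping database name -> size in bytes, as might be
    read from a report on databases available for download (hypothetical).
    """
    return sum(remote_sizes[name] for name in selected)


def fits_on_disk(remote_sizes, selected, path="."):
    """Return True if the selected databases fit in the free space at path.

    Assumption: the old copy of a database is not deleted until the new one
    is complete, so the full remote size is counted for updates as well.
    """
    free = shutil.disk_usage(path).free
    return bytes_needed(remote_sizes, selected) <= free


if __name__ == "__main__":
    # Made-up sizes, for illustration only.
    sizes = {"db_a": 40 * 10**9, "db_b": 300 * 10**6}
    print(bytes_needed(sizes, ["db_a", "db_b"]))
    print(fits_on_disk(sizes, ["db_b"]))
```

In practice the sizes would come from the spreadsheet reports described above rather than from a hand-written dictionary.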

Searching and Retrieval of Hits

The example below shows that local BLAST databases can be used seamlessly as part of the BIRCH system. The goal is to create a phylogenetic tree of plant defensins, proteins activated as part of the response of plant cells to fungal attack. A collection of defensin proteins from a number of plant species was retrieved using the BIRCH blncbi program, which retrieves sequences by keyword query (see blncbi). Additional proteins are found by running TBLASTN on the local Non-redundant Nucleotide database. Output pops up in two windows: the report (left) and blnfetch (center), which can retrieve BLAST hits directly to a new blprotein window. The dataset is further refined to give the final set of proteins. A multiple alignment of the proteins is done using MAFFT, followed by construction of a DNA alignment based on the MAFFT protein alignment. Finally, a phylogenetic tree is constructed using the method of Maximum Likelihood.
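For readers who want to see what the TBLASTN step looks like outside the graphical interface, the following is a minimal sketch that drives the NCBI BLAST+ tblastn program from Python. It is not necessarily how BioLegato invokes BLAST. It assumes that BLAST+ is on the PATH, that the local nucleotide database is named nt and can be found through the BLASTDB setting, and that the file names and E-value cutoff are placeholders chosen for illustration. The helper hit_ids mimics, in a very reduced form, the idea of collecting hit identifiers for later retrieval.

```python
import shutil
import subprocess


def tblastn_command(query_fasta, db="nt", out="tblastn_hits.tsv", evalue=1e-5):
    """Build a tblastn command line (protein query vs. nucleotide database).

    The database name, output file and E-value cutoff are illustrative
    defaults, not values taken from BIRCH.
    """
    return [
        "tblastn",
        "-query", query_fasta,
        "-db", db,
        "-evalue", str(evalue),
        "-outfmt", "6",  # tab-separated hit table
        "-out", out,
    ]


def run_tblastn(query_fasta, **kwargs):
    """Run tblastn, failing early if the program is not installed."""
    if shutil.which("tblastn") is None:
        raise RuntimeError("tblastn not found; is NCBI BLAST+ installed?")
    subprocess.run(tblastn_command(query_fasta, **kwargs), check=True)


def hit_ids(tabular_text):
    """Return unique subject IDs (column 2 of -outfmt 6), in first-seen order."""
    ids = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        subject_id = line.split("\t")[1]
        if subject_id not in ids:
            ids.append(subject_id)
    return ids
```

A call such as run_tblastn("defensins.fasta") would write the hit table to tblastn_hits.tsv, whose contents could then be passed to hit_ids to list the matching database entries.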

More in-depth information on installing and maintaining local BLAST databases can be found at

Please send suggestions or comments regarding this page to