blastdbkit.py - Technical Description
update June 25, 2022
blastdbkit.py - manual page
blastdbkit.py - pydoc documentation
SUMMARY
blastdbkit.py is a Python script that performs the
tasks necessary to install and update a local copy of NCBI
BLAST-formatted databases. As described in the manual page,
blastdbkit.py can
- add new database files from an FTP mirror
- delete local database files
- update local database files by downloading from an FTP mirror
- generate spreadsheet reports of disk usage and file dates in both
  local and FTP databases, as a guide to management of a
  local copy of NCBI databases
Critical information regarding local copies of databases is found in
two BLASTDB.list files: the master copy in $BIRCH/admin/BLAST and
the local copy in $BIRCH/local/admin/BLAST. The local copy is
generated from the master copy, but field 4 of the local copy
contains '1' for databases that are installed locally, or '0' for
those that are not. This file is used to add a '+' or '-' to the
menus in birchadmin that perform Add, Delete or Update functions.
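The mapping from the field 4 flag to a menu marker can be sketched as follows. This is only an illustration: the field delimiter (a comma here) and the use of field 1 as the database name are assumptions, and menu_markers is a hypothetical helper, not a function taken from blastdbkit.py.

```python
def menu_markers(lines, delimiter=","):
    """Map each database name to '+' (installed) or '-' (not installed).

    Assumptions for illustration only: fields are separated by
    `delimiter`, field 1 is the database name, and field 4 holds
    '1' (installed locally) or '0' (not installed).
    """
    markers = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = [field.strip() for field in line.split(delimiter)]
        if len(fields) < 4:
            continue  # too few fields to carry an install flag
        markers[fields[0]] = "+" if fields[3] == "1" else "-"
    return markers


# Made-up lines in the assumed layout; fields 2 and 3 are placeholders.
example = ["nt,field2,field3,1", "nr,field2,field3,0"]
print(menu_markers(example))  # {'nt': '+', 'nr': '-'}
```

If the real BLASTDB.list uses a different separator, only the delimiter argument would need to change.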
After each Add, Delete or Update operation, the following operations
are performed:
- The local database is scanned to verify which databases are
  actually installed, and the results are written to
  local/admin/BLAST/BLASTDB.list.
- BLAST and FASTA menus for bldna and blprotein are updated to
  reflect the local copies of the BLAST databases. For each
  database, a .nam file is written to $BIRCH/dat/fasta,
  listing the database files for that database. For example,
  nt.nam contains the names of the nt database files nt.00,
  nt.01, nt.02 etc.
- An email message is sent, including the contents of
  $BLASTDB/blastdbkit.log, which lists the files installed
  or deleted, or whether a database was not installed
  because it was already up to date.
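The .nam step described above can be sketched as follows. This is a minimal illustration, not the code used by blastdbkit.py: the helper names volume_names and write_nam_file are hypothetical, and the layout of the .nam file (one volume name per line) is an assumption. The sketch relies only on the usual BLAST convention that volume files are named like nt.00.nhr, nt.00.nin and nt.00.nsq.

```python
import os
import re


def volume_names(dbname, filenames):
    """Return the volume names (e.g. nt.00, nt.01) found for dbname."""
    # Match '<dbname>.<digits>.' at the start of a file name.
    pattern = re.compile(r"^(" + re.escape(dbname) + r"\.\d+)\.")
    volumes = set()
    for name in filenames:
        match = pattern.match(name)
        if match:
            volumes.add(match.group(1))
    # Sort numerically by volume number so nt.100 follows nt.99.
    return sorted(volumes, key=lambda v: int(v.rsplit(".", 1)[1]))


def write_nam_file(dbname, filenames, outdir):
    """Write <dbname>.nam in outdir (assumed: one volume name per line)."""
    path = os.path.join(outdir, dbname + ".nam")
    with open(path, "w") as handle:
        for volume in volume_names(dbname, filenames):
            handle.write(volume + "\n")
    return path
```

For example, volume_names("nt", os.listdir(blastdb_dir)) would collect the nt volumes from a directory listing, where blastdb_dir is a placeholder for the local database directory.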
blastdbkit.py --reportlocal writes a spreadsheet report on the
locally-installed databases to $BLASTDB/localreport.tsv. This
file, which can be imported by any spreadsheet, lists the total
size of all files in each database, and the dates of the last update.
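A report of this kind can be sketched as follows. The code is an illustration under stated assumptions rather than the logic of blastdbkit.py: the function names and column headings are hypothetical, and the database name is assumed to be the part of each file name before the first '.' (for example 'nt' for nt.00.nhr).

```python
import csv
import os
import time
from collections import defaultdict


def summarize_local(dbdir):
    """Return {database: [total_bytes, latest_mtime]} for files in dbdir."""
    summary = defaultdict(lambda: [0, 0.0])
    for entry in os.scandir(dbdir):
        if not entry.is_file():
            continue
        # Assumption: the text before the first '.' is the database name.
        dbname = entry.name.split(".", 1)[0]
        info = entry.stat()
        summary[dbname][0] += info.st_size
        summary[dbname][1] = max(summary[dbname][1], info.st_mtime)
    return dict(summary)


def write_report(summary, path):
    """Write one tab-separated row per database: name, size, last update."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        # Column headings are hypothetical, chosen for this sketch.
        writer.writerow(["database", "total_bytes", "last_update"])
        for dbname in sorted(summary):
            total, mtime = summary[dbname]
            stamp = time.strftime("%Y-%m-%d", time.localtime(mtime))
            writer.writerow([dbname, total, stamp])
```

Note that this sketch counts every regular file in the directory, so log or report files stored alongside the databases would appear as extra rows unless they are filtered out.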
blastdbkit.py --reportftp writes a
spreadsheet report on database files at a remote FTP server
(e.g. ftp.ncbi.nlm.nih.gov) to $BLASTDB/ftpreport.tsv. This
file, which can be imported by any spreadsheet, lists the total
size of all compressed files in each database, an estimate of the
total uncompressed size, and the dates of the files on the FTP
server.
Please send suggestions or comments
regarding this page to psgendb@cc.umanitoba.ca