Magicblast - Hints

magicblast

Rationale:

Magicblast creates a list of reads from a read file that match a user-created BLAST database. The list can then be used either to eliminate the matching reads from the original read file or to extract them. Magicblast can write a file of reads that match the database, a file of reads that do NOT match the database, or both. This can be useful, for example, if you are sequencing a eukaryotic genome and want to eliminate contaminating reads from prokaryotic symbionts whose DNA co-purified with the host.
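The idea of splitting a read file by a list of hits can be illustrated with a minimal Python sketch. It is not how magicblast or SeqKit are implemented; the function name split_reads and the toy read IDs are hypothetical and serve only to show the partitioning step.

```python
def split_reads(records, hit_ids):
    """Partition (read_id, record) pairs by membership in a set of hit IDs.

    Returns two lists: records whose ID is in hit_ids (matching) and
    records whose ID is not (mismatching). Input order is preserved.
    """
    matching, mismatching = [], []
    for read_id, record in records:
        if read_id in hit_ids:
            matching.append(record)
        else:
            mismatching.append(record)
    return matching, mismatching


# Toy data (hypothetical): three reads, one of which hit the database.
reads = [("r1", "ACGT"), ("r2", "GGCC"), ("r3", "TTAA")]
hits = {"r2"}
matching, mismatching = split_reads(reads, hits)
```

Here matching corresponds to the reads you would extract, and mismatching to the reads that remain after contaminants are removed.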

Where:

Run magicblast in the same folder as your read files. Before running magicblast, select the read files in blreads. Groups of files can be processed as paired-end files using File --> guesspairs.

Input:

Input read files may be gzipped or uncompressed. Both fastq and fasta files are supported. There MUST be a file extension to indicate which type of file is being read. For example, reads.fq, reads.fastq, reads.fq.gz and reads.fastq.gz are all acceptable names for a fastq file, whereas a name such as reads.gz will be flagged as an unknown type of file.
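The naming rule can be sketched in Python as follows. This is an illustration only, not the code blreads uses. The fastq extensions come from the examples above; the fasta extensions (.fa, .fasta) and the case-sensitive comparison are assumptions, and classify_read_file is a hypothetical name.

```python
FASTQ_EXTS = (".fq", ".fastq")   # from the examples in this page
FASTA_EXTS = (".fa", ".fasta")   # assumption: not listed in this page


def classify_read_file(name):
    """Return (file_type, gzipped) for a read file name.

    file_type is 'fastq', 'fasta' or 'unknown'. A trailing .gz is
    stripped before the type extension is examined.
    """
    gzipped = name.endswith(".gz")
    if gzipped:
        name = name[: -len(".gz")]
    if name.endswith(FASTQ_EXTS):
        return "fastq", gzipped
    if name.endswith(FASTA_EXTS):
        return "fasta", gzipped
    return "unknown", gzipped
```

With this sketch, reads.fq.gz is classified as a gzipped fastq file, while reads.gz has no type extension left once .gz is removed and is therefore reported as unknown.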

Database - This may be either a BLAST database created using makeblastdb, or a fasta file. Searching a BLAST database is much faster than searching a fasta file. A BLAST database consists of several files, e.g. Zmays-mito.ndb, Zmays-mito.nhr, Zmays-mito.nin, etc. Selecting any one of the files belonging to a database will select that database.
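One way to picture how any component file can stand for the whole database is the Python sketch below. It is an assumption about the logic, not the actual blreads code: database_name is a hypothetical helper, and DB_EXTS is only a partial list of nucleotide BLAST database extensions. Multi-volume databases are not handled.

```python
import os

# Assumption: a partial list of extensions written by makeblastdb
# for nucleotide databases.
DB_EXTS = {".ndb", ".nhr", ".nin", ".nsq", ".not", ".ntf", ".nto", ".njs"}


def database_name(path):
    """Map a selected file to the database name passed to the search.

    A BLAST database component file is reduced to its common base name;
    anything else (e.g. a fasta file) is returned unchanged.
    """
    base, ext = os.path.splitext(path)
    if ext in DB_EXTS:
        return base
    return path
```

Under these assumptions, selecting Zmays-mito.ndb, Zmays-mito.nhr or Zmays-mito.nin all resolve to the same database name, Zmays-mito.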

Magicblast Parameters:

See magicblast manual page.

SeqKit grep Parameters:

Extract reads to new file(s)?

Optionally, you can ask magicblast to create files containing subsets of reads:

None
write a matching reads file - Reads that match the database will be written to a file whose name combines the read file name with the name of the database, plus a + to indicate matching. E.g., if the read file name is M400.fastq.gz and the database is Zmays-mito, the file of matches will be M400.Zmays-mito+.fastq.gz
write a mismatching reads file - Reads that do NOT match the database will be written to a file whose name uses a - to indicate mismatching, e.g. M400.Zmays-mito-.fastq.gz
both - both + and - files are written
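The naming convention for the subset files can be sketched in Python. This is only an illustration of the pattern described above. The function name subset_names is hypothetical, and the list of recognised extensions (in particular the fasta ones) is an assumption.

```python
# Assumption: extensions recognised when splitting the read file name.
READ_EXTS = (
    ".fastq.gz", ".fq.gz", ".fastq", ".fq",
    ".fasta.gz", ".fa.gz", ".fasta", ".fa",
)


def subset_names(read_file, database):
    """Return (matching_name, mismatching_name) for a read file and database.

    The database name and a + or - are inserted between the base name
    of the read file and its extension.
    """
    for ext in READ_EXTS:
        if read_file.endswith(ext):
            stem = read_file[: -len(ext)]
            break
    else:
        raise ValueError(f"unknown read file type: {read_file}")
    return f"{stem}.{database}+{ext}", f"{stem}.{database}-{ext}"


plus_name, minus_name = subset_names("M400.fastq.gz", "Zmays-mito")
```

For the example in the list above, plus_name is M400.Zmays-mito+.fastq.gz and minus_name is M400.Zmays-mito-.fastq.gz.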

Output:

Name for output directory - This defaults to the current directory, which is usually the right choice because the output from magicblast is used to extract reads from files in this directory.

The list of files in the current directory will be refreshed in blreads.