bl_seqkit_sample.py

bl_seqkit_sample.py runs seqkit sample to create files with reads sampled from a larger file.

Where:

Run bl_seqkit_sample.py in the same folder as your read files.

Input:

RNAseq reads: Before opening this menu, select the RNA read files. For paired-end reads, first use File --> guesspairs.py to group the left and right read files into pairs; the paired files will pop up in a new blreads window. Then run SeqKit sample from that new blreads window.

file extension of input files (eg. .fastq.gz): Of the files selected, only those with a specific file extension will be processed. Seqkit sample works on both Fasta and Fastq files.

Fastq: Typical extensions for Fastq read files are .fastq, .fastq.gz, .fq or .fq.gz. If the extension includes .gz, the file is compressed, and output will be written to a compressed file.
Fasta: There are a large number of file extensions in use for Fasta files. These include .fasta, .fn, .fsa, .fsp, .wrp, and others. Typically, Fasta files are not compressed, because most programs dealing with Fasta files do not have a mechanism to read compressed files.
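The extension filter described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used by bl_seqkit_sample.py; the function names select_by_extension and is_gzipped are hypothetical.

```python
def select_by_extension(filenames, extension):
    """Keep only the selected files whose names end with the given extension."""
    return [name for name in filenames if name.endswith(extension)]


def is_gzipped(filename):
    """Treat a name ending in .gz as a compressed file."""
    return filename.endswith(".gz")


# Hypothetical selection from a blreads window
selected = select_by_extension(
    ["D104.fq.gz", "D105.fq.gz", "notes.txt", "D104.fastq"], ".fq.gz"
)
```

With the extension set to .fq.gz, only D104.fq.gz and D105.fq.gz would be processed, and both would be treated as compressed.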

prefix to prepend to file extension - Each output file will have the same name as the input file, with a prefix added before the file extension. For example, with the default value of _S5, if the input filename is D104.fq.gz, the output filename will be D104_S5.fq.gz. This prefix can be almost anything, but a few guidelines are suggested:

Begin the prefix with a special character such as "_" that will make it easier for other programs (eg. rename) to parse these filenames.
Keep the prefix short.
Indicate that the file contains a sample, eg. "S" is short for sample.
Include a number that indicates the percentage setting used to generate the output, eg. 5 stands for the default percentage of 5%.
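As a minimal sketch of the naming rule above, the following Python function inserts the prefix just before the file extension. The function name output_name is hypothetical, and the script itself may build the name differently.

```python
def output_name(input_name, extension, prefix="_S5"):
    """Insert the prefix between the base name and the file extension."""
    if not extension or not input_name.endswith(extension):
        raise ValueError(f"{input_name} does not end with {extension!r}")
    stem = input_name[: -len(extension)]  # e.g. "D104" from "D104.fq.gz"
    return stem + prefix + extension


example = output_name("D104.fq.gz", ".fq.gz")
```

Here example is D104_S5.fq.gz, matching the example given for the default prefix.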

Percentage of sequences written to output - The percentage of the reads in each input file that seqkit sample writes to the corresponding output file. The default is 5%.
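seqkit sample takes the sampling fraction as a proportion between 0 and 1 through its -p option, so a percentage has to be divided by 100 before it is passed on. The sketch below only builds the command line and does not run it. The function name seqkit_sample_command is hypothetical, and it is an assumption that bl_seqkit_sample.py calls seqkit in this way.

```python
def seqkit_sample_command(input_name, output_name, percentage=5.0):
    """Build a seqkit sample command for the given percentage of reads."""
    if not 0 < percentage <= 100:
        raise ValueError("percentage must be greater than 0 and at most 100")
    proportion = percentage / 100.0  # seqkit expects a fraction, e.g. 0.05
    return [
        "seqkit", "sample",
        "-p", f"{proportion:g}",   # sample by proportion
        "-o", output_name,         # a .gz output name yields compressed output
        input_name,
    ]


cmd = seqkit_sample_command("D104.fq.gz", "D104_S5.fq.gz")
```

For the default of 5%, cmd corresponds to the command line: seqkit sample -p 0.05 -o D104_S5.fq.gz D104.fq.gz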

Output:

The current blreads window will be refreshed to show the output files in the current directory.