bl_seqkit_sample.py runs seqkit sample to create files with reads
sampled from a larger file.
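As a rough illustration of what such a wrapper might do, the sketch below builds a seqkit sample command line for one input file. The function name build_sample_command and its parameters are hypothetical and are not taken from bl_seqkit_sample.py. It assumes the percentage is converted to a proportion for seqkit's -p option and that a fixed random seed (-s) is passed.

```python
def build_sample_command(in_path, out_path, percent=5.0, seed=11):
    """Return an argument list for 'seqkit sample' (hypothetical helper).

    Assumption: the percentage chosen in the menu is passed to seqkit as a
    proportion between 0 and 1 via -p, and the output file is named with -o.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be greater than 0 and at most 100")
    proportion = percent / 100.0
    return [
        "seqkit", "sample",
        "-p", f"{proportion:g}",   # proportion of sequences to keep
        "-s", str(seed),           # random seed, so a run can be repeated
        "-o", out_path,            # output file
        in_path,
    ]


cmd = build_sample_command("D104.fq.gz", "D104_S5.fq.gz")
print(" ".join(cmd))
```

To actually run the command, the list could be handed to subprocess.run(cmd, check=True), provided seqkit is installed and on the PATH. For paired-end files, using the same seed for the left and right files is the usual way to keep the sampled reads paired.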
Where:
Run bl_seqkit_sample.py in the same folder as your read files.
Input:
RNAseq reads: Before opening this menu, select the RNA read files.
For paired-end reads, first use File --> guesspairs.py to group the left and right
read-pair files, which will pop up in a new blreads window. Then run SeqKit sample
from the new blreads window.
file extension of input files (eg. .fastq.gz):
Of the files selected, only those with the specified file extension
will be processed. Seqkit sample works on both Fasta and Fastq files.
- Fastq: Typical extensions for Fastq read files
are .fastq, .fastq.gz, .fq or .fq.gz. If the extension
includes .gz, the file is compressed, and output will be
written to a compressed file.
- Fasta:
There are a large number of file extensions in use for Fasta
files. These include .fasta, .fn, .fsa, .fsp, .wrp, and
others. Typically, Fasta files are not compressed, because
most programs dealing with Fasta files do not have a
mechanism to read compressed files.
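The extension filter described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the helper names select_by_extension and is_compressed are hypothetical, and it assumes the extension is matched against the end of each filename.

```python
def select_by_extension(filenames, extension):
    """Keep only the filenames that end with the given extension.

    Assumption: matching is a simple case-sensitive suffix test, so
    '.fq' does not match 'D104.fq.gz'.
    """
    return [name for name in filenames if name.endswith(extension)]


def is_compressed(filename):
    """Treat a filename ending in .gz as compressed."""
    return filename.endswith(".gz")


selected = ["D104.fq.gz", "D105.fq.gz", "D104.fq", "notes.txt"]
for name in select_by_extension(selected, ".fq.gz"):
    print(name, "compressed" if is_compressed(name) else "uncompressed")
```

With a suffix test like this, the complete extension including .gz has to be given for compressed files to be selected.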
prefix to prepend to file extension - Each output file will have the same name as the
input file, with a prefix added before the file extension. For
example, with the default value of _S5, if the input filename was
D104.fq.gz, the output filename would be D104_S5.fq.gz. This prefix
can be almost anything, but a few guidelines are suggested:
- Begin the prefix with a special character such as "_" that will make it easier for
other programs (eg. rename) to parse these filenames.
- Keep the prefix short.
- Indicate that this file contains a sample. eg. "S" is short for sample.
- A number should indicate the percentage setting used to generate the output. eg.
5 stands for the default percentage of 5%.
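The naming rule above can be sketched in a few lines. The helper names output_name and suggested_prefix are hypothetical and only illustrate the convention; they are not taken from bl_seqkit_sample.py.

```python
def output_name(filename, extension, prefix="_S5"):
    """Insert the prefix immediately before the file extension.

    Example: D104.fq.gz with extension .fq.gz and prefix _S5
    becomes D104_S5.fq.gz.
    """
    if not extension or not filename.endswith(extension):
        raise ValueError(f"{filename!r} does not end with {extension!r}")
    stem = filename[: -len(extension)]
    return stem + prefix + extension


def suggested_prefix(percent):
    """Build a prefix following the guidelines: '_' + 'S' + percentage."""
    return f"_S{percent:g}"


print(output_name("D104.fq.gz", ".fq.gz"))
print(output_name("D104.fasta", ".fasta", suggested_prefix(10)))
```

Passing the whole extension (for example .fq.gz rather than .gz) keeps the prefix next to the base name, which is what makes the sampled files easy to recognize and rename later.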
Percentage of sequences written to output
- Because there may be many different types of files in a
directory which have similar names, setting this to Yes narrows
down the file names that bl_seqkit_sample.py has to consider. If
you choose yes, you must also put in at least one file extension
below.
Output:
The current blreads window will be refreshed to show the output
files in the current directory.