guesspairs.py

guesspairs.py works out which files should be paired together as the left and right paired-end reads from a given sequencing reaction. Before running programs that take sequencing reads as input (e.g. stringtie, spades, rnaspades), run guesspairs.py to ensure that read files are paired properly. Usually, you can simply select all files in the output from guesspairs.py and then run the next program.

Where:

Run guesspairs.py in the same folder as your read files.

Input:

read files (e.g. .fastq.gz) - Before opening this menu, select the read files. Alternatively, you may simply select all files, and guesspairs.py will probably be able to make a correct guess.
 

unique string for left/right reads - This should be a short string that indicates whether a file contains left or right reads. Most Illumina services use R1 and R2, respectively, but any string that is unique to all left read files or to all right read files can be used.
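
To illustrate how a left/right string can be used to pair files, here is a minimal Python sketch. It is not the actual guesspairs.py code. The function name pair_reads and its parameters are hypothetical, and the matching rule (swap the left string for the right string and look for a file with that name) is an assumption made for illustration only.

```python
def pair_reads(filenames, left_tag="R1", right_tag="R2"):
    """Group file names into (left, right) pairs plus unpaired singles.

    Hypothetical helper for illustration; guesspairs.py may use a
    different matching rule.
    """
    remaining = set(filenames)
    pairs = []
    for name in sorted(filenames):
        if name not in remaining:
            continue  # already claimed as the mate of an earlier file
        # Try every occurrence of the left tag, in case it appears more than once.
        start = name.find(left_tag)
        while start != -1:
            mate = name[:start] + right_tag + name[start + len(left_tag):]
            if mate != name and mate in remaining:
                pairs.append((name, mate))
                remaining.discard(name)
                remaining.discard(mate)
                break
            start = name.find(left_tag, start + 1)
    # Anything left over has no mate and is treated as single-end.
    singles = sorted(remaining)
    return pairs, singles


files = ["s1_R1.fastq.gz", "s1_R2.fastq.gz", "s2_R1.fastq.gz",
         "s2_R2.fastq.gz", "s3.fastq.gz"]
pairs, singles = pair_reads(files)
```

In this sketch, s1 and s2 each form a pair, while s3.fastq.gz has no mate and is reported as a single-end file.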


Only process files with extension(s) -
Because a directory may contain many different types of files with similar names, setting this to Yes narrows down the file names that guesspairs.py has to consider. If you choose Yes, you must also enter at least one file extension below.


file extension(s) (comma-separated) - If you chose Yes, type in a list of file extensions that identify read files. For example, .fq.gz, .fastq would process only files with those file extensions.
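
As a rough illustration of the extension filter, the Python sketch below splits a comma-separated list and keeps only matching file names. The function names parse_extensions and filter_by_extension are hypothetical, and the assumption that matching is a simple case-sensitive suffix test is ours; guesspairs.py itself may behave differently.

```python
def parse_extensions(text):
    """Split a comma-separated string into a list of extensions.

    Whitespace around each entry is ignored, as are empty entries.
    """
    return [ext.strip() for ext in text.split(",") if ext.strip()]


def filter_by_extension(filenames, extensions):
    """Keep only the file names ending in one of the given extensions."""
    suffixes = tuple(extensions)  # str.endswith accepts a tuple of suffixes
    return [name for name in filenames if name.endswith(suffixes)]


extensions = parse_extensions(".fq.gz, .fastq")
kept = filter_by_extension(
    ["a_R1.fq.gz", "a_R2.fq.gz", "notes.txt", "b.fastq"], extensions
)
```

Here notes.txt is dropped, so only the three read files would be considered for pairing.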

Output:

Output will pop up in a new blreads window in which files containing paired-end reads are grouped in two columns. Files that cannot be paired are assumed to be single-end reads and will appear in a single column. For most programs in BioLegato that take sequencing reads as input, you can probably select all files in the output window and then launch the next program.