adaptercheck

adaptercheck.py

adaptercheck.py searches fastq files for potential read-through contaminants in sequencing reads.

Where:

Run adaptercheck in the same folder as your read files.

Input:

Before running adaptercheck.py, select a single read file in blreads to process, e.g. DL300_R1.fastq.
The file must be an uncompressed fastq file.
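
A quick way to confirm that a file is uncompressed before selecting it is to look at its first bytes. The following is a minimal sketch and is not part of adaptercheck.py; the function name looks_like_plain_fastq is hypothetical. It relies on two facts: gzip files begin with the bytes 0x1f 0x8b, and the first record of a fastq file begins with "@".

```python
def looks_like_plain_fastq(path: str) -> bool:
    """Return True if the file appears to be an uncompressed fastq file.

    Hypothetical helper for illustration only; it inspects just the
    first two bytes and does not validate the rest of the file.
    """
    with open(path, "rb") as handle:
        head = handle.read(2)
    if head == b"\x1f\x8b":      # gzip magic number: file is compressed
        return False
    return head[:1] == b"@"      # fastq header lines start with '@'
```

For example, looks_like_plain_fastq("DL300_R1.fastq") should return True for a plain fastq file and False for a gzipped copy of it, which would need to be decompressed first.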

Parameters:

Adapters to find - Choose an adapter file, or "custom adapter file"

Adapter file - If "custom adapter file" was chosen above, choose a fasta file containing the custom adapters.

Size of 5' oligo - (default 12) The number of nucleotides, k, taken from the 5' end of each adapter and searched for in the reads. It is often instructive to re-run adaptercheck.py with a few different oligo sizes.

A given oligonucleotide of k nt is expected to occur by chance about once in every 4^k nt of random sequence; values of 4^k are given in the table below. For example, if a fastq file had reads totaling about 4 million nucleotides, any given 11-mer would be expected to be seen about once by random chance. k-mers from the adapter used in sequencing will usually occur several orders of magnitude more frequently than background k-mers.

k     4^k
10    1,048,576
11    4,194,304
12    16,777,216
13    67,108,864
14    268,435,456
15    1,073,741,824
16    4,294,967,296
17    17,179,869,184
18    68,719,476,736
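
The arithmetic behind these values can be sketched in Python. This is an illustration only, not code from adaptercheck.py; the function name expected_random_hits is hypothetical. It assumes the four bases are equally frequent and independent, and it ignores read boundaries, so the result is only a rough background estimate.

```python
def expected_random_hits(total_nt: int, k: int) -> float:
    """Approximate number of chance occurrences of one specific k-mer
    in total_nt nucleotides of random sequence (idealized model:
    equal base frequencies, read boundaries ignored)."""
    return total_nt / 4 ** k


# Print k, 4^k, and the expected chance hits in 4 million nt of reads.
for k in range(10, 19):
    print(f"{k}\t{4 ** k:,}\t{expected_random_hits(4_000_000, k):.6f}")
```

An adapter k-mer whose observed count is far above this estimate is unlikely to be background.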

If you are using adaptercheck.py to determine which adapter was used for your reads, you may get hits above background for more than one adapter in the list, because some adapters begin with the same sequence. In that case use k=20, which will distinguish between adapters that differ within the first 20 nt.
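
Why a larger k separates similar adapters can be seen in a small Python sketch. It is not the implementation used by adaptercheck.py: the function leading_kmer_counts is hypothetical, and the adapter and read sequences are made up so that the two adapters share their first 12 nt and differ after that.

```python
from typing import Dict, Iterable


def leading_kmer_counts(reads: Iterable[str],
                        adapters: Dict[str, str],
                        k: int = 20) -> Dict[str, int]:
    """Count the reads that contain the first k nt of each adapter.

    Illustrative only: exact substring matching, no mismatches allowed.
    """
    prefixes = {name: seq[:k].upper() for name, seq in adapters.items()}
    counts = {name: 0 for name in adapters}
    for read in reads:
        read = read.upper()
        for name, prefix in prefixes.items():
            if prefix in read:
                counts[name] += 1
    return counts


# Made-up adapters: identical for the first 12 nt, different afterwards.
adapters = {
    "adapter_A": "ACGTACGTACGTAAAACCCCGGGG",
    "adapter_B": "ACGTACGTACGTTTTTGGGGCCCC",
}
# Made-up reads: two carry adapter_A, one carries only the shared 12 nt.
reads = [
    "TTTT" + adapters["adapter_A"][:22],
    "GGGG" + adapters["adapter_A"],
    "CCCC" + "ACGTACGTACGT" + "GGGG",
]

print(leading_kmer_counts(reads, adapters, k=12))
print(leading_kmer_counts(reads, adapters, k=20))
```

With k=12 both adapters match every read, because the 12-mer is shared; with k=20 only adapter_A is found, and only in the two reads that actually carry it.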

WHERE TO SEND OUTPUT - By default, the output will pop up in a spreadsheet. If Text file is chosen, the output is sent to an output file instead.

Output file name - Name of the file that receives the report, whether it is opened in a spreadsheet or saved as a .tsv file.

Output:

Output is written to the input directory.
