createSubProjectFakes.pl [options] <validationSummary.txt>
Options:
  -log <file>   Log file (optional; default is createSubProjectFakes.pl.log)
  -warn <file>  Warning file (optional; default is defined in gapRes.config)
  -h            Detailed message (optional)
This is a wrapper program for the Gap Resolution subsystem that is responsible for generating the fasta and qual files of closed assemblies to be used as fake reads.
The following scripts (configurable in config file) must exist in the same path as validateSubProject.pl unless the path to the script is defined in the config file:
* shredFasta.pl
The input file contains the validation status of each sub project directory. The format of the validationSummary.txt file is as follows, one entry per line, with each item delimited by a tab:
1. sub project directory
2. PASS|FAIL
3. comment (optional)
Each sub project directory must contain a validinfo.txt file (previously generated by validateSubProject.pl). Note that this file contains the validation status. However, the status defined in the input validationSummary.txt file supersedes the status within the validinfo.txt file. In other words, the status in validationSummary.txt is the one used to determine whether to create fakes (status=PASS) or to create primer files (status=FAIL) for each sub project.
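The input format described above can be read with a few lines of code. The following is a minimal Python sketch, not the script's actual implementation: the names SummaryEntry, parse_validation_summary and action_for are hypothetical, and skipping blank lines and upper-casing the status are assumptions.

```python
from typing import List, NamedTuple, Optional


class SummaryEntry(NamedTuple):
    directory: str          # sub project directory
    status: str             # "PASS" or "FAIL"
    comment: Optional[str]  # optional third column


def parse_validation_summary(text: str) -> List[SummaryEntry]:
    """Parse tab-delimited validationSummary.txt content, one entry per line."""
    entries = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        if not raw.strip():
            continue  # assumption: blank lines are ignored
        fields = raw.split("\t")
        if len(fields) < 2:
            raise ValueError(f"line {line_no}: expected at least 2 tab-delimited fields")
        directory = fields[0].strip()
        status = fields[1].strip().upper()  # assumption: status is case-insensitive
        if status not in ("PASS", "FAIL"):
            raise ValueError(f"line {line_no}: status must be PASS or FAIL")
        comment = fields[2].strip() if len(fields) > 2 and fields[2].strip() else None
        entries.append(SummaryEntry(directory, status, comment))
    return entries


def action_for(entry: SummaryEntry) -> str:
    """The summary status decides the action: fakes on PASS, primer files on FAIL."""
    return "create_fakes" if entry.status == "PASS" else "create_primer_files"
```

Only the status from the summary file is consulted here, which mirrors the rule that it takes precedence over the status stored in validinfo.txt.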
The status within the validationSummary.txt file determines the course of action.
If the validation is successful, create fasta and qual files of the contig containing the anchors and place them in the sub project directory, naming them <subProjectDirName>.a1 and <subProjectDirName>.a1.qual. Copy the fasta and qual files to the fakes directory in the current working directory.
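The naming rule above can be illustrated with a short Python sketch. The function names fake_file_paths and fakes_destination are hypothetical, and the defaults ".a1" and "fakes" are taken from the documented config defaults; the real script may build these paths differently.

```python
import os


def fake_file_paths(sub_project_dir, extension=".a1"):
    """Return (fasta, qual) paths for the fakes inside the sub project directory."""
    # normpath drops a trailing slash so basename yields the directory name
    name = os.path.basename(os.path.normpath(sub_project_dir))
    fasta = os.path.join(sub_project_dir, name + extension)
    return fasta, fasta + ".qual"


def fakes_destination(path, working_dir, fakes_dir="fakes"):
    """Return where a fake file would be copied under the working directory."""
    return os.path.join(working_dir, fakes_dir, os.path.basename(path))
```

For a sub project directory named subProj1, this yields subProj1.a1 and subProj1.a1.qual, each copied under the same file name into the fakes directory.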
The following files must exist in each sub project directory:
* validinfo.txt - created by validateSubProject.pl
* 454AllContigs.fna - created by Newbler
* 454AllContigs.qual - created by Newbler
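A pre-flight check for these files could look like the following Python sketch. The helper name missing_required_files is hypothetical; how the actual script reports a missing file is not described here.

```python
import os

# File names as listed in this documentation
REQUIRED_FILES = ("validinfo.txt", "454AllContigs.fna", "454AllContigs.qual")


def missing_required_files(sub_project_dir):
    """Return the required file names that are absent from the directory."""
    return [
        name
        for name in REQUIRED_FILES
        if not os.path.isfile(os.path.join(sub_project_dir, name))
    ]
```

An empty list means the sub project directory is ready for processing.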
The format of the primerinfo.txt file is as follows:
FASTA_TAG=<fasta tag name>
TARGET_REGION=<start> <length>
TEMPLATE=GDNA
LEFT_PRIMER_NAME=<name of left primer>
RIGHT_PRIMER_NAME=<name of right primer>
EXCLUDED_REGION=<start> <length>
Note that the EXCLUDED_REGION key/value pair is optional. If used, more than one entry may be defined.
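To show how such a file might be consumed, here is a minimal Python sketch. It assumes one KEY=VALUE pair per line and integer region values, and it collects repeated EXCLUDED_REGION entries in a list; the function name parse_primer_info is hypothetical and this is not the pipeline's own parser.

```python
def parse_primer_info(text):
    """Parse primerinfo.txt content into a dict.

    Region values become (start, length) tuples. EXCLUDED_REGION may
    appear zero or more times, so it is always returned as a list.
    """
    info = {"EXCLUDED_REGION": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a KEY=VALUE line: {line!r}")
        key, value = key.strip(), value.strip()
        if key in ("TARGET_REGION", "EXCLUDED_REGION"):
            start, length = (int(part) for part in value.split())
            if key == "EXCLUDED_REGION":
                info[key].append((start, length))
            else:
                info[key] = (start, length)
        else:
            info[key] = value
    return info
```

A file with no EXCLUDED_REGION lines simply yields an empty list for that key.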
A default config file named gapRes.config residing in <installPath>/config is used to specify the following parameters:
(configurable)
createSubProjectFakes.trimClosedGapConsensusUsingAnchorPos=0
    Specify if the closed gap consensus should be trimmed at the anchor position (0|1).
createSubProjectFakes.trimConsensusSeqToThisManyBasesAwayFromAnchor=0
    If trimming consensus of closed gaps is turned on, specify how many bases away from anchor position to keep.
createSubProjectFakes.shredClosedGapConsensus=1
    Specify if the closed consensus fasta and qual files should be shredded (0|1).
createSubProjectFakes.shredClosedGapConsensusIfGreaterThanThisLength=2000
    Specify the minimum length of the closed consensus sequence to be considered for shredding.
shredFasta.fragmentLength=1000
    Specify the fragment length when shredding the repeat contig consensus.
shredFasta.overlapLength=100
    Specify the overlap length when shredding the repeat contig consensus.
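The interaction of the shredding parameters can be sketched as follows. This Python example is an illustration under assumptions, not the behavior of shredFasta.pl: it assumes fragments start every (fragmentLength - overlapLength) bases, that the final fragment may be shorter, and that only sequences strictly longer than the threshold are shredded. The names shred and make_fake_reads are hypothetical.

```python
def shred(sequence, fragment_length=1000, overlap_length=100):
    """Split a sequence into fragments that overlap by overlap_length.

    Works on strings (bases) and on lists (per-base quality scores).
    """
    if fragment_length <= 0:
        raise ValueError("fragment_length must be positive")
    if not 0 <= overlap_length < fragment_length:
        raise ValueError("overlap_length must be in [0, fragment_length)")
    if not sequence:
        return []
    step = fragment_length - overlap_length
    fragments = []
    start = 0
    while True:
        fragments.append(sequence[start:start + fragment_length])
        if start + fragment_length >= len(sequence):
            break  # this fragment reached the end of the sequence
        start += step
    return fragments


def make_fake_reads(sequence, shred_enabled=True, min_length=2000,
                    fragment_length=1000, overlap_length=100):
    """Shred only when enabled and the sequence exceeds min_length."""
    if shred_enabled and len(sequence) > min_length:
        return shred(sequence, fragment_length, overlap_length)
    return [sequence]
```

With the default values, a 2500-base consensus would give three fragments starting at offsets 0, 900 and 1800, while a 2000-base consensus would be kept whole under the strict "greater than" reading assumed here.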
(system configuration)
createSubProjectFakes.fakesDir=fakes
    Specify the name of the directory to store the fasta and qual files of the fakes.
createSubProjectFakes.fakesFileExtension=.a1
    Specify the file extension of the fakes.
=head1 VERSION
$Revision: 1.10 $
$Date: 2010-01-06 22:00:39 $
Stephan Trong
S.Trong 2009/07/25 creation
S.Trong 2009/12/29 - added -log and -warn options.