Updated March 2, 2021
NAME
Gblocks.py - Python wrapper script to run Gblocks from BioLegato.
SYNOPSIS
Gblocks.py infile [options]
DESCRIPTION
Gblocks removes highly gapped or poorly conserved positions from
a multiple sequence alignment prior to phylogenetic analysis.
Some Gblocks parameters expect an absolute number of sequences.
For example, -b1 sets the minimum number of sequences for a
conserved position. This would have been better implemented as a
percentage. Therefore, the --b1 parameter of Gblocks.py takes a
percentage value, and the script calls the Gblocks binary with
the corresponding number of sequences, computed from the number
of sequences in the input file. Gblocks.py also performs sanity
checks to make sure that parameters do not conflict with each
other before running Gblocks.
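The percentage-to-count conversion described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the function names are hypothetical, and rounding up to the next whole sequence is an assumption.

```python
def count_fasta_sequences(lines):
    """Count sequences by counting FASTA header lines (those starting with '>')."""
    return sum(1 for line in lines if line.startswith(">"))


def percent_to_count(percent, nseq):
    """Convert a percentage of sequences into a whole number of sequences.

    Assumption: the result is rounded up. Integer arithmetic is used so
    that exact percentages (e.g. 85% of 20) are not affected by
    floating-point error.
    """
    return (nseq * percent + 99) // 100


# A small in-memory alignment with four sequences.
alignment = [">s1", "AC-GT", ">s2", "ACGGT", ">s3", "AC-GT", ">s4", "ACGGT"]
nseq = count_fasta_sequences(alignment)
b1_count = percent_to_count(51, nseq)  # 51% of 4 sequences, rounded up
```

With this rounding rule, the default of 51 percent always yields more than half of the sequences, whatever the size of the alignment.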
OPTIONS
Note: Not all command line options of Gblocks are implemented in this script.
infile - File containing a multiple sequence alignment in FASTA format. This must be the first parameter on the command line.
--t [p,d,c] - Type of sequence. p:protein (default), d:DNA, c:codons
--b1=<integer> - Minimum percent of sequences for a conserved position. Default: 51.
--b2=<integer> - Minimum percent of sequences for a flanking position. Default: 85.
--b3=<integer> - Maximum Number Of Contiguous Nonconserved Positions. Default: 8.
--b4=<integer >= 2> - Minimum Length Of A Block. Default: 10.
--b5=[n,h,a] - Allowed Gap Positions. n:none (default), h:half, a:all
--b6=[y,n] - Use similarity matrices (protein only). y:yes (default), n:no
--v=<integer >= 50> - Characters per line in results and parameters files. Default: 60.
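As an illustration of how the options above might be checked and translated into a Gblocks command line, here is a minimal sketch. The function name, the binary name "Gblocks", the rounding rule, and the check that b2 is not smaller than b1 are assumptions, not taken from the script itself. The limits on b4 and v come from the option list above.

```python
def build_gblocks_args(infile, nseq, seqtype="p", b1=51, b2=85,
                       b3=8, b4=10, b5="n", b6="y", v=60):
    """Validate the options and return an argument list for the Gblocks binary."""
    # Limits stated in the option list.
    if b4 < 2:
        raise ValueError("b4 (minimum block length) must be >= 2")
    if v < 50:
        raise ValueError("v (characters per line) must be >= 50")
    # Assumed conflict check: a flanking position should not require
    # fewer sequences than a conserved position.
    if b2 < b1:
        raise ValueError("b2 must not be smaller than b1")

    # Convert percentages to sequence counts (rounding up is an assumption).
    n1 = (nseq * b1 + 99) // 100
    n2 = (nseq * b2 + 99) // 100

    args = ["Gblocks", infile, f"-t={seqtype}",
            f"-b1={n1}", f"-b2={n2}", f"-b3={b3}",
            f"-b4={b4}", f"-b5={b5}", f"-v={v}"]
    # Similarity matrices apply to protein alignments only.
    if seqtype == "p":
        args.append(f"-b6={b6}")
    return args


# Default options for a protein alignment of 10 sequences.
args = build_gblocks_args("a1.fsa", 10)
```

The resulting list could then be passed to subprocess.run() to launch the binary.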
OUTPUT FILES
Gblocks writes two output files: a FASTA file with gappy or poorly conserved positions removed, and an HTML report that shows the complete alignment, with highlighting to indicate which positions were written to the FASTA file. Gblocks simply adds a file extension to the input filename to create the output files. For example, if the input file was a1.fsa, the alignment would be written to a1.fsa.fsa, and the HTML report would be written to a1.fsa.htm.
SEE ALSO
Gblocks - Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis [http://molevol.cmima.csic.es/castresana/Gblocks.html]
AUTHOR
Dr. Brian Fristensky
Department of Plant Science
University of Manitoba
Winnipeg, MB Canada R3T 2N2
frist@cc.umanitoba.ca
http://home.cc.umanitoba.ca/~frist