Updated May 16, 2013
NAME
gstat.py - calculate protein statistics
SYNOPSIS
gstat.py [options] infile [infile ...]
DESCRIPTION
Calculates the following statistics from protein sequences in a
FASTA file:
1. the molecular weight of each protein
2. the theoretical pI value of each protein
3. the amino acid composition of each protein
4. the total number of amino acids in each protein
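As a rough illustration of statistics 1, 3 and 4, the sketch below reads FASTA records and computes a molecular weight, a length and a fractional composition. It is not taken from gstat.py: the function names are hypothetical, and the residue masses are standard average values that may differ slightly from the table gstat.py uses.

```python
from collections import Counter

# Average residue masses in daltons (standard published values; assumption:
# gstat.py's own mass table may differ slightly).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.015  # one water is added for the free N- and C-termini


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def molecular_weight(sequence):
    """Sum the residue masses and add one water molecule.

    Raises KeyError for letters outside the 20 standard amino acids.
    """
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER_MASS


def composition(sequence):
    """Return (length, {amino acid: fraction of the sequence})."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return total, {aa: n / total for aa, n in sorted(counts.items())}


if __name__ == "__main__":
    demo = [">demo peptide", "MKK", "A"]  # made-up record for illustration
    for name, seq in read_fasta(demo):
        length, fractions = composition(seq)
        print(name, round(molecular_weight(seq), 2), length, fractions)
```

Dividing the molecular weight by 1000 would give a value in kDa, which is presumably what the -kda option reports.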
OPTIONS
-kda     --- display molecular weights in kDa
-img     --- display locus tags from IMG/ER FASTA files
-aacount --- perform an amino acid count for each sequence
-tsv     --- present the output in TSV format
EXAMPLES
gstat.py ecoli-k12.fsa
gstat.py -kda ecoli-k12.fsa > data_save.csv
gstat.py -img ecoli-k12.fsa ecoli-plasmid1.fsa
gstat.py -kda -img ecoli-k12.fsa ecoli-plasmid1.fsa > data_save.csv
INPUT
Input is a file containing one or more proteins in FASTA format.
The output of the program is CSV (tab delimited), written to stdout.
The columns output are as follows:
NAME    Mol. Wt.    pI    COMPOSITION    SEQUENCE
If you are using multiple files, this program will print all of the
sequences in succession to standard output.
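Output saved as in the examples can be read back with Python's csv module. The sketch below assumes each record is one line with exactly the five tab-separated fields listed above, in that order; the function name, the dictionary keys and the sample line are made up for illustration, and the layout may differ when options such as -aacount are used.

```python
import csv
import io

# Assumed column order, following the list in this section.
COLUMNS = ("name", "mol_wt", "pi", "composition", "sequence")


def read_gstat_output(handle):
    """Parse tab-delimited gstat.py output into a list of dictionaries."""
    records = []
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if len(fields) != len(COLUMNS):
            continue  # skip blank lines or rows with an unexpected layout
        record = dict(zip(COLUMNS, fields))
        try:
            record["mol_wt"] = float(record["mol_wt"])
            record["pi"] = float(record["pi"])
        except ValueError:
            continue  # non-numeric values, e.g. a header row
        records.append(record)
    return records


if __name__ == "__main__":
    # Invented sample text, not real gstat.py output.
    sample = (
        "NAME\tMol. Wt.\tpI\tCOMPOSITION\tSEQUENCE\n"
        "protA\t75.07\t5.97\tG=1\tG\n"
    )
    for record in read_gstat_output(io.StringIO(sample)):
        print(record["name"], record["mol_wt"], record["pi"])
```

For a saved file, pass an open handle instead, for example open("data_save.csv", newline="").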
NOTES
AUTHOR
Graham Alvare
Department of Plant Science
University of Manitoba
Winnipeg, MB Canada R3T 2N2
alvare@cc.umanitoba.ca
http://home.cc.umanitoba.ca/~alvare
ACKNOWLEDGEMENTS/CO-AUTHORS
Dr. Brian Fristensky - my work supervisor, and the man who introduced
me to the wonderful field of bioinformatics.
Lukasz Kozlowski - bisection Henderson-Hasselbalch algorithm
for determining the theoretical pI of proteins
Kozlowski L. 2007-2012 Isoelectric Point Calculator.
http://isoelectric.ovh.org
Dr. Abby Perrill - amino acid pKa table
http://www.cem.msu.edu/~cem252/sp97/ch24/ch24aa.html
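For readers unfamiliar with the bisection Henderson-Hasselbalch approach credited above, a minimal sketch follows. It is not the code used by gstat.py or by the Isoelectric Point Calculator: the function names are hypothetical, and the pKa values are illustrative placeholders rather than the table referenced above. The idea is that the net charge of a protein falls steadily as pH rises, so the pH at which the charge crosses zero can be found by repeatedly halving the interval from 0 to 14.

```python
# Illustrative pKa values only (assumption): substitute the pKa table
# of your choice; gstat.py uses its own table.
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA = 9.0
C_TERM_PKA = 2.0


def net_charge(sequence, ph):
    """Net charge at a given pH from the Henderson-Hasselbalch equation."""
    # Free N-terminus contributes a positive charge, free C-terminus a negative one.
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    for aa in sequence.upper():
        if aa in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge


def isoelectric_point(sequence, tolerance=0.001):
    """Bisect the pH range 0-14 until the zero-charge pH is bracketed
    within the given tolerance."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: the pI lies at a higher pH
        else:
            high = mid  # neutral or negative: the pI lies at a lower pH
    return (low + high) / 2.0


if __name__ == "__main__":
    print(round(isoelectric_point("MKKA"), 2))
```

The result depends strongly on the pKa table chosen, so values from this sketch should not be expected to match gstat.py's output.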
QUESTIONS & COMMENTS
If you have any questions, please contact me:
alvare@cc.umanitoba.ca
I usually get back to people within 1-2 weekdays (on weekends, I am
slower).
P.S. Please also let me know of any bugs, or if you have any
suggestions. I am generally happy to help create new tools, or modify
my existing tools to make them more useful.
LICENSE
This code is licensed under the Creative Commons 3.0
Attribution + ShareAlike license - for details see:
http://creativecommons.org/licenses/by-sa/3.0/