Usage:   tacg -flag option -flag option ... < infile > outfile
tacg uses stdin/stdout/stderr; use redirection or pipes for input and output;
needs input specifier (| or <); output to screen (default), >file, | nextcmd
Uses Knight's SEQIO for auto reformat on input; most ASCII formats accepted.
1 or more of: -F -g -G -l -L -O -p -P --rules --rulefile -s -S -X flags must 
be specified for output.
+-----+---------+-----------+-----+----------+-----------+----+-------+
|1    |    2    |    3      | 4   |    5     |     6     | 7  |   A   |
+-----+---------+-----------+-----+----------+-----------+----+-------+
|b    | dam     | i         | o   | rule     | silent    | X  |  ALL  |
|e    | dcm     | logdegens | O   | rulefile | tmppath   | #  | Pages |
|c    | example | l         | p   | r regex  | T         | ex |       |
|C    | f       | L         | P   | R        | v         |    +-------+
|clone| F       | strands   | ps  | raw      | V         |    |   i   |
|cost | g       | notics    | pdf | s        | w         |    +-------+
|D    | G       | numstart  | q   | S        | W slidwin |    | index |
|     | h help  | m/M       | Q   |          | x         |    |       |
|     | H HTML  | n         |     |          |           |    |       |
+-----+---------+-----------+-----+----------+-----------+----+-------+
   (leading dashes for flags above have been removed for brevity)
Type one of the numbers indicated to show the help page for that flag.
1-7 = page for those flags, 'A' = all pages, 'i' = index page, 'q' = quit
flag | opts  |  explanation (* = default; # = an integer)
-----+-------+--------------------------------------------------------------
-b   {#}       beginning of DNA subsequence; 1* for 1st base of sequence.
-e   {#}       end of DNA subsequence; 0* for last base of sequence.
-c             order output by # of hits per Enz, else by order in REBASE file.
-C   {0*-12}   Codon Usage table to use for translation:
               0 - Standard     5 - Ciliate_Mito      10 - Ascidian_Mito
               1 - Vert_Mito    6 - Echino_Mito       11 - Flatworm
               2 - Yeast_Mito   7 - Euplotid_Nuclear  12 - Blepharisma
               3 - Mold_Mito    8 - Bacterial
               4 - Invert_Mito  9 - Alt_Yeast
--clone {#_#,#x#..} find REs that don't cut in the range #_#, but do cut in
             the range #x#. Returns RE names & sites matching each criterion.
--cost {#}   use only REs that cost >= # units/$. Larger #s are cheaper.
             Requires optional data be added to rebase file; REs lacking this
             info are excluded.  #s>100 are cheap; #s<10 are v. expensive.
-D {0|1*-4}    controls input and analysis of degenerate sequence where:
          0  FORCES excl'n of IUPAC degen's in sequence; only 'acgtu' accepted.
          1* cut as NONdegenerate unless IUPAC char's found; then cut as '-D3'.
          2  allow IUPAC char's; ignore in KEY hex, but match outside of KEY.
          3  allow IUPAC char's; find only EXACT matches.
          4  allow IUPAC char's; find ALL POSSIBLE matches.

flag | opts  |  explanation (* = default; # = an integer)
-----+-------+--------------------------------------------------------------
--dam          simulate cutting in the presence of Dam methylase (GmATC)
--dcm          simulate cutting in the presence of Dcm methylase (CmCWGG)
               for both above, REs not cutting due to methylation are listed as
               not cutting at all in summary.
--example {1-10} example code to show how to add your own flags and functions.
               Search for 'EXAMPLE' in 'SetFlags.c' and 'tacg.c' for the code.
-f   {0|1*}    form (or topology) of DNA - 0 (zero) for circular; 1 for linear. 
-F   {0*-3}    print/sort Fragments; 0*-omit; 1-unsorted; 2-sorted; 3-both.
-g   {Lo(,Hi)} prints a gel map w/ low cutoff of Lo; high cutoff of Hi bp.
               If Hi > Seq Length (or is omitted), Hi is set to Seq Length.
-G {#,X|Y|L}   streams numeric data to stdout for external analysis/plotting.
   # = bases/bin (the hits for this many bases should be pooled).
   X = bins on X axis; Y = bins on Y axis; L = Long output as 'bins(X) data(Y)'.
-h (--help)    asks for (this) brief help page.
-H (--HTML) {0*|1}  complete (0) or partial(1) HTML tags generated on the fly
               for WWW output. See man page for appropriate usage.
               0 = makes standalone HTML page, with Table of Contents.
               1 = no page headers, only TOC, to embed in other HTML pages.
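
The interaction between -f (topology) and -F (fragments) can be illustrated
with a minimal Python sketch. This is not tacg's own code: the function name
'fragment_sizes' and the convention that a cut falls immediately after base
number c are assumptions made only for this illustration.

```python
def fragment_sizes(cuts, seq_len, circular=False):
    """Return fragment lengths for cuts placed after each base position in `cuts`.

    Hypothetical helper: positions are 1-based and a cut at c separates
    base c from base c+1 (an assumed convention, not taken from tacg).
    """
    # Keep only cuts that actually fall inside the sequence, in order.
    cuts = sorted({c for c in cuts if 0 < c < seq_len})
    if not cuts:
        return [seq_len]  # uncut molecule: one fragment of full length

    # Pieces lying between consecutive cuts.
    inner = [b - a for a, b in zip(cuts, cuts[1:])]
    head = cuts[0]             # sequence start to first cut
    tail = seq_len - cuts[-1]  # last cut to sequence end

    if circular:
        # On a circle, head and tail are joined across the origin.
        return inner + [head + tail]
    return [head] + inner + [tail]


if __name__ == "__main__":
    print(fragment_sizes([100, 400], 1000))                 # linear
    print(fragment_sizes([100, 400], 1000, circular=True))  # circular
```

Two cuts give three fragments on linear DNA but only two on circular DNA,
which is why the -f setting changes what -F reports.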

flag | opts  |  explanation (* = default; # = an integer)
-----+-------+--------------------------------------------------------------
-i (--idonly) {0|1*|2} controls output for seqs that have no hits.
   0 - ID line and normal output printed regardless of hits.
   1 - (default) ID line and normal output are printed ONLY IF there are hits.
   2 - ONLY ID line is printed if there are hits.
-l             prints a GCG-style ladder map.
-L             print a Linear map - produces LOTS of output (~10x input).
--logdegens    all degens logged for graphic output (mem intensive).
               can be made more space efficient by using next 2 options.
--strands {1|2*} in Linear map, print 1 or 2 strands per line of DNA.
--notics       in Linear map, omit the tics that indicate 5, 10 bases.
--numstart {#} in Linear map, start numbering at this number (+ or -).
-m/M {#}       minimum (-m) and/or Maximum (-M) # cuts/RE; 0* for all.
-n {3*-10}     magnitude of recognition site; 3 = all, 5 = 5,6,7....

flag | opts  |  explanation (* = default; # = an integer)
-----+-------+--------------------------------------------------------------
-o {0|1*|3|5}  overhang - 5=5', 3=3', 0 for blunt, 1* for all.
-O {###(x),min}  ORF table for selected frames; 'x' -> extra info re AA compos'n
               eg: -O 135x,25 = fr 1,3,5 w/ xtra info; min ORF len = 25aas.
               'x' -> 3 extra lines x 128 chars wide, munges formal FASTA format.
-p {Label,pattern[,Err]} cmd line entry of (degenerate) patterns to search for
               if Err is missing, it is set = 0, also sets -S for output.
  eg:  -pFindMe,gyrttnnnnnnngct,1  looks for indicated pattern with 1 error.
-P {Lab1,[+-lg]DistLo[-DistHi],Lab2}  Proximity match'g for 2 named patterns.
               Lab1/2 patterns must be in a REBASE-format file in form:
 'Lab1 1 IUPAC_pattern 0 Err !Comments '  where Err = max # errors allowed eg:
 'FindMe 1 gyrttnnnnnnngct 0 1 !The pattern that I'm trying to find'
               can repeat to specify up to 10 relationships at once
               + (-) Label1 is downstream (upstream) of Label2; default either
               l (g) Label1 is < (>) or = to 'DistLo' from Label2
 'DistLo-DistHi'  indicates an explicit distance range (obviating l,g)
--ps           writes Postscript plasmid map (tacg_Map.ps),
               forces circular DNA, notes degens around rim, can be
               combined with -O (above) to plot ORFs in any frames. Will do
               multiple pages with a multi-sequence file.
--pdf          converts above PS plasmid map to PDF (tacg_Map.pdf) via exec
               of '/usr/bin/gs', so ghostscript (gs) needs to be there.
-q (default)   (quiet) DISallows sending diagnostic UDP info back to author.
-Q             (UNquiet) sends ~100 bytes of UDP data back to author

flag | opts  |  explanation (* = default; # = an integer)
-----+-------+--------------------------------------------------------------
--rule 'RuleName,((LabA:m:M&LabB:m:M)|(LabC:m:M&LabD:m:M))|(LabE:m:M),window'
               explicitly selects named patterns (<16) from a REBASE file with
               per pattern min/Max limits. 'RuleName' = name for the pattern
               (len<11), 'window' = sliding window within which the rule must be
               true.  Parens () enforce logic; otherwise expressions are eval'd
               L->R. LabX = Pat name; 'm' = min; M = Max; '&' = logical AND;
               '|' = OR; '^' = XOR.  Enclosing single 'quotes' are REQUIRED.
               Valid patterns are logged to file 'tacg.patterns' in current dir
               for re-use. See man page for info, examples.
--rulefile '/path/to/rulefile'
              loads a series of complex rules like those described in --rule
              above.  Format is as in the single quotes above:
                 RuleName,(rulestring, as above),window
-r (--regex) {'Label:RegexPat'} search for RegexPat; use 'Label' for naming;
    translates IUPAC characters into std regex notat'n, escapes characters that
               require it:  gy(tt|gc)nc{2,3}m -> g[ct]\(tt\|gc\).c\{2,3\}[ca] .
NB: regex-matching is incompatible with regular IUPAC and matrix-matching.
-r (--regex) {'FILE:FileOfRegexPats'} open the FILE 'FileOfRegexPats' and search    
               for all Regex pat's in it; 'FILE' must be in CAPS to trigger 
               this behavior. NB: -r REQUIRES the ':' separator and single 
               quotes ' to enclose options.
-R {alt pattern file} specifies alternative REBASE file in GCG format
               OR use it to specify the MATRIX data file in TRANSFAC format.
--raw          ANYTHING on STDIN is raw sequence; same behavior as pre-SEQIO.
-s             summary - print Table of Zero Cutters, # hits of each Enzyme.
-S             Sites - prints the actual cut Sites in tabular form.
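
The IUPAC-to-regex translation described for -r above can be sketched in
Python. This is only an illustration under stated assumptions: the table and
the function name 'iupac_to_regex' are hypothetical, and the output targets
Python's 're' syntax, so grouping and repeat characters are left unescaped
rather than backslash-escaped as in the tacg example.

```python
import re

# IUPAC degeneracy codes mapped to regex character classes.
# 'n' becomes '.', mirroring the example in the help text above.
IUPAC_CLASSES = {
    "r": "[ag]", "y": "[ct]", "m": "[ac]", "k": "[gt]",
    "s": "[cg]", "w": "[at]", "b": "[cgt]", "d": "[agt]",
    "h": "[act]", "v": "[acg]", "n": ".",
}


def iupac_to_regex(pattern):
    """Translate IUPAC codes in `pattern`; pass everything else through."""
    return "".join(IUPAC_CLASSES.get(ch, ch) for ch in pattern.lower())


if __name__ == "__main__":
    rx = iupac_to_regex("gy(tt|gc)nc{2,3}m")
    print(rx)  # g[ct](tt|gc).c{2,3}[ac]
    hit = re.search(rx, "agcttaccca")
    print(hit.group(0) if hit else None)
```

The class '[ac]' matches the same bases as the '[ca]' shown in the help
text; only the order of the letters differs.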

flag | opts  |  explanation (* = default; # = an integer)
-----+-------+--------------------------------------------------------------
--silent       searches for possible SILENT RE sites (those that won't cause
           trans'n to change).  Use -LT1,1 to see rev-trans sequence re-trans'd
           to verify that the seq is OK.  Arg, Leu, Ser codons will not RT/FT.
--tmppath  passes a temporary path for cooperation with CGIs that need output
           in a particular place. Currently used in plasmid map generation.
-T {[0*|1|3|6],[1|3]} Translates frames 1, 1-3, 1-6 w/ Linear Map using 1 or 3
               letter labels. -T3,3 xlates frames 1,2,3 with 3 letter labels.
-v             prints version of the program, then dies.
-V   {1-3}     Verbose/debug mode; spews diags to stderr (1=lots, 3=tons).
-w   {1|#}     output width in bp's (60 < # < 210), truncated to a # mod 15
               '-w 1' = 1 line output for easier parsing by external apps.
-W (--slidwin) {#} Defines the sliding window for searching for pattern groups
        use w/ '-x' as a looser alternative to -P (Proximity matching in pairs) 
-x {Label(,=),Label..(,C)} selects SPECIFIC REs (<=15) from the REBASE file;
               If ',=' is appended to 1 RE Label, it will tag that RE for the
               AFLP analysis.  See man page for details.  If ',C' is
               appended to a list of >1 Labels, requests a multiple digest.

flag | opts  |  explanation (* = default; # = an integer)
-----+-------+--------------------------------------------------------------
-X (--extract) {b,e,[0|1]} eXtracts the sequence around the pattern matched,
    from b bases preceding, to e bases following the MIDDLE of pattern (if a
    normal pattern) or the START of the pattern (if a regular expression). If
    the pattern is found in the bottom strand AND the last field = 1, sequence is
    rev-compl'ed before it's extracted so all patterns are in same orientation;
    if last field = 0, it is NOT reverse compl'ed.
-#   {#}       Matrix matching, using TRANSFAC-format matrix input. Required #
               is the matrix cutoff as a %.  Uses '-R' to specify alternative
               matrix file, if not 'matrix.data'. Also works with '-x'.
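
The -X extraction described above can be sketched in Python. This is not
tacg's implementation. The names 'extract' and 'revcomp' are hypothetical,
and several details are assumptions: the match is given as a 0-based start
and a length on the top strand, the MIDDLE is taken as start + length // 2,
and the window is clipped at the sequence ends.

```python
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def revcomp(seq):
    """Reverse-complement a plain acgt sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def extract(seq, start, length, b, e,
            bottom_strand=False, orient=1, is_regex=False):
    """Return the sequence from b bases before to e bases after the anchor.

    The anchor is the match START for a regex hit, otherwise the match
    MIDDLE (assumed here to be start + length // 2).
    """
    anchor = start if is_regex else start + length // 2
    lo = max(0, anchor - b)          # clip at the 5' end (assumption)
    hi = min(len(seq), anchor + e)   # clip at the 3' end (assumption)
    piece = seq[lo:hi]
    # Last field = 1: flip bottom-strand hits so all extracts share one
    # orientation; last field = 0 leaves them as found.
    if bottom_strand and orient == 1:
        piece = revcomp(piece)
    return piece


if __name__ == "__main__":
    seq = "acgtacgtaa"
    print(extract(seq, 2, 4, 2, 3))
    print(extract(seq, 2, 4, 2, 3, bottom_strand=True, orient=1))
```

With the last field set to 0 the bottom-strand hit is returned exactly as it
reads on the top strand.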

NB: matrix-matching is incompatible with regular pattern and regex matching.
ex:       tacg -f0 -n6 -T123,3 -sl -F3  < degen.input.file >output.file (to file)          
          tacg -f 0 -l  -n 5  -F 2 < input.seq.file  (to stdout/screen)
          tacg -m 3 -T1,1 -s < Ecoli_genome.genbank | grep HindIII > out
          tacg --regex 'Rxname1:g(cc|tac)nr{2,4}yt' < input.seq.file > outfile
          tacg -x HindIII,EcoRV,bamhi,C -O 134,30 -w 90 -LlS < input.seq.file
for more help, try 'man tacg'. Type 'Ctrl+C' if the program seems locked.
Latest docs & code at: http://tacg.sf.net. Contact author:  hjm@tacgi.com