digest.doc                                         update 10/29/90
                                     DIGEST

     I. Function- DIGEST reads in a file which lists restriction sites for one
     or more enzymes in a given fragment.  The user can ask DIGEST to 
     calculate the resultant fragments from multiple digests. Either partial
     or complete digests may be calculated.
     
     II. Program Flow
     Below is the screen output of a typical interactive session with DIGEST. 
     Program output and user responses are listed as they would actually 
     appear on the screen.  Comments, which are listed here for explanatory 
     purposes but would not appear in the program, are enclosed in the 
     symbols (* *). 

     (* program begins *)
     DIGEST      VERSION  5/15/90

     Enter input filename:               
     B:PUC18.RES                         (* IBM-PC DOS protocol *)
     Enter output filename:              
     PRN                                 (* IBM-PC DOS protocol*)

     _____________________________________________________________________
     DIGEST                     MAIN MENU
     _____________________________________________________________________
     Input file:        B:PUC18.RES 
     Output file:       PRN
     _____________________________________________________________________
                         1) Read in a new sequence
                         2) Open a new output file
                         3) Generate digests (output to screen)
                         4) Generate digests (output to file)
     _____________________________________________________________________
     Type the number of your choice  (0 to quit program)
     3
                    (* MAIN LOOP *)
     The names of enzymes with known sites in pUC18
     will be displayed a screenful at a time.
     You will be asked to specify enzymes one at a time
     to include in this digest.
     There are 10 enzymes listed.
     Press RETURN to begin.

     (* User presses RETURN and enzyme names appear on screen.
     User types numbers of enzymes one at a time and 0 when done.
     A "+" appears next to enzymes that have been added to the digest.*)

          1) Ava2      2) BamH1     3) Dra1       4) EcoR1      5) Hind3
          6) Hinf1     7) Kpn1      8) Pst1       9) Pvu2      10) Taq1

     Type number of an enzyme or 0 for more enzymes:
     10

          1) Ava2      2) BamH1     3) Dra1       4) EcoR1      5) Hind3
          6) Hinf1     7) Kpn1      8) Pst1       9) Pvu2      10)+Taq1

     Type number of an enzyme or 0 for more enzymes:
     9
          1) Ava2      2) BamH1     3) Dra1       4) EcoR1      5) Hind3
          6) Hinf1     7) Kpn1      8) Pst1       9)+Pvu2      10)+Taq1

     Type number of an enzyme or 0 for more enzymes:
     0
     
     Type C for complete digests, P for partial:
     C
     
     Type  D to generate a digest, Q to quit

     (* The user may repeat the cycle and print new digests by        *)
     (* typing D.  Typing Q ends the program.                         *)
     (* Below are examples of what output might look like.  The first *)
     (* and third are complete digests; the second is a partial one.  *)


     DIGEST            Version  5/15/90
     pUC18  Configuration:  CIRCULAR Length:           2686 bp
                                   # of
                                   Sites   Frags   Begin            End
     Taq1      TCGA                   4
     Pvu2      CAGCTG                 2
                                           1444     907Taq1        2350Taq1
                                            644    2351Taq1         308Pvu2
                                            276     631Pvu2         906Taq1
                                            182     449Taq1         630Pvu2
                                            110     309Pvu2         418Taq1
                                             30     419Taq1         448Taq1

     EcoR1     GAATTC                 1
     Pvu2      CAGCTG                 2
                                           2686     309Pvu2         308Pvu2
                                           2686     451EcoR1        450EcoR1
                                           2686     631Pvu2         630Pvu2
                                           2544     451EcoR1        308Pvu2
                                           2506     631Pvu2         450EcoR1
                                           2364     631Pvu2         308Pvu2
                                            322     309Pvu2         630Pvu2
                                            180     451EcoR1        630Pvu2
                                            142     309Pvu2         450EcoR1

     Kpn1      GGTACC                 1
     Pst1      CTGCAG                 1
     Hind3     AAGCTT                 1
                                           2643     443Kpn1         399Hind3
                                             27     416Pst1         442Kpn1
                                             16     400Hind3        415Pst1


     III. What the Output Means
     The column "# of Sites" tells how many sites are found for each enzyme.
     See note 2 in section V for the equations predicting the numbers of
     fragments in partial or complete digests.  The column "Frags" lists the
     sizes of the fragments produced by digestion with the given enzymes, in
     descending order of size, as they would appear on a gel. The columns
     "Begin" and "End" give the position of the cut, and the enzyme that made
     it, at the 5' and 3' ends respectively of each fragment in "Frags". 
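
     The relationship between site positions and the "Frags", "Begin" and
     "End" columns can be sketched in Python.  This is an illustration only,
     not DIGEST's own code: the function name digest and its arguments are
     made up here, it handles circular molecules only, and it assumes that a
     site position is the first base of the fragment on its 3' side, which
     is what the "Begin" column of the example output suggests.  The site
     positions are read off that example output.

```python
from itertools import product


def digest(sites, length, partial=False):
    """Fragments of a CIRCULAR molecule, largest first (hypothetical helper).

    sites   -- dict mapping enzyme name to a list of cut positions
    length  -- length of the molecule in bp
    partial -- if True, list every fragment a partial digest can yield

    Each fragment is (size, begin, begin_enzyme, end, end_enzyme).
    """
    # One sorted list of (position, enzyme) cut points for all enzymes.
    cuts = sorted((pos, enz) for enz, positions in sites.items()
                  for pos in positions)
    n = len(cuts)
    if partial:
        # Any cut may be paired with any cut, itself included; pairing a
        # cut with itself gives the full-length linearised molecule.
        pairs = product(range(n), repeat=2)
    else:
        # Complete digest: each cut is paired with the next one around.
        pairs = ((i, (i + 1) % n) for i in range(n))

    fragments = []
    for i, j in pairs:
        begin, begin_enz = cuts[i]
        next_cut, end_enz = cuts[j]
        # Assumption: the fragment ends one base before the next cut.
        end = (next_cut - 2) % length + 1
        size = (end - begin) % length + 1
        fragments.append((size, begin, begin_enz, end, end_enz))
    # Descending size; ties are broken by ascending begin position.
    return sorted(fragments, key=lambda f: (-f[0], f[1]))


PUC18_LENGTH = 2686
taq_pvu = {"Taq1": [419, 449, 907, 2351], "Pvu2": [309, 631]}
for size, begin, b_enz, end, e_enz in digest(taq_pvu, PUC18_LENGTH):
    print(f"{size:>8} {begin:>7}{b_enz:<8} {end:>7}{e_enz}")
```

     Under these assumptions the loop prints the six Taq1/Pvu2 fragments in
     the same order as the first example in section II, and calling
     digest({"EcoR1": [451], "Pvu2": [309, 631]}, 2686, partial=True)
     yields the nine fragments of the second example.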

     IV. Input file
     Generally, DIGEST uses output from INTREST or BACHREST directly as 
     input.  However, DIGEST can also predict restriction fragments for
     molecules whose DNA sequence is not known.  If one knows the 
     restriction sites for several enzymes, it is possible to create a 
     datafile in the same format as the INTREST/BACHREST output.  Although 
     DIGEST does not tolerate much deviation from the INTREST format, some 
     parts may be omitted, as discussed below.  Here is a general template 
     for the input file.  <Items enclosed in angle brackets> represent 
     information that must be supplied by the user. [Items enclosed in square 
     brackets] are optional. 


     <title line, may be blank>
     <seq.name> Configuration: <CIRCULAR or LINEAR> Length: <number> [bp]
     <title line, may be blank>
     <title line, may be blank>
     <enz.name> <recognition seq.> <number> [([<number>])]  <num. of sites>
     <site1>
     <site2>
     ......
     <siteN>
     <blank line>

     In constructing such a file, the following rules apply:

     1.  The first line is a title line and is ignored.
     2.  The sequence name may be up to 20 non-blank characters.
     3.  All enzymes must include a name, a recognition sequence, a cutting
         site, and the number of sites found. These data items must all be on
         the same line, and must be separated by one or more blanks.
     4.  Restriction enzyme names must be ten or fewer characters. Blanks are
         not permitted.
     5.  If the recognition sequence is asymmetric (i.e., the inverse 
         complement is not the same as the original site), a second cutting 
         site must be included after the first. This number must be enclosed 
         in parentheses, as in INTREST/BACHREST output. (Empty parentheses 
         are permitted if you're too lazy to include a number.) 
     6.  The positions of the sites found are listed below the enzyme, one 
         site per line.  The information in the columns "Frags","Begin", and 
         "End" may be omitted entirely. 
     7.  A blank line must separate each enzyme listing.
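
     The template and rules above can be checked against a small reader
     written in Python.  This is a sketch under stated assumptions, not the
     parser DIGEST uses: the names parse_sites, HEADER and ENZYME are 
     invented for the example, lines 3 and 4 are skipped as title lines as 
     the template suggests, and the cutting-site numbers in the sample file 
     are illustrative values only.

```python
import re

# Line 2: <seq.name> Configuration: <CIRCULAR or LINEAR> Length: <number> [bp]
HEADER = re.compile(
    r"^\s*(\S+)\s+Configuration:\s*(CIRCULAR|LINEAR)\s+Length:\s*(\d+)")
# <enz.name> <recognition seq.> <number> [([<number>])] <num. of sites>
ENZYME = re.compile(
    r"^\s*(\S+)\s+(\S+)\s+(-?\d+)\s*(?:\(\s*(-?\d*)\s*\))?\s+(\d+)\s*$")


def parse_sites(text):
    """Read a site list in the layout described above (hypothetical helper)."""
    lines = text.splitlines()
    header = HEADER.match(lines[1])          # line 1 is a title and is ignored
    if header is None:
        raise ValueError("line 2 must give name, Configuration: and Length:")
    molecule = {
        "name": header.group(1),
        "circular": header.group(2) == "CIRCULAR",
        "length": int(header.group(3)),
        "enzymes": {},
    }
    current = None
    for line in lines[4:]:                   # assumes lines 3-4 are titles
        if not line.strip():                 # a blank line ends an enzyme
            current = None
        elif current is None:                # first line of an enzyme listing
            m = ENZYME.match(line)
            if m is None:
                raise ValueError("bad enzyme line: " + line)
            current = {
                "recognition": m.group(2),
                "cut": int(m.group(3)),
                # None when the parentheses are absent or left empty
                "cut2": int(m.group(4)) if m.group(4) else None,
                "declared_sites": int(m.group(5)),
                "sites": [],
            }
            molecule["enzymes"][m.group(1)] = current
        else:                                # site line; extra columns ignored
            current["sites"].append(int(line.split()[0]))
    return molecule


# Hand-made sample; the cutting-site numbers (1 and 3) are illustrative.
SAMPLE = "\n".join([
    "hand-made site list",
    "pUC18  Configuration:  CIRCULAR Length:  2686 bp",
    "",
    "",
    "EcoR1     GAATTC   1   1",
    "451",
    "",
    "Pvu2      CAGCTG   3   2",
    "309",
    "631",
    "",
])
print(parse_sites(SAMPLE)["enzymes"]["Pvu2"]["sites"])
```

     Only the first number on each site line is read, so listings that 
     still carry the "Frags", "Begin" and "End" columns are accepted as 
     well as bare positions.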

     V. Usage Notes
     1.  As many enzymes as you wish may be included in a given digest.  In 
         practice, this is usually not more than three.
 
     2.  The numbers of expected fragments in complete or partial digests of
         circular or linear molecules with n cutting sites are defined below:

  
                         circular                         linear
               -------------------------------------------------------------
              |                               |                             |
    complete  |  f   (n) =  n                 |  f   (n) = n + 1            |
              |   c,c                         |   c,l                       |
              |                               |                             |
              |-------------------------------+-----------------------------|
              |              2                |                          2  |
    partial   | f   (n) =   n                 |             (n+1) + (n+1)   |
              |  p,c                          |  f   (n) =  -------------   |
              |                               |   p,l             2         |
               -------------------------------------------------------------
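
     The four formulas in the table can be written out and cross-checked by
     brute force in Python.  The function names below are invented for this
     illustration.  The enumeration assumes that a fragment of a linear
     molecule is any stretch between two of its n + 2 boundaries (n sites
     plus the two ends), and that a fragment of a circular molecule is any
     ordered pair of begin and end sites, where a site paired with itself
     is the full-length linearised molecule.  The circular formulas are 
     meant for n >= 1.

```python
from itertools import combinations, product


def expected_fragments(n, circular=True, partial=False):
    """Number of fragments for n cutting sites, as given in the table."""
    if partial:
        return n * n if circular else ((n + 1) ** 2 + (n + 1)) // 2
    return n if circular else n + 1


def count_by_enumeration(n, circular=True, partial=False):
    """Count the fragments by listing them (see assumptions above)."""
    if circular:
        if partial:
            pairs = list(product(range(n), repeat=2))
        else:
            pairs = [(i, (i + 1) % n) for i in range(n)]
    else:
        bounds = range(n + 2)                # two ends plus n sites
        if partial:
            pairs = list(combinations(bounds, 2))
        else:
            pairs = list(zip(bounds, bounds[1:]))
    return len(pairs)


for n in range(1, 8):
    for circular in (True, False):
        for partial in (True, False):
            assert (expected_fragments(n, circular, partial)
                    == count_by_enumeration(n, circular, partial))

print(expected_fragments(6))                 # Taq1 + Pvu2, complete
print(expected_fragments(3, partial=True))   # EcoR1 + Pvu2, partial
```

     The two printed values, 6 and 9, agree with the numbers of fragments 
     listed for those enzyme combinations in the example output of 
     section II.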