(* begin module version *)
version = 1.04  of libdef  1987 February 10
                Last major changes:  1980 June 9


     LL        IIIIIIII  BBBBBBB   DDDDDDD   EEEEEEEE  FFFFFFFF
     LL           II     BB    BB  DD    DD  EE        FF
     LL           II     BB    BB  DD    DD  EE        FF
     LL           II     BBBBBBB   DD    DD  EEEE      FFFF
     LL           II     BB    BB  DD    DD  EE        FF
     LL           II     BB    BB  DD    DD  EE        FF
     LL           II     BB    BB  DD    DD  EE        FF
     LL           II     BB    BB  DD    DD  EE        FF
     LLLLLLLL  IIIIIIII  BBBBBBB   DDDDDDD   EEEEEEEE  FF


(* end module version *)
(* begin module page.0 *)
         ORGANISM and RECOGNITION CLASS LIBRARY DEFINITION:
         a DNA sequence data-base

         THOMAS SCHNEIDER,
         MCD Biology  University of Colorado, Boulder Colorado
         COPYRIGHT (C) 1987

    This document defines a way of organizing and storing the DNA sequence
 information from many organisms.  The library of sequences is structured
 so that a person can easily pull out sets of sequences for study.  The
 characteristics of these sequence sets, called "books", are also described.
 One can obtain a book using a language called Delila:  DEoxyribonucleic
 acid LIbrary LAnguage.  The structure of the library also allows
 easy updating and expansion as new sequences are obtained.

    Feel free to refer to the examples (pages 6.1 through 6.6).

    page 1.1         LIBRARY DEFINITION
    page 1.2         NOTES ON THE LIBRARY DEFINITION

    page 2.1         BOOK DEFINITION
    page 2.2         NOTES ON THE BOOK DEFINITION

    page 3           CATALOGUE DEFINITION

    page 4           LIBRARIAN

    page 5.1         DELILA INSTRUCTIONS
    page 5.2         NOTES ON THE DELILA INSTRUCTION DEFINITION

    page 6.1         EXAMPLE LIBRARY
    page 6.2         EXAMPLE CATALOGUE FOR HUMANS
    page 6.3         EXAMPLE CATALOGUE FOR THE LIBRARIAN
    page 6.4         EXAMPLES OF DELILA INSTRUCTIONS AND BOOKS
    page 6.4.1       EXAMPLE 1: lacZ TRANSCRIPT
    page 6.4.2       EXAMPLE 2: cI AND cro GENES
    page 6.4.3       EXAMPLE 3: GENE STARTS
    page 6.4.4       EXAMPLE 4: OTHER DELILA INSTRUCTIONS
    page 6.5         EXAMPLE OF AN AUXILIARY PROGRAM: PARSE
    page 6.6         EXAMPLE OF AN AUXILIARY PROGRAM: LISTER

    page a.1         APPENDIX 1:  DATA-BASING TECHNIQUES
    page a.2         APPENDIX 2:  IMPLEMENTATION
    page a.3         APPENDIX 3:  LIBRARY DESIGN PHILOSOPHY

    page to do 1     THINGS TO DO AND QUESTIONS TO ANSWER
(* end module page.0 *)
(* begin module page.1.1.1 *)
         LIBRARY DEFINITION

         SCHEMA

    Below is an overview of the structure of the library.
         A-->>--B  means A has one or more of B.
         C--->--D  means C has one of D.

                             LIBRARY
                              :   :
                              V   V
                              V   V
                              :   :
                  ............:   :.............
                  :                            :
              ORGANISM                 RECOGNITION-CLASS
                  :                            :
                  V                            V
                  V                            V
                  :                            :
              CHROMOSOME                       :
               : : : :                         :
               V V V V                         :
               V V V V                         :
               : : : :                         :
   ............: : : :........                 :
   :       ......: :....     :                 :
   :       :           :     :                 :
  MARKER  TRANSCRIPT  GENE  PIECE......    ENZYME
   : :     :           :     : : :    :        :
   V V     V           V     : : :    V        V
   : :     :           :     : : :    :        V
   : :     :           :.....: : :    :        :
   : :     :...................: :    :        :
   : :...........................:    :        :
   :                                  :        :
  DNA                                DNA      RECOGNITION-SITE

    On the next three pages the structure of the library is defined.
 This is followed by explanatory notes.

 Words surrounded by pointed brackets (greater than and less than)
 are things to be defined.  We want to define the <library>.

 ::= means "is defined to be".

 . . .  means there can be none or more.

 Lines starting with an *, (eg. "* KEY NAME OF STRUCTURE") represent
 information stored in the library.
    The definition is in a modified Backus-Naur Form (BNF).

    One could read the definition as:
 "A library consists of a line of information which gives its dates of
 creation and title, and a set of organisms and recognition classes.
 An organism consists of a line with the capitalized word ORGANISM, an
 organism key, and a set of chromosomes . . ."
(* end module page.1.1.1 *)
(* begin module page.1.1.2 *)
         LIBRARY DEFINITION (continued)

    LIBRARY STRUCTURE

 <library>::=      [ * DATE OF CREATION, DATE OF SOURCE LIBRARY, TITLE
                   [   <organism>
                   [   <organism>
                   [   . . .
                   [   <organism>
                   [   <recognition class>
                   [   <recognition class>
                   [   . . .
                   [   <recognition class>

 <organism>::=     [ ORGANISM
                   [   <organism key>
                   [   <chromosome>
                   [   <chromosome>
                   [   . . .
                   [   <chromosome>
                   [ ORGANISM

 <chromosome>::=   [ CHROMOSOME
                   [   <chromosome key>
                   [   <marker>
                   [   <marker>
                   [   . . .
                   [   <marker>
                   [   <transcript>
                   [   <transcript>
                   [   . . .
                   [   <transcript>
                   [   <gene>
                   [   <gene>
                   [   . . .
                   [   <gene>
                   [   <piece>
                   [   <piece>
                   [   . . .
                   [   <piece>
                   [ CHROMOSOME

 <marker>::=       [ MARKER
                   [   <marker key>
                   [   <DNA>
                   [ MARKER

 <transcript>::=   [ TRANSCRIPT
                   [   <transcript key>
                   [ TRANSCRIPT

 <gene>::=         [ GENE
                   [   <gene key>
                   [ GENE
(* end module page.1.1.2 *)
(* begin module page.1.1.3 *)
         LIBRARY DEFINITION (continued)

 <piece>::=        [ PIECE
                   [   <piece key>
                   [   <DNA>
                   [ PIECE

 <DNA>::=          [ DNA
                   [ * a string of A, C, G, or T, 5' to 3'
                   [ * . . .
                   [ DNA

 <recognition
         class>::= [ RECOGNITION-CLASS
                   [   <recognition class key>
                   [   <enzyme>
                   [   <enzyme>
                   [   . . .
                   [   <enzyme>
                   [ RECOGNITION-CLASS

 <enzyme>::=       [ ENZYME
                   [   <enzyme key>
                   [   <recognition site>
                   [   <recognition site>
                   [   . . .
                   [   <recognition site>
                   [ ENZYME

 <recognition
          site>::= [ SITE
                   [ * a string of A, C, G, T, PU, PY or N, 5' to 3'
                   [ * with special characters (,/,),V,*,?
                   [ * . . .
                   [ SITE
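
    As a sketch of how a <recognition site> might be used by a program,
 the Python fragment below turns a site string into a regular expression
 that matches the recognized DNA.  It assumes that the symbols of a site
 are written one after another without separators (so "PU" and "PY" are
 two-letter symbols), and it uses the symbol meanings listed in note 12
 of the NOTES ON THE LIBRARY DEFINITION.  The names site_to_regex and
 site_matches are hypothetical; they are not part of the library
 definition.

```python
import re

BASES = ("A", "C", "G", "T")

# Symbol meanings as listed in note 12 of the library definition notes.
_CLASSES = {"A": "A", "C": "C", "G": "G", "T": "T",
            "PU": "[AG]", "PY": "[CT]", "N": "[ACGT]"}


def site_to_regex(site):
    """Translate a recognition-site string into a regular expression.

    Assumption: the cleavage mark (V) and the modification mark (*) do
    not change which bases are recognized, so they are skipped here.
    """
    out = []
    i = 0
    while i < len(site):
        ch = site[i]
        if ch in ("V", "*"):
            i += 1
        elif ch == "?":
            raise ValueError("site sequence is unknown")
        elif ch == "(":
            close = site.index(")", i)
            x, y = site[i + 1:close].split("/")
            if x not in BASES or y not in BASES:
                raise ValueError("(X/Y) needs two bases")
            out.append("[" + x + y + "]")
            i = close + 1
        elif site.startswith(("PU", "PY"), i):
            out.append(_CLASSES[site[i:i + 2]])
            i += 2
        elif ch in _CLASSES:
            out.append(_CLASSES[ch])
            i += 1
        else:
            raise ValueError("unexpected character %r in site" % ch)
    return "".join(out)


def site_matches(site, dna):
    """True if the whole of dna is recognized by the site."""
    return re.fullmatch(site_to_regex(site), dna) is not None
```

    For example, site_to_regex("GTPYPUAC") gives "GT[CT][AG]AC", and
 site_matches("GGNCC", "GGTCC") is true.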
(* end module page.1.1.3 *)
(* begin module page.1.1.4 *)
         LIBRARY DEFINITION (continued)

    LIBRARY KEYS

 <header>::=       [ * KEY NAME OF STRUCTURE
                   [ * FULL NAME OF STRUCTURE
                   [   <note key>

 <note key>::=     [ NOTE
                   [ * NOTES AND SPECIAL INFORMATION
                   [ * . . .
                   [ NOTE

 <organism key>::= [   <header>
                   [ * GENETIC MAP UNITS

 <chromosome
           key>::= [   <header>
                   [ * GENETIC MAP BEGINNING (REAL NUMBER)
                   [ * GENETIC MAP ENDING (REAL NUMBER)

 <reference
        to DNA>::= [ * KEY NAME OF PIECE WHERE FOUND
                   [ * GENETIC MAP BEGINNING ON COORDINATES (REAL NUMBER)
                   [ * DIRECTION (+/-) RELATIVE TO COORDINATES
                   [ * BEGINNING NUCLEOTIDE (INTEGER)
                   [ * ENDING NUCLEOTIDE (INTEGER)

 <marker key>::=   [   <header>
                   [   <reference to DNA>
                   [ * STATE (ON/OFF)
                   [ * PHENOTYPE

 <transcript
           key>::= [   <header>
                   [   <reference to DNA>

 <gene key>::=     [   <header>
                   [   <reference to DNA>

 <piece key>::=    [   <header>
                   [ * GENETIC MAP BEGINNING OF COORDINATES (REAL NUMBER)
                   [ * COORDINATE: CONFIGURATION (CIRCULAR/LINEAR)
                   [ * COORDINATE: DIRECTION (+/-) RELATIVE TO GENETIC MAP
                   [ * COORDINATE: BEGINNING NUCLEOTIDE (INTEGER)
                   [ * COORDINATE: ENDING NUCLEOTIDE (INTEGER)
                   [ * PIECE     : CONFIGURATION (CIRCULAR/LINEAR)
                   [ * PIECE     : DIRECTION (+/-) RELATIVE TO COORDINATES
                   [ * PIECE     : BEGINNING NUCLEOTIDE (INTEGER)
                   [ * PIECE     : ENDING NUCLEOTIDE (INTEGER)

 <recognition
     class key>::= [   <header>

 <enzyme key>::=   [   <header>

(* end module page.1.1.4 *)
(* begin module page.1.2.1 *)
        NOTES ON THE LIBRARY DEFINITION (refer to EXAMPLE LIBRARY)

 1) The dates and title allow one to know what version of the library one is
    using.  The first date is the date the library was created or modified.
    The second date is the date of creation of the previous version of this
    library.  The title is a short descriptive name of the library.

 2) Organisms are organized by taxonomy.  At present the organisms are
    simply ordered.  Later, they could be subdivided by kingdom, phylum, etc.

 3) If the entire DNA sequence of a chromosome is known, then that
    chromosome will have one piece.  If only parts are known, then
    each separate known piece is recorded.

 4) Under all circumstances, only one strand of DNA is stored, and it
    is always 5' to 3'.  All RNA is stored like DNA, using T's (even
    RNA viruses).   DNA could be stored in binary format, 2 bits per nucleotide.

 5) The various structures have a header with two names.  The KEY NAME
    (eg. "ECOLI") is provided for rapid access to the structure.
    The FULL NAME (eg. "ESCHERICHIA COLI K12") provides more information
    than the KEY NAME.

 6) The note keys are optional.  They could contain:
    a) the source of the information in the literature,
    b) locations of special modifications, etc.

 7) The GENETIC MAP keys allow one to relate the DNA to the genetic map.
    They also allow one to store markers that have not yet been sequenced.
    GENETIC MAP UNITS: The units with which one records genetic distances.
    These units can be real numbers, so all genetic positions are real numbers.
    GENETIC MAP BEGINNING/ENDING:  The genetic range of this chromosome.
    GENETIC MAP BEGINNING:  This number corresponds to the beginning
    of the coordinate system or the beginning of a structure such as
    a gene (see below).  This means that one need not know the exact
    relationship between base pair and genetic distances.

 8) Each piece has its own numbering system, a consecutive
    set of integers.  These numbers form a coordinate system relative
    to the genetic map.  Four pieces of information orient the coordinate
    system relative to the genetic map:

    a) the CONFIGURATION of the coordinate system refers to its topological
    shape.  It is either CIRCULAR or LINEAR.  A fragment of a chromosome
    is LINEAR, while a CIRCULAR piece is stored as a linear series of
    letters.

    b) The DIRECTION defines the orientation of the numbering system with
    respect to the genetic map.  + means "in the same direction as".

    c,d) The BEGINNING and ENDING are integers which specify the limits of
    the coordinate system.  The value of ENDING is always greater than
    that of BEGINNING.

    The coordinate system provides a framework for stating the exact
    numbering of the bases in the DNA of the piece.  This also requires
    four pieces of information: configuration, direction, beginning and
    ending, all relative to the coordinate system.
(* end module page.1.2.1 *)
(* begin module page.1.2.2 *)
         NOTES ON THE LIBRARY DEFINITION (continued)

 8) (continued)
    Structures, such as genes, which refer to a piece using a reference
    also use the coordinate system.  In this way, these structures can point
    to stretches of DNA within a piece.

    One consequence of this system is that a circular piece may be
    stored in the library "rotated" so that the cut in the sequence (to
    allow linear storage) may be anywhere on the circle.

    Since the numbering is consecutive, numbering systems which have no
    zero are modified by adding 1 to all negative numbers.  This creates a
    zero and allows insertion into the library.

 9) The marker, when applied to a piece (see instructions for how
    to do it) deletes the bases between (but not including) its BEGINNING and
    ENDING, and then inserts its DNA in between.  Reference numbers one base
    outside the coordinate system may be used to modify DNA just at the edge of
    the coordinates.  Markers will handle all forms of deletion, insertion,
    mutation and splicing!!  In the library, all markers are OFF, meaning that
    the marker was not applied to the DNA in the piece.  When a marker is
    applied to the DNA its state will be ON.  If a marker does not refer to any
    piece, then its NAME OF PIECE WHERE FOUND will be NONE.  The
    marker PHENOTYPE allows one to store the effects of the marker.  The
    PHENOTYPE is like a key name:  it is short to allow rapid searches.

 10) For the purposes of this library, a gene is considered to be the
    DNA limits of a translated region of a transcript (message, mRNA, or
    "gene product").  Every gene has a transcript and is translated to
    protein.  As a result, tRNA is considered to be a modified transcript,
    not a gene product.
    A "transcript" refers to DNA which codes for RNA, and
    a "gene" refers to DNA which codes for protein.

 11) The gene limits (BEGINNING to ENDING) go from the A of the ATG (or
    G of GTG) to the third base of the stop codon.

 12) The recognition class allows storage of restriction enzyme
    recognition sites (eg. ECORI, HAEIII), insertion sequence recognition
    sites or other sites recognized by enzymes.  Each enzyme has a set of sites
    that it can recognize.
         PU = purine = A or G
         PY = pyrimidine = C or T
         N = any base = A or C or G or T
         * = next base is modified
         V = cleavage point of enzyme
         ? = site sequence is unknown
         (X/Y) = base X or base Y is recognized

 13) The first letter of each line in the library will be one of
        "*",
        "O","C","M","T","G","P","D",
        "R","E","S",
        "N".
    These letters make the library into a tree.   Note that the nucleotides are
    completely segregated from these letters or words.   If the DNA is stored
    2 bits/NT, then DNA sections of the library would not have lines, although
    other parts of the library still could.
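
    The effect of a marker (note 9) can be sketched in a few lines of
 Python.  This is only an illustration under stated assumptions: the
 piece is LINEAR, its bases are numbered upward from piece_beginning,
 and the marker BEGINNING is less than its ENDING.  The function name
 apply_marker and its parameters are hypothetical.

```python
def apply_marker(piece_dna, piece_beginning, marker_beginning,
                 marker_ending, marker_dna):
    """Apply a marker to the DNA of a piece, following note 9.

    The bases strictly between the two marker limits are deleted and
    marker_dna is inserted in their place.  The limit bases themselves
    are kept.
    """
    if marker_beginning >= marker_ending:
        raise ValueError("marker BEGINNING must be less than ENDING")
    # Index just past the BEGINNING base, and index of the ENDING base.
    left = marker_beginning - piece_beginning + 1
    right = marker_ending - piece_beginning
    # Limits one base outside the piece are allowed (left == 0 or
    # right == len(piece_dna)), so that the edges can be modified.
    if left < 0 or right > len(piece_dna):
        raise ValueError("marker limits lie outside the piece")
    return piece_dna[:left] + marker_dna + piece_dna[right:]
```

    With the piece "ACGTACGT" numbered from 1, a marker from 3 to 6 with
 no DNA deletes bases 4 and 5 and gives "ACGCGT"; a marker from 3 to 4
 with DNA "TT" is a pure insertion and gives "ACGTTTACGT"; a marker from
 3 to 5 with DNA "G" changes base 4 and gives "ACGGACGT".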
(* end module page.1.2.2 *)
(* begin module page.1.2.3 *)
         NOTES ON THE LIBRARY DEFINITION (continued)

 14) The tree structure of the library allows automatic checking for proper
    library format.  The structures which are surrounded by letters (see
    the previous note, not including "*") can each be read by individual
    subroutines.  The first letter triggers the subroutine call, and the last
    letter triggers return from the subroutine.  With this convention, access
    to any of the data is simple. (see appendix 1.)

 15) Structures have an order from highest (largest) to lowest (smallest).
    For example, in order from superstructure to substructure one has:

                      SUPERSTRUCTURE
                           A
                           :     ORGANISM     :
                           :    CHROMOSOME    :
                           :      MARKER      :
                           :       DNA        :
                                              V
                                         SUBSTRUCTURE

    Organism is a superstructure to (is larger than) chromosome, and so on.
    The entire order is best seen in the schema.

 16) There are three ways to store information about DNA:

    a) Storage of a DNA sequence (piece's DNA).
    b) Storage of a change to a DNA sequence (marker's DNA).
    c) Storage of the recognition of a DNA sequence (enzyme's site).

    These are represented by the three lowest leaves of the schema.
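
    The reading convention of note 14 (the first letter of an opening
 line triggers a subroutine call, and the matching closing line triggers
 the return) can be sketched with one recursive Python function.  The
 sketch assumes that every line starts in the first column, that there
 are no blank lines, and that a structure never directly contains another
 structure with the same first letter.  The function names and the small
 example in the usage note are made up for illustration.

```python
OPENERS = set("OCMTGPDRESN")   # first letters listed in note 13


def read_structure(lines, pos=0):
    """Read one structure that opens at lines[pos].

    Returns (node, next_pos), where node is (first_letter, children)
    and each child is either the text of a "*" line or another node.
    """
    opener = lines[pos][0]
    if opener not in OPENERS:
        raise ValueError("line %d does not open a structure" % (pos + 1))
    children = []
    pos += 1
    while pos < len(lines):
        line = lines[pos]
        if line[0] == "*":
            children.append(line[1:].strip())
            pos += 1
        elif line[0] == opener:
            return (opener, children), pos + 1      # closing line
        else:
            child, pos = read_structure(lines, pos)  # nested structure
            children.append(child)
    raise ValueError("structure %r opened but never closed" % opener)


def read_library(lines):
    """Read the title line, then every top-level structure."""
    if not lines or lines[0][0] != "*":
        raise ValueError("library must begin with a '*' title line")
    title = lines[0][1:].strip()
    nodes, pos = [], 1
    while pos < len(lines):
        node, pos = read_structure(lines, pos)
        nodes.append(node)
    return title, nodes
```

    Given the lines "* TITLE", "ORGANISM", "* ECOLI", "PIECE", "* LAC",
 "DNA", "* ACGT", "DNA", "PIECE", "ORGANISM" (a shortened, invented
 example that skips the chromosome level), read_library returns the title
 and one nested node for the organism.  A missing closing line is
 reported as an error, which is the automatic format check that note 14
 refers to.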
(* end module page.1.2.3 *)
(* begin module page.2.1 *)
         BOOK DEFINITION

    A book is a subset of the information contained in the library.
 A book is requested by a user who then writes programs to analyze
 the data contained in the book.

 <book>::=         [ * DATE OF WITHDRAWAL, DATE OF LIBRARY CREATION, BOOK TITLE
                   [   <organism>
                   [   <organism>
                   [   . . .
                   [   <organism>
                   [   <recognition class>
                   [   <recognition class>
                   [   . . .
                   [   <recognition class>
(* end module page.2.1 *)
(* begin module page.2.2 *)
         NOTES ON THE BOOK DEFINITION (refer to EXAMPLE BOOK)

 1) The first line of the book allows one to identify the book, and can
    be used as a header for programs that use books.  The dates are
    supplied by the librarian automatically, while the BOOK TITLE is
    written by the user.  This line is exactly the same as the first
    line of a library. (see appendix 3.)

 2) The DATE OF LIBRARY CREATION allows one to keep track of the source
    library, as new versions of the library are made.  The DATE OF WITHDRAWAL
    allows one to distinguish books with the same title made a few
    moments apart.

 3) Organism and recognition class have the same definition in both
    the library and a book.  This means that
    a) any part of the library can be put into a book,
    b) programs that access books can have the same structure as those
       that access the library.  (However, see Appendix 1.)

 4) A book will contain only the requested DNA sequence; all other
    information from the library will be suppressed.  This means that if
    one requests only a fragment of a piece from the library, then in
    the piece key of the book, the configuration will always be linear, and the
    nucleotide limits of the piece will NOT be the same as those in the
    library, since they will be the limits requested.  The coordinate system,
    however, will be copied faithfully from the library to the book.

 5) A linear fragment of a circular piece will have a
    discontinuity in its numbering system if the fragment lies over the
    boundary of the coordinate system.

 6) A book is formally identical to a library.  This allows one to
    create "sublibraries".  The only restriction to this is that duplicate
    names at the same level in the tree are not allowed.  (see page 3.1,
    item 2b, page 3.2, item 7, and also appendix 3).
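
    The discontinuity mentioned in note 5 can be made concrete with a
 short Python sketch.  It assumes a CIRCULAR coordinate system read in
 the + direction, and it assumes that a fragment whose first number is
 larger than its last number is meant to run across the boundary.  The
 function name fragment_numbers is hypothetical.

```python
def fragment_numbers(coord_beginning, coord_ending, frag_from, frag_to):
    """List the base numbers of a linear fragment of a circular piece."""
    for n in (frag_from, frag_to):
        if not coord_beginning <= n <= coord_ending:
            raise ValueError("%d is outside the coordinate system" % n)
    if frag_from <= frag_to:
        # The fragment does not cross the boundary: plain consecutive run.
        return list(range(frag_from, frag_to + 1))
    # The fragment crosses the boundary: the numbering jumps from
    # coord_ending back to coord_beginning.
    return (list(range(frag_from, coord_ending + 1)) +
            list(range(coord_beginning, frag_to + 1)))
```

    With coordinates 1 to 10, the fragment from 8 to 3 is numbered
 8, 9, 10, 1, 2, 3: the jump from 10 to 1 is the discontinuity.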
(* end module page.2.2 *)
(* begin module page.3.1 *)
         CATALOGUE DEFINITION (refer to EXAMPLE CATALOGUES)

    The purpose of the catalogue program is to check the library,
 integrate new structures into a new library made from the old, and
 to record the structure of the new library in catalogues for both
 people and the librarian program (see LIBRARIAN).

                                     :-------> NEW LIBRARY
                      CATALOGUE      :
     OLD LIBRARY -------- * -------->:-------> NEW CATALOGUE FOR LIBRARIAN
                       PROGRAM       :
                                     :-------> NEW CATALOGUES FOR HUMANS

 The catalogue program:

 1) Reads an "old" library, which has new insertions and modifications in it.

 2) Checks:
    a) the library format: that tree structure and variable types are correct.
       (see Library Definition Notes)
    b) for duplicated key names among the direct substructures
       of each structure, and flags duplicates as incorrect (or changes their
       names to avoid duplications, see item 7 below.)
    c) that genetic markers are within the range of the chromosome.
    d) that marker BEGINNING is not equal to its ENDING.
    e) that transcript direction is consistent with the transcript BEGINNING
       and ENDING nucleotides in linear pieces of DNA.
    f) that gene limits match the DNA code:
       BEGINNING must be ATG or GTG,
       ENDING must be TAG, TGA, or TAA.
       (A check that the number of bases is a multiple of three can be done
       as long as splicing by markers in eukaryotes is taken into account.)
    g) that a piece exists when it is referred to by markers, genes
       or transcripts.
    h) that each coordinate system BEGINNING has a numerical value less than
       the coordinate system ENDING.
    i) that the number of bases implied by the coordinate system and the
       piece is exactly the number found.
    j) that LINEAR coordinate systems have only LINEAR pieces of DNA.
    k) that DNA contains only A, C, G, T
    l) that sites contain only A, C, G, T, PU, PY, N, V, ?,
       * (and the * is followed by at least one more base), and
       (X/Y) where X and Y are members of A, C, G, T.

 3) Produces a new library with the DNA and site data compactly reformatted.
    This feature integrates new inserted sections of the library with
    the old sections. It may also rearrange the storage of items for
    efficient retrieval.  The dates are adjusted to show the time of creation
    of the new library, and the source library.
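
    Checks 2f and 2k above are simple enough to sketch in Python.  The
 sketch assumes a LINEAR piece whose bases are numbered upward from
 piece_beginning and a gene that reads in the + direction; the gene
 limits follow note 11 of the library definition notes (first base of
 the start codon to third base of the stop codon).  The function names
 are hypothetical.

```python
START_CODONS = {"ATG", "GTG"}
STOP_CODONS = {"TAG", "TGA", "TAA"}


def dna_is_valid(dna):
    """Check 2k: the DNA contains only A, C, G and T."""
    return set(dna) <= set("ACGT")


def gene_limits_ok(dna, piece_beginning, gene_beginning, gene_ending):
    """Check 2f: the gene begins with a start codon and ends with a stop."""
    first = gene_beginning - piece_beginning   # index of first gene base
    last = gene_ending - piece_beginning       # index of last gene base
    # A start codon plus a stop codon need at least six bases.
    if first < 0 or last >= len(dna) or last - first < 5:
        return False
    return (dna[first:first + 3] in START_CODONS and
            dna[last - 2:last + 1] in STOP_CODONS)
```

    For the piece "CCATGAAATAGCC" numbered from 1, a gene from 3 to 11
 passes the check, while a gene from 3 to 10 fails because its last three
 bases are not a stop codon.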
(* end module page.3.1 *)
(* begin module page.3.2 *)
         CATALOGUE DEFINITION (continued)

 4) Produces a new catalogue for the librarian (see later) which
    a) is structured in the "outline" form of the library,
    b) contains only the key name, since it is needed to identify a structure.
    c) contains the locations of those structures in the new library
       (eg. file and lines from top of file if the organisms are
       spread over several computer files).
    d) is condensed so that each line (record) contains one "item":
       the opening key symbol (eg O for ORGANISM), name, file and file line.
       This allows rapid scanning of the catalogue.

 5) Produces a new catalogue for humans which
    a) is structured in the "outline" form of the library,
    b) may optionally eliminate some sections of information.
       If listed, DNA should have tick marks and locations noted.
    c) may be indented and paged for easy reading.

 6) Could produce an alternative catalogue in which the structures are
    grouped by their kind (all organism references together, all
    chromosome references together, etc.).  This could be alphabetized
    so that one could easily find items by hand.  A reference to the
    superstructure name would be useful.

 7) A special option of the catalogue program allows one to turn
    books into libraries:
    a) the dates are not modified, since the book was presumably just created.
    b) duplicate names are eliminated.  This requires a set called a
       'family': those items that refer to a piece and the piece
       itself.  The catalogue program considers all items within a chromosome
       prior (in the book) to each piece (and referring to the piece)
       as the family of that piece.  In this way it can decide how to
       rename a duplicate family name.
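
    Item 4 above describes the catalogue for the librarian as one "item"
 per line: opening key symbol, key name, file and file line.  The Python
 sketch below shows one way such items might be collected.  It assumes a
 well-formed library held in a single file, with each line starting in
 the first column and the key name on the "*" line that directly follows
 an opening line.  The function name, the tuple layout and the default
 file name "lib1" are hypothetical; the definition does not fix them.

```python
KEYED = set("OCMTGPRE")   # structures that carry a key name in a header


def librarian_catalogue(lines, file_name="lib1"):
    """Return (symbol, key name, file, line number) for each structure."""
    entries = []
    open_letters = []                  # stack of structures now open
    for number, line in enumerate(lines, start=1):
        letter = line[0]
        if letter == "*":
            continue                   # data line, not a structure line
        if open_letters and open_letters[-1] == letter:
            open_letters.pop()         # this line closes a structure
            continue
        open_letters.append(letter)    # this line opens a structure
        if letter in KEYED:
            # lines is 0-indexed, so lines[number] is the following line.
            key = lines[number][1:].strip()
            entries.append((letter, key, file_name, number))
    return entries
```

    Because NOTE and DNA have no key name of their own, they are tracked
 on the stack but produce no catalogue items in this sketch.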
(* end module page.3.2 *)
(* begin module page.4.1 *)
         LIBRARIAN

    The librarian program takes a set of instructions and creates a book
 from the library.  The librarian uses the catalogue to quickly find
 the location of the requested keys.  The actual DNA nucleotides are then
 determined from the keys in the library.

     LIBRARY ----------->:                    :-------> BOOK
                         :     LIBRARIAN      :
     CATALOGUE --------->:-------- * -------->:
                         :      PROGRAM       :
     INSTRUCTIONS ------>:                    :-------> INSTRUCTION LISTING


         INSTRUCTIONS TO THE LIBRARIAN

 1) Give a title to the book.  (See the definition
    of book).  Default:  none.  Date of withdrawal is still given.

 2) Have a provision to suppress copying of note keys (and other
    keys) to the book.

 3) Specify the various structure keys such as organism, chromosome,
    marker, etc. to be placed in the book.

 4) Specify the pieces of DNA, fragments of pieces or enzyme sites to be
    placed in the book.

 5) Provide a method for numbering the items in the book.


    The INSTRUCTION LISTING has two parts:

    The first lists the instructions in the form they are found in the
    INSTRUCTIONS, and indicates any errors found.

    If there are no errors, then the second part lists the instructions
    and also indicates the numerical values of all the keys and values used
    in the instructions.  This allows one to check that one got what one
    wanted, or to find out the location of a structure.  It will also list
    any errors that arise while the parts of the book are being found
    and the book is being printed.
(* end module page.4.1 *)
(* begin module page.5.1.1 *)
         DELILA INSTRUCTIONS (refer to EXAMPLE DELILA INSTRUCTIONS)

     The DEoxyribonucleic acid
           LIbrary
             LAnguage

                    HUMAN
    CATALOGUE ------- * -------> DELILA INSTRUCTIONS


    As indicated by this diagram, the human is an important link
    in the creation of a book (but that's only natch!).  One looks in the
    catalogue to find the items one is interested in, then one writes a
    set of Delila instructions and runs the librarian to obtain a book
    containing them.

    The BNF for instructions to the library is hard to read, but precise.

    <PART> is a part of the instruction language, whose definition can
         be found elsewhere in the BNF.

    ::= means "is defined to be".

    | means "or".

    words without brackets are part of the instruction language,
    taken literally.

    All parts of <DELILA INSTRUCTION SET> are separated by <BLANK>, although
    this is not shown explicitly.  Wherever there may be a <BLANK>, one
    could have a <COMMENT>.


         DELILA INSTRUCTION SET DEFINITION (BNF)

    <BLANK> is a blank character or a comment
    <EMPTY> is an option to leave out the part

    <DIGIT>::=0|1|2|3|4|5|6|7|8|9
    <LETTER>::=A|B|C|D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z
    <SIGN>::=+|-
    <SYMBOL>::=.|,|:|?|@|#|&|$|*|/|=|(|)
    <CHARACTER>::=<DIGIT>|<LETTER>|<SIGN>|<SYMBOL>|<BLANK>

    <NUMBER>::= <DIGIT>|<NUMBER><DIGIT>
    <SIGNED NUMBER>::=<SIGN><NUMBER>

    <LETTER DIGIT>::=<LETTER>|<DIGIT>
    <LETTERS OR DIGITS>::=<LETTER DIGIT>|<LETTER DIGIT><LETTERS OR DIGITS>
    <IDENTIFIER>::=<LETTER>|<LETTER><LETTERS OR DIGITS>

    <STRING>::=<CHARACTER>|<CHARACTER><STRING>
    <QUOTE STRING>::="<STRING>"|'<STRING>'

    <KEY>::=<IDENTIFIER>

    <COMMENT>::=(* <STRING> *)
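
    The lexical rules above map closely onto regular expressions.  The
 Python sketch below is an approximation, not a full reader for Delila:
 it does not restrict the characters inside a comment to <CHARACTER>, it
 accepts an empty comment, and it does not handle nested comments.  The
 names are hypothetical.

```python
import re

IDENTIFIER = re.compile(r"[A-Z][A-Z0-9]*")    # <LETTER> then letters/digits
NUMBER = re.compile(r"[0-9]+")                # one or more <DIGIT>
SIGNED_NUMBER = re.compile(r"[+-][0-9]+")     # <SIGN><NUMBER>
COMMENT = re.compile(r"\(\*.*?\*\)", re.DOTALL)


def strip_comments(text):
    """Replace each comment by a blank.

    A comment may stand wherever a <BLANK> may, so replacing it with a
    single blank keeps the neighbouring parts separated.
    """
    return COMMENT.sub(" ", text)
```

    For example, IDENTIFIER.fullmatch("T7") succeeds while
 IDENTIFIER.fullmatch("7T") does not, since an identifier must begin
 with a letter.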
(* end module page.5.1.1 *)
(* begin module page.5.1.2 *)
         DELILA INSTRUCTION SET (continued)


    <DELILA INSTRUCTION SET>::=<BOOK TITLE><SET OF INSTRUCTIONS>
      <BOOK TITLE>::=<EMPTY>|TITLE<QUOTE STRING>;

      <SET OF INSTRUCTIONS>::=<INSTRUCTION>|<INSTRUCTION><SET OF INSTRUCTIONS>
        <INSTRUCTION>::=<SPECIFICATION>|<REQUEST>|<IF>
                        |<DEFAULT RESET>|<NOTE INSERTION>


    <SPECIFICATION>::=<STRUCTURE><KEY>;
      <STRUCTURE>::=<ORGANISM STRUCTURE>|<RECOGNITION STRUCTURE>
        <ORGANISM STRUCTURE>::=ORGANISM|CHROMOSOME|MARKER|TRANSCRIPT|GENE|PIECE
        <RECOGNITION STRUCTURE>::=RECOGNITION-CLASS|ENZYME


    <REQUEST>::=GET<RANGE><DIRECTION><WITH>;
      <RANGE>::=<POSITIONS>|<ALL>|<EVERY>

        <POSITIONS>::=FROM<POSITION>TO<POSITION>
          <POSITION>::=<ABSOLUTE><RELATIVE>

            <ABSOLUTE>::=<KEY POSITION>|<NUMBER>
              <KEY POSITION>::=<ABSOLUTE OBJECT><LIMIT>
                <ABSOLUTE OBJECT>::=COORDINATE|<SUBSTRUCTURE>|<RESPECIFICATION>
                  <SUBSTRUCTURE>::=MARKER|TRANSCRIPT|GENE|PIECE
                  <RESPECIFICATION>::=(<PIECE POINTER><KEY>)
                    <PIECE POINTER>::=MARKER|TRANSCRIPT|GENE
                <LIMIT>::=BEGINNING|ENDING

            <RELATIVE>::=<SIGNED NUMBER>|<EMPTY>

        <ALL>::=ALL<STRUCTURE><CUT>
          <CUT>::=<SITE>|<EMPTY>
            <SITE>::=CUT<POSITION>

        <EVERY>::=EVERY<STRUCTURE>

      <DIRECTION>::=<DIRECTION REQUEST>|<EMPTY>
        <DIRECTION REQUEST>::=DIRECTION<DIRECTION VALUE>
          <DIRECTION VALUE>::=<SIGN>|COMPLEMENT|HOMOLOGOUS|<SUBSTRUCTURE>

      <WITH>::=<EMPTY>|WITH(<MARKERS>)
        <MARKERS>::=<MARKER>|<MARKER><MARKERS>
          <MARKER>::=<KEY>
(* end module page.5.1.2 *)
(* begin module page.5.1.3 *)
         DELILA INSTRUCTION SET (continued)

    <IF>::=IF<CONDITION>THEN<INSTRUCTION>ELSE<INSTRUCTION>
      <CONDITION>::=<FUNCTION><RELATIONSHIP><NUMBER>
        <RELATIONSHIP>::=<|<=|=|>=|>|<>
        <FUNCTION>::=<SIZE>
          <SIZE>::=SIZE(<OBJECT>)
            <OBJECT>::=<POSITIONS>|<ALL>


    <DEFAULT RESET>::=DEFAULT<DEFAULT TYPE>;
      <DEFAULT TYPE>::=<KEY DEFAULT>|<RECOGNITION SITE DEFAULT>
                      |<RANGE DEFAULT>|<NUMBERING DEFAULT>
      <STATE>::=ON|OFF

        <KEY DEFAULT>::=KEY<DISPLAY TYPE><STATE>
          <DISPLAY TYPE>::=NOTE|MARKER|TRANSCRIPT|GENE

        <RECOGNITION SITE DEFAULT>::=SITE<SITE OPERATION><STATE>
          <SITE OPERATION>::=EXPAND|MODIFY|CLEAVE

        <RANGE DEFAULT>::=OUT-OF-RANGE<RANGE ACTION>
          <RANGE ACTION>::=REDUCE-RANGE|CONTINUE|HALT

        <NUMBERING DEFAULT>::=NUMBERING<NUMBERING OPERATION>
          <NUMBERING OPERATION>::=<STATE>|<STRUCTURE SET>|<SIGNED NUMBER>
            <STRUCTURE SET>::=<STRUCTURES>|<ALL>
            <STRUCTURES>::=<STRUCTURE>|<STRUCTURE><STRUCTURES>


    <NOTE INSERTION>::=NOTE<QUOTE STRING>
(* end module page.5.1.3 *)
(* begin module page.5.2.1 *)
         NOTES ON THE DELILA INSTRUCTION DEFINITION
              (refer to EXAMPLE DELILA INSTRUCTIONS)

 1) A <DELILA INSTRUCTION SET> consists of a series of instructions
    which tell the librarian how to make a book.  Two types of instructions
    play a major role in Delila:

    a) Specification:  this instruction allows one to move around the
    library tree.  One starts at the top ("LIBRARY" in the schema), and
    moves by stating the type of structure one wants and its name.  Example:
    ORGANISM ECOLI;  One can only move one step down at a time, although
    one can move up any amount at one time.  This means that some movements are
    not allowed; these are called Illegal Tree Traversals.

    b) Request:  this instruction is a request for an item or piece.
    There are many ways to do this, so requests are complicated.  There are
    three types:
         GET FROM <POSITION> TO <POSITION>;
         GET ALL <STRUCTURE>;
         GET EVERY <STRUCTURE>;

    Where <POSITION> is somewhere on the DNA.


    Three instructions of lesser importance are:

    c) IF:  which allows Delila to make choices.

    d) Default resets:  There are a number of variables which can be set by
    the user.  They all start out with values that I like, but if you don't
    like that, you can change them.

    e) Note insertion: The user may add notes to any node of a book
    with this command.

 2) The <BOOK TITLE>, which is optional, defines the book title.
    The title must be in quotes.  Only the first typed line of the title
    is inserted into the book.   See the Book Definition.
    If one does not use this instruction, then there will be no book title.

 3) Each <DELILA INSTRUCTION> is separated from other instructions by
    semicolons;  this allows a flexible format.
(* end module page.5.2.1 *)
(* begin module page.5.2.2 *)
         NOTES ON THE DELILA INSTRUCTION DEFINITION (continued)

 4) <SPECIFICATION>
    The rules of specification are:
       (refer to notes on Library Definition, structures.)
    a) Specifying a structure allows access to the keys of that
       structure, for the purpose of requesting a piece or enzyme.
    b) Specifying a structure makes its substructures unspecified.
    c) Since a structure is specified by key name, all superstructures
       must have been previously specified, or there is an illegal
       tree traversal error.
    d) If a chromosome has been specified, then all those structures
       pointing to the piece can be specified independently of each
       other.  This allows one to use several structures to refer
       simultaneously to the same piece.
    e) When there can only be a single substructure to a structure, then that
       substructure is automatically specified.
       Examples: DNA of marker and piece of marker.

    An exception to item c:  one does not always need to specify the piece of
    DNA of markers, genes, or transcripts, since specification of their
    names will identify the piece through the reference to DNA.  In the
    same way, one never specifies the DNA of markers and pieces or the
    sites of enzymes, since these are intimately linked together.  So
    specification of a recognition class and then several enzymes is all
    that is needed to get these structures, since each enzyme may have
    several recognition sites, all of which must be copied to the book.
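
    The specification rules b and c can be sketched as a small check.
    This is only an illustration in Python, not the librarian's code: the
    PARENT table, descendants and first_illegal_traversal are hypothetical
    names, the tree shape is inferred from the key word descriptions in
    these notes, and the exceptions described above (automatic and
    implied specification) are not modelled.

```python
# Hypothetical model of the library tree (an assumption for illustration).
PARENT = {
    "ORGANISM": "LIBRARY",
    "CHROMOSOME": "ORGANISM",
    "MARKER": "CHROMOSOME",
    "TRANSCRIPT": "CHROMOSOME",
    "GENE": "CHROMOSOME",
    "PIECE": "CHROMOSOME",
    "RECOGNITION-CLASS": "LIBRARY",
    "ENZYME": "RECOGNITION-CLASS",
}


def descendants(structure):
    """Return every structure type below `structure` in the tree."""
    found = set()
    for child, parent in PARENT.items():
        if parent == structure:
            found.add(child)
            found |= descendants(child)
    return found


def first_illegal_traversal(specifications):
    """Return the index of the first illegal specification, or None.

    `specifications` lists structure types in the order they are specified.
    """
    specified = {"LIBRARY"}  # one always starts at the top of the tree
    for index, structure in enumerate(specifications):
        if PARENT[structure] not in specified:   # rule c
            return index
        specified -= descendants(structure)      # rule b
        specified.add(structure)
    return None
```

    With this model, ORGANISM; CHROMOSOME; PIECE; TRANSCRIPT; is accepted,
    while ORGANISM; CHROMOSOME; GENE; ENZYME; is rejected at ENZYME because
    no recognition class was specified.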

    Specification of a marker will not cause its DNA changes to be made,
    and the STATE will be recorded as OFF in the book.  See <WITH> in the
    <REQUEST> instruction for ON markers.


    WHAT KEYS ARE PRINTED IN THE BOOK AND WHEN:
      The organism, chromosome, marker, transcript, gene, recognition
    class and enzyme keys are printed in the book when a <SPECIFICATION>
    instruction is executed.  (Respecification, which is specification within
    a request, will not cause any keys to be printed.)  The only exceptions
    to this are that the piece key is printed by the <REQUEST> command (GET,
    see next section), along with the DNA, and that ON markers are always
    printed, since the marker key contains information on what happened to
    the numbering system of the piece when the marker was applied.

       Marker, transcript or gene keys could also be printed as a result of a
    GET command, when a piece has been specified.  In this case, all markers,
    transcripts and genes which intersect the fragment of the piece requested
    will be printed in the book (unless turned off by the default key, see
    later) followed by the fragment requested.

    In the same way, a request of a transcript will print all genes and
    markers which are in the same direction as the transcript unless they
    are turned off.

        A GET ALL <STRUCTURE> command will print all substructure keys in the
    book (unless turned off by the default key).

        A GET EVERY <STRUCTURE> command will print every example of <STRUCTURE>
    at the current level of specification (unless turned off).
(* end module page.5.2.2 *)
(* begin module page.5.2.3 *)
         NOTES ON THE DELILA INSTRUCTION DEFINITION (continued)

 5) <REQUEST>
       Once a structure has been specified, requests for fragments of DNA
    or structures (such as an organism) can be made.  Each request will print
    in the book a piece (fragment), enzyme or structure as defined in
    the library.  The DNA is always given 5' to 3'.  This means that
    complementary sequences can be obtained simply by switching the first and
    last limits requested: the librarian will provide the complement 5' to 3'.
    (For a circular chromosome, one would also switch the direction.)
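
    To make "the complement 5' to 3'" concrete, the following Python
    sketch shows how the complementary strand relates to the stored
    strand.  It is an illustration only; complement_5_to_3 is a
    hypothetical helper, and it assumes the sequence contains only the
    bases A, C, G and T.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement_5_to_3(sequence):
    """Return the complementary strand, written 5' to 3'.

    The complement runs antiparallel to the stored strand, so the bases
    are complemented and their order is reversed.
    """
    return "".join(COMPLEMENT[base] for base in reversed(sequence))
```

    For example, the first six bases of piece LAC, GACACC, have the
    complement GGTGTC when both are written 5' to 3'.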

       There are three kinds of requests (three kinds of <RANGE>):
    a) GET <POSITIONS> allows one to choose the fragment of DNA.
       <ABSOLUTE> is an absolute fixed location on the DNA to which may be
       added a <RELATIVE> offset.  The piece will be LINEAR.
       One can ask for four kinds of absolute locations.
       i. one may specify the number of a base;
       The other three allow reference by beginning or ending
       (a limit) of an object:
       ii. the coordinate system;
       iii. a marker, transcript, gene or piece limit, previously specified;
       iv. a marker, transcript or gene not previously specified.  This is
       a RESPECIFICATION, and it will not change any of the current
       specifications.

    b) GET <ALL> prints the entire structure in the book.  All substructure
       keys are also printed in the book (unless turned off by <OPTION>).
       Note that even though they are printed they need not have been
       specified.  The option <CUT> applies to circular pieces of DNA
       ONLY, and indicates where the DNA is to be cut.  The cut
       defines the first nucleotide provided to the book.  The piece is still
       CIRCULAR, but has been rotated.  The default cut site is the one
       specified in the library.  If the piece is circular, then
       GET ALL PIECE; will always give a CIRCULAR fragment in the
       book (even though it is "cut").

    c) GET <EVERY> will print every occurrence of the structure in the book.
       The direct superstructure must have been specified.

    The <DIRECTION> is relative to the coordinate system of the piece,
    and defaults to the direction of the structure.  The DIRECTION is redundant
    with the values for BEGINNING and ENDING, except in the case of CIRCULAR
    chromosomes, where the limits are ambiguous without more information.
    It only applies to <POSITIONS> and <ALL>.  COMPLEMENT allows the
    complement of a structure to be obtained without knowledge of the
    orientation of the structure.  For consistency, HOMOLOGOUS may be
    stated, but it is the default.  For GET <POSITIONS><DIRECTION REQUEST> the
    direction is that of the library piece, and GET ALL <STRUCTURE><DIRECTION
    REQUEST> is relative to the <STRUCTURE>.  GET <RANGE> DIRECTION
    <SUBSTRUCTURE> will always take the direction from the <STRUCTURE>, which
    must have been specified.
(* end module page.5.2.3 *)
(* begin module page.5.2.4 *)
         NOTES ON THE DELILA INSTRUCTION DEFINITION (continued)

 5) <REQUEST> (continued)
    The <WITH> option allows one to apply a set of markers (mutations) to
    a piece.  All those markers named using <WITH> will be printed
    in the book BEFORE the piece is printed.  Their state will be
    printed as ON, meaning that the changes that they indicate will be
    done to the DNA of the piece.  The GET<ALL> request can accept <WITH>
    only for transcripts, genes, and pieces.  Overlapping markers can not
    be constructed, since one could not get recombinants between them.  The
    markers are printed in the order of the piece requested.
    <WITH> can not be used with <EVERY>.

 6) <IF> allows one to measure distances between positions or the size of a
    piece and make a decision based on the distance.
    It works like a standard if-then-else in PASCAL EXCEPT that the THEN and
    ELSE are separated by a ";".

    (Later there will be other conditions available.)

 7) The <DEFAULT RESET> instruction has four types:
    a) The <KEY DEFAULT> allows one to suppress certain substructure keys
       of the library: note, marker (OFF), transcript and gene.  One is NEVER
       allowed to turn off any other keys since these are essential pieces of
       information.  So markers listed in a <WITH> (ON markers) can not be
       turned off.  Note that specification and then request of a structure
       WILL NOT override suppression of the key of that structure: the
       piece referred to by the structure will be printed, even
       though the structure key will not be.
       The effect is that the key will be read from the library, but it
       will not be printed in the book.

    b) <RECOGNITION SITE DEFAULT>: Although enzyme recognition sites are stored
       in a condensed notation, there may be times when one wants only ACGT
       notation.  To allow this, the EXPAND option can be turned ON.
       This will increase the number of recognition site structures printed
       in the book.  When ON, the MODIFY operation will print an "*" in front
       of modified bases.  When ON, the CLEAVE operation will print a "V" at
       the cleavage point.

    c) If the request is out of the range of the piece, then the
       <RANGE DEFAULT> applies.  A REDUCE-RANGE will reduce the
       range of the request to within the limits of the piece.
       However, if the piece is circular, then REDUCE-RANGE will convert the
       request to the equivalent position within the circle.
       The CONTINUE option causes the librarian to flag the error in the
       instruction listing and to go on to the next instruction.
       A HALT will also flag the error, but the librarian will stop the
       execution.
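
       The REDUCE-RANGE behaviour can be sketched in Python as follows.
       This is an illustration under stated assumptions, not the
       librarian's algorithm: reduce_linear and reduce_circular are
       hypothetical names, coordinates are assumed to be consecutive
       integers from the piece beginning to the piece ending, and what
       the librarian does with a request lying entirely outside a linear
       piece is not stated here.

```python
def reduce_linear(first, last, piece_begin, piece_end):
    """Clamp both limits of a request to within a linear piece."""
    def clamp(position):
        return max(piece_begin, min(piece_end, position))
    return clamp(first), clamp(last)


def reduce_circular(position, piece_begin, piece_end):
    """Map a position to the equivalent position within a circular piece."""
    length = piece_end - piece_begin + 1
    return piece_begin + (position - piece_begin) % length
```

       For the linear piece LAC (-49 to 1419), a request from 1196 to
       10000 is reduced to 1196 to 1419, as in Example 1.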
(* end module page.5.2.4 *)
(* begin module page.5.2.5 *)
         NOTES ON THE DELILA INSTRUCTION DEFINITION (continued)

 7) <DEFAULT RESET> (continued)
    d) Usually one would like the items (structures with headers)
       in the book to be numbered so that they can be referred to
       by number, rather than by a set of names.  To allow flexibility,
       numbers can be placed in the notes of each item.  A program that
       looks at a book does not need to know about the numbers, and can
       ignore them by skipping the notes.  One can turn ON or OFF
       numbering with the <NUMBERING DEFAULT> instruction.
       The number will be printed in the book as the first line of the
       notes, identified by the word "number" or symbol "#" followed by the
       number.  One can turn on and off the printing of numbers
       at any time.  The librarian numbers each item of the book
       even when the number is not printed, so turning the printing off
       and on leaves gaps in the printed numbers, although each number
       is unique.
       However, one may restart the numbering at any integer.
       Stating a particular set of structures in the instructions will
       determine which structures are numbered in the book.

    All the defaults have initial values:

    default type       initial value
    ============       ==============
    KEY
         NOTE           ON
         MARKER         ON
         TRANSCRIPT     ON
         GENE           ON

    SITE
         EXPAND         ON
         MODIFY         OFF
         CLEAVE         OFF

    OUT-OF-RANGE        HALT

    NUMBERING           ON, 1, ALL
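
    The ON/OFF and restart behaviour of NUMBERING (item 7d) can be
    sketched as a counter that always advances but only sometimes prints.
    This Python class is a hypothetical illustration; ItemNumberer and
    its methods are not part of the librarian, and the selection of
    which structures are numbered is not modelled.

```python
class ItemNumberer:
    """Hypothetical counter imitating the NUMBERING default."""

    def __init__(self, start=1, printing=True):
        self.next_number = start   # initial value 1
        self.printing = printing   # initial value ON

    def restart(self, start):
        """Restart the numbering at any integer."""
        self.next_number = start

    def number_item(self):
        """Count one book item; return its note line, or None when OFF."""
        number = self.next_number
        self.next_number += 1      # the item is counted even if not printed
        return "# %d" % number if self.printing else None
```

    Numbering three items with printing turned OFF for the second one
    gives the notes "# 1" and "# 3": the numbers stay unique but a gap
    appears in the book.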


 8) One may insert notes into the book with <NOTE INSERTION>.  The
    notes must be in quotes.  They will be inserted into the header
    notes of the next structure node of the book tree.  A single note
    may contain several lines.  This command is independent of the key
    note default since that default refers to the library notes only.
    Thus one may turn off the library notes and insert one's own notes.
(* end module page.5.2.5 *)
(* begin module page.5.2.6 *)
         NOTES ON THE DELILA INSTRUCTION DEFINITION (continued)

 8) Below are some fatal errors that cause the librarian to halt.
    The first errors will prevent the book from being printed at all:

    - a Delila statement can not be interpreted.

    - the request implies a traversal of the library tree which would take
      more specifications than were given  (An Illegal Tree Traversal).
      For example,       ... GENE X; ENZYME Y;
      is not legal since the recognition class was not specified.


    These errors will prevent the rest of a book from being printed:

    - an item can not be found in the library.

    - a structure name was used that was not previously specified.

    - respecification refers to a structure which is not on the previously
      specified piece.

    - Two markers requested in a WITH point to overlapping areas of DNA.

    - a request is made that refers to more than one piece of
      DNA.  This can not be handled by REDUCE-RANGE.

    - a BEGINNING or ENDING used in a <REQUEST> is not known
      (recorded as a value less than BEGINNING or greater than the ENDING of
      the piece).

    - the request is from a linear piece, the direction is stated,
      and the requested direction is inconsistent with the request limits,
      beginning and ending.
      This failure could be avoided by not stating the direction.

    - one attempts to cut anything but a circular piece.


         If the librarian halts then this fact is written on the first line of
    the book so that the first character is the "H" of "HALT".  All programs
    that read books should first check for this flag, and halt if they
    detect it to avoid reading from a bad book.
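
    A program that reads books might make this check as in the following
    Python sketch.  The function names are hypothetical, and the sketch
    assumes the book has already been read into a list of lines.

```python
def book_is_halted(first_line):
    """True when the first character of the book's first line is 'H'."""
    return first_line.startswith("H")


def check_book(lines):
    """Return the lines unchanged, or raise if the librarian halted."""
    if lines and book_is_halted(lines[0]):
        raise ValueError("librarian halted: do not read this book")
    return lines
```

    The check looks only at the first character, so it does not depend
    on the wording that follows "HALT" on that line.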
(* end module page.5.2.6 *)
(* begin module page.5.2.7 *)
         NOTES ON THE DELILA INSTRUCTION DEFINITION (continued)

 9) To allow more flexible (and shorter but more obscure) instruction calls,
    only the first three characters of various key words are needed.
    Characters beyond these are checked, but are not required.

    short   long:  simple definition
    =====   ========================

      ALL   ALL: GET ALL of a structure
      BEG   BEGINNING: of a piece numbering system or a structure
      CHR   CHROMOSOME: a structure with markers, transcripts, genes and pieces
      CLE   CLEAVE: to print in the book the cleavage site of an enzyme
      COM   COMPLEMENT: a direction to allow retrieval of the DNA complement
      CON   CONTINUE: an option for out of range
      COO   COORDINATE: the coordinate system of the piece
      CUT   CUT: where to cut the piece being requested in get all piece
      DEF   DEFAULT: switches that can be turned ON or OFF
      DIR   DIRECTION: relative to the genetic map or coordinates
      ELS   ELSE: the alternative for an if chosen when the condition is false
      END   ENDING:  of a piece numbering system or a structure
      ENZ   ENZYME: a structure containing a site
      EVE   EVERY: get every occurrence of this structure
      EXP   EXPAND: write out all the possible recognition sites
      FRO   FROM: the first base of the piece to be gotten
      GEN   GENE: a structure referring to a piece
      GET   GET: request for fragment subset of piece, or structure(s)
      HAL   HALT: stop the librarian
      HOM   HOMOLOGOUS: a direction the same as the library piece direction
      IF    IF: will result in one of two alternatives being taken
      KEY   KEY: a small quantity of information in the library
      MAR   MARKER: a specific change to a piece
      MOD   MODIFY: to print in the book the DNA modifications of an enzyme
      NOT   NOTE: a kind of attribute which need not be printed in the book
      NUM   NUMBERING: to number items in the book, or set the number or the items
      OFF   OFF: non-printing or acting of a default
      ON    ON: printing or action of a default
      ORG   ORGANISM: a structure containing one or more chromosomes
      OUT   OUT-OF-RANGE: the condition that a request is not within a piece
      PIE   PIECE: a string of bases of DNA and information about them
      REC   RECOGNITION-CLASS: a structure containing a set of enzymes
      RED   REDUCE-RANGE: shrinking a request to within the piece
      SIT   SITE: the recognition site of an enzyme
      SIZ   SIZE: of an object in bases, a function used in IF statements
      THE   THEN: the alternative for an if, chosen when the condition is true
      TIT   TITLE: a line of text to be printed in the book
      TO    TO: the last base of the piece to be gotten
      TRA   TRANSCRIPT: a structure referring to a piece
      WIT   WITH: specifies a set of markers to be applied to the DNA
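
    The abbreviation rule can be sketched in Python as a prefix test.
    This is an illustration, not the librarian's scanner: KEYWORDS simply
    copies the table above, expand_keyword is a hypothetical name, and
    key words shorter than three characters (IF, ON, TO) are assumed to
    be required in full.

```python
KEYWORDS = [
    "ALL", "BEGINNING", "CHROMOSOME", "CLEAVE", "COMPLEMENT", "CONTINUE",
    "COORDINATE", "CUT", "DEFAULT", "DIRECTION", "ELSE", "ENDING", "ENZYME",
    "EVERY", "EXPAND", "FROM", "GENE", "GET", "HALT", "HOMOLOGOUS", "IF",
    "KEY", "MARKER", "MODIFY", "NOTE", "NUMBERING", "OFF", "ON", "ORGANISM",
    "OUT-OF-RANGE", "PIECE", "RECOGNITION-CLASS", "REDUCE-RANGE", "SITE",
    "SIZE", "THEN", "TITLE", "TO", "TRANSCRIPT", "WITH",
]


def expand_keyword(token):
    """Return the full key word that `token` abbreviates, or None."""
    for keyword in KEYWORDS:
        required = min(3, len(keyword))
        # Characters beyond the required ones are checked but optional.
        if len(token) >= required and keyword.startswith(token):
            return keyword
    return None
```

    Here TRA and TRANS both give TRANSCRIPT, while TR (too short) and
    TRAX (wrong fourth character) are not recognized.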
(* end module page.5.2.7 *)
(* begin module page.6.1 *)
         EXAMPLE LIBRARY

    This is a small library, constructed by Gary Stormo.

    The first line of the library (which starts on the next page) tells
    us that he built it on February 18, 1980.  It contains sequences
    from both Lambda and E. coli.

    One can read the tree of the library by looking at the first character
    of each line.  For example, the first "O" of "ORGANISM" represents
    the entry to the organism Lambda.  The "O" is like an open parenthesis.
    This branch is closed some time later, before ECOLI by a second
    "ORGANISM", which is like a close parenthesis.  The series of such
    open and close parentheses defines a tree:
         ONNCTNNTTNNTGGGGGGGGPNNDDPCOONNCTNNTTNNTGGGNNGPNNDDPTTGNNGPNNDDPCO
         (()((())(())()()()()(()())))(()((())(())()(())(()())()(())(()())))
         (                          )(                                    )
          ()(                      )  ()(                                )
             (  )(  )()()()()(    )      (  )(  )()(  )(    )()(  )(    )
              ()  ()          ()()        ()  ()    ()  ()()    ()  ()()
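
    The reading described above can be sketched in Python.  The function
    tree_parentheses is a hypothetical helper: it takes the first
    characters of the library lines that do not begin with "*", and it
    assumes that a structure is never nested directly inside another
    structure of the same type, which holds for this example library.

```python
def tree_parentheses(first_letters):
    """Turn the first letters of the structure lines into parentheses."""
    stack = []
    output = []
    for letter in first_letters:
        if stack and stack[-1] == letter:
            # The same word again closes the structure that is open.
            stack.pop()
            output.append(")")
        else:
            stack.append(letter)
            output.append("(")
    if stack:
        raise ValueError("unclosed structures: " + "".join(stack))
    return "".join(output)
```

    Applied to the letters of the Lambda branch,
    ONNCTNNTTNNTGGGGGGGGPNNDDPCO, it gives (()((())(())()()()()(()()))).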
(* end module page.6.1 *)
(* begin module page.6.1.1 *)
          EXAMPLE LIBRARY (CONTINUED)

 * 80/02/18 16:03:38, 80/01/18 12:43:00, EXAMPLE LIBRARY
 ORGANISM
 * LAMBDA
 * BACTERIOPHAGE LAMBDA
 NOTE
 * HOST: ECOLI
 * MAP FROM PAGE 20 OF
 * THE BACTERIOPHAGE LAMBDA
 * A. D. HERSHEY ED. CSH 1971
 NOTE
 * CENTIMORGAN
 CHROMOSOME
 * LAMBDA
 * CHROMOSOME OF LAMBDA (DNA)
 * 0
 * 70
 TRANSCRIPT
 * PL
 * EARLY LEFTWARD PROMOTER
 NOTE
 * END UNKNOWN
 NOTE
 * CIO
 * 50
 * -
 * -82
 * -10000
 TRANSCRIPT
 TRANSCRIPT
 * PR
 * EARLY RIGHTWARD PROMOTER
 NOTE
 * BEGIN IS 1 OF COORDINATE SYSTEM
 * END UNKNOWN
 NOTE
 * CIO
 * 50
 * +
 * 1
 * 10000
 TRANSCRIPT
 GENE
 * CI
 * LAMBDA REPRESSOR
 * CIO
 * 50
 * -
 * -82
 * -795
 GENE
(* end module page.6.1.1 *)
(* begin module page.6.1.2 *)
          EXAMPLE LIBRARY (CONTINUED)

 GENE
 * CRO
 * NEGATIVE CONTROL OF IMMUNITY
 * CIO
 * 50
 * +
 * 19
 * 219
 GENE
 GENE
 * CII
 * ESTABLISHMENT OF IMMUNITY
 * CIO
 * 50
 * +
 * 338
 * 631
 GENE
 GENE
 * O
 * DNA REPLICATION
 * CIO
 * 50
 * +
 * 664
 * 1668
 GENE
 PIECE
 * CIO
 * BEFORE CI TO THE END OF O
 NOTE
 * NAR 4: 2137 (1977) HUMAYUN, (-816 TO -799)
 * NATURE 276: 302 (1978) SAUER, (-798 TO -82)
 * JMB 112: 265 (1977) HUMAYUN, ET AL, (-81 TO -33)
 * NATURE 272: 410 (1978) SCHWARZ, ET AL, (-32 TO +960)
 * NAR 5: 3141 (1978) SCHERER (+961 TO +1668)
 * 1 IS BEGIN OF TRANSCRIPT PR.
 NOTE
 * 50
 * LINEAR
 * +
 * -816
 * +1668
 * LINEAR
 * +
 * -816
 * +1668
 DNA
 * CCGACCAGAACACCTTGCCGATCAGCCAAACGTCTCTTCAGGCCACTGACTAGCGATAAC
 * TTTCCCCACAACGGAACAACTCTCATTGCATGGGATCATTGGGTACTGTGGGTTTAGTGG
 * TTGTAAAAACACCTGACCGCTATCCCTGATCAGTTTCTTGAAGGTAAACTCATCACCCCC
 * AAGTCTGGCTATGCAGAAATCACCTGGCTCAACAGCCTGCTCAGGGTCAACGAGAATTAA
(* end module page.6.1.2 *)
(* begin module page.6.1.3 *)
          EXAMPLE LIBRARY (CONTINUED)

 * CATTCCGTCAGGAAAGCTTGGCTTGGAGCCTGTTGGTGCGGTCATGGAATTACCTTCAAC
 * CTCAAGCCAGAATGCAGAATCACTGGCTTTTTTGGTTGTGCTTACCCATCTCTCCGCATC
 * ACCTTTGGTAAAGGTTCTAAGCTCAGGTGAGAACATCCCTGCCTGAACATGAGAAAAAAC
 * AGGGTACTCATACTCACTTCTAAGTGACGGCTGCATACTAACCGCTTCATACATCTCGTA
 * GATTTCTCTGGCGATTGAAGGGCTAAATTCTTCAACGCTAACTTTGAGAATTTTTGCAAG
 * CAATGCGGCGTTATAAGCATTTAATGCATTGATGCCATTAAATAAAGCACCAACGCCTGA
 * CTGCCCCATCCCCATCTTGTCTGCGACAGATTCCTGGGATAAGCCAAGTTCATTTTTCTT
 * TTTTTCATAAATTGCTTTAAGGCGACGTGCGTCCTCAAGCTGCTCTTGTGTTAATGGTTT
 * CTTTTTTGTGCTCATACGTTAAATCTATCACCGCAAGGGATAAATATCTAACACCGTGCG
 * TGTTGACTATTTTACCTCTGGCGGTGATAATGGTTGCATGTACTAAGGAGGTTGTATGGA
 * ACAACGCATAACCCTGAAAGATTATGCAATGCGCTTTGGGCAAACCAAGACAGCTAAAGA
 * TCTCGGCGTATATCAAAGCGCGATCAACAAGGCCATTCATGCAGGCCGAAAGATTTTTTT
 * AACTATAAACGCTGATGGAAGCGTTTATGCGGAAGAGGTAAAGCCCTTCCCGAGTAACAA
 * AAAAACAACAGCATAAATAACCCCGCTCTTACACATTCCAGCCCTGAAAAAGGGCATCAA
 * ATTAAACCACACCTATGGTGTATGCATTTATTTGCATACATTCAATCAATTGTTATCTAA
 * GGAAATACTTACATATGGTTCGTGCAAACAAACGCAACGAGGCTCTACGAATCGAGAGTG
 * CGTTGCTTAACAAAATCGCAATGCTTGGAACTGAGAAGACAGCGGAAGCTGTGGGCGTTG
 * ATAAGTCGCAGATCAGCAGGTGGAAGAGGGACTGGATTCCAAAGTTCTCAATGCTGCTTG
 * CTGTTCTTGAATGGGGGGTCGTTGACGACGACATGGCTCGATTGGCGCGACAAGTTGCTG
 * CGATTCTCACCAATAAAAAACGCCCGGCGGCAACCGAGCGTTCTGAACAAATCCAGATGG
 * AGTTCTGAGGTCATTACTGGATCTATCAACAGGAGTCATTATGACAAATACAGCAAAAAT
 * ACTCAACTTCGGCAGAGGTAACTTTGCCGGACAGGAGCGTAATGTGGCAGATCTCGATGA
 * TGGTTACGCCAGACTATCAAATATGCTGCTTGAGGCTTATTCGGGCGCAGATCTGACCAA
 * GCGACAGTTTAAAGTGCTGCTTGCCATTCTGCGTAAAACCTATGGGTGGAATAAACCAAT
 * GGACAGAATCACCGATTCTCAACTTAGCGAGATTACAAAGTTACCTGTCAAACGGTGCAA
 * TGAAGCCAAGTTAGAACTCGTCAGAATGAATATTATCAAGCAGCAAGGCGGCATGTTTGG
 * ACCAAATAAAAACATCTCAGAATGGTGCATCCCTCAAAACGAGGGAAAATCCCCTAAAAC
 * GAGGGATAAAACATCCCTCAAATTGGGGGATTGCTATCCCTCAAAACAGGGGGACACAAA
 * AGACACTATTACAAAAGAAAAAAGAAAAGATTATTCGTCAGAGAATTCTGGCGAATCCTC
 * TGACCAGCCAGAAAACGACCTTTCTGTGGTGAAACCGGATGCTGCAATTCAGAGCGGCAG
 * CAAGTGGGGGACAGCAGAAGACCTGACCGCCGCAGAGTGGATGTTTGACATGGTGAAGAC
 * TATCGCACCATCAGCCAGAAAACCGAATTTTGCTGGGTGGGCTAACGATATCCGCCTGAT
 * GCGTGAACGTGACGGACGTAACCACCGCGACATGTGTGTGCTGTTCCGCTGGGCATGCCA
 * GGACAACTTCTGGTCCGGTAACGTGCTGAGCCCGGCCAAACTCCGCGATAAGTGGACCCA
 * ACTCGAAATCAACCGTAACAAGCAACAGGCAGGCGTGACAGCCAGCAAACCAAAACTCGA
 * CCTGACAAACACAGACTGGATTTACGGGGTGGATCTATCAAAAACATAGCCGCACAGATG
 * GTTAACTTTGACCGTGAGCAGATGCGTCGGATCGCCAACAACATGCCGGAACAGTACGAC
 * GAAAAGCCGCAGGTACAGCAGGTAG
 DNA
 PIECE
 CHROMOSOME
 ORGANISM
(* end module page.6.1.3 *)
(* begin module page.6.1.4 *)
          EXAMPLE LIBRARY (CONTINUED)

 ORGANISM
 * ECOLI
 * ESCHERICHIA COLI
 NOTE
 * THEODOR ESCHERICH.  DIED 1911. GERMAN PHYSICIAN
 * AEROBIC, GRAM-NEGATIVE ROD-SHAPED BACTERIA
 * FAMILY ENTEROBACTERIACEAE
 * NORMALLY PRESENT IN VERTEBRATE INTESTINES
 *
 * BACT. REV. 40(1): 116-167 MAR 1976
 * RECALIBRATED LINKAGE MAP OF ESCHERICHIA COLI K-12
 * BACHMANN, LOW, TAYLOR
 NOTE
 * MINUTES
 CHROMOSOME
 * ECOLI
 * MAIN CHROMOSOME OF E. COLI
 * 0
 * 100
 TRANSCRIPT
 * LACI
 * LAC REPRESSOR
 NOTE
 * END OF TRANSCRIPT UNKNOWN
 NOTE
 * LAC
 * 7.9
 * +
 * 1
 * 1500
 TRANSCRIPT
 TRANSCRIPT
 * LACZ
 * BETA-GALACTOSIDASE ENZYME
 NOTE
 * END OF TRANSCRIPT UNKNOWN
 NOTE
 * LAC
 * 7.9
 * +
 * 1196
 * 10000
 TRANSCRIPT
 GENE
 * LACI
 * LAC REPRESSOR
 * LAC
 * 7.9
 * +
 * 29
 * 1111
 GENE
(* end module page.6.1.4 *)
(* begin module page.6.1.5 *)
          EXAMPLE LIBRARY (CONTINUED)

 GENE
 * LACZ
 * BETA-GALACTOSIDASE ENZYME
 NOTE
 * END OF GENE UNKNOWN
 NOTE
 * LAC
 * 7.9
 * +
 * 1234
 * 10000
 GENE
 PIECE
 * LAC
 * LAC I TO LAC Z
 NOTE
 * LAC I FROM FARABAUGH, NATURE 274,765 ('78)
 * PRE-LAC I FROM CALOS, NATURE 274,762 ('78)
 * LAC Z FROM MAXAM ET AL., UNPUBLISHED, AMINO ACID SEQUENCE
 * CONFIRMED FROM ZABIN AND FOWLER IN "THE OPERON", P89
 NOTE
 * 7.9
 * LINEAR
 * -
 * -49
 * 1419
 * LINEAR
 * +
 * -49
 * 1419
 DNA
 * GACACCATCGAATGGCGGAAAACCTTTCGCGGTATGGCATGATAGCGC
 * CCGGAAGAGAGTCAATTCAGGGTGGTGAAT
 * GTGAAACCAGTAACGTTATACGATGTCGCAGAGTATGCCGGTGTCTCTTATCAGACCGTT
 * TCCCGCGTGGTGAACCAGGCCAGCCACGTTTCTGCGAAAACGCGGGAAAAAGTGGAAGCG
 * GCGATGGCGGAGCTGAATTACATTCCCAACCGCGTGGCACAACAACTGGCGGGCAAACAG
 * TCGTTGCTGATTGGCGTTGCCACCTCCAGTCTGGCCCTGCACGCGCCGTCGCAAATTGTC
 * GCGGCGATTAAATCTCGCGCCGATCAACTGGGTGCCAGCGTGGTGGTGTCGATGGTAGAA
 * CGAAGCGGCGTCGAAGCCTGTAAAGCGGCGGTGCACAATCTTCTCGCGCAACGCGTCAGT
 * GGGCTGATCATTAACTATCCGCTGGATGACCAGGATGCCATTGCTGTGGAAGCTGCCTGC
 * ACTAATGTTCCGGCGTTATTTCTTGATGTCTCTGACCAGACACCCATCAACAGTATTATT
 * TTCTCCCATGAAGACGGTACGCGACTGGGCGTGGAGCATCTGGTCGCATTGGGTCACCAG
 * CAAATCGCGCTGTTAGCGGGCCCATTAAGTTCTGTCTCGGCGCGTCTGCGTCTGGCTGGC
 * TGGCATAAATATCTCACTCGCAATCAAATTCAGCCGATAGCGGAACGGGAAGGCGACTGG
 * AGTGCCATGTCCGGTTTTCAACAAACCATGCAAATGCTGAATGAGGGCATCGTTCCCACT
 * GCGATGCTGGTTGCCAACGATCAGATGGCGCTGGGCGCAATGCGCGCCATTACCGAGTCC
 * GGGCTGCGCGTTGGTGCGGATATCTCGGTAGTGGGATACGACGATACCGAAGACAGCTCA
 * TGTTATATCCCGCCGTCAACCACCATCAAACAGGATTTTCGCCTGCTGGGGCAAACCAGC
 * GTGGACCGCTTGCTGCAACTCTCTCAGGGCCAGGCGGTGAAGGGCAATCAGCTGTTGCCC
 * GTCTCACTGGTGAAAAGAAAAACCACCCTGGCGCCCAATACGCAAACCGCCTCTCCCCGC
 * GCGTTGGCCGATTCATTAATGCAGCTGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTGA
(* end module page.6.1.5 *)
(* begin module page.6.1.6 *)
          EXAMPLE LIBRARY (CONTINUED)

 * GCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTAT
 * GCTTCCGGCTCGTATGTTGTGTGCAATTGTGAGCGGATAACAATTTCACACAGGAAACAG
 * CTATGACCATGATT
 * ACGGATTCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGG
 * CGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCCTTCGCCAGCTGGC
 * GTAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGTAGC
 * CTGAATGGCGAATGGCGCTTTGCC
 DNA
 PIECE
 TRANSCRIPT
 * LIPO
 * MRNA FOR GENE LIPO
 * LIPO
 * 36.5
 * +
 * 1
 * 322
 TRANSCRIPT
 GENE
 * LIPO
 * OUTER MEMBRANE LIPOPROTEIN (LPP)
 NOTE
 * VERY EFFICIENT EXPRESSION
 NOTE
 * LIPO
 * 36.5
 * +
 * 39
 * 275
 GENE
 PIECE
 * LIPO
 * ENTIRE LIPOPROTEIN AND ITS TRANSCRIPT
 NOTE
 * CELL 18(4): 1109 NAKAMURA AND INOUYE
 * NEGATIVE VALUES INCREMENTED
 * 1 IS BEGIN OF TRANSCRIPT LIPO
 * ORIENTATION OF COORDINATES IS A GUESS
 NOTE
 * 36.5
 * LINEAR
 * +
 * -364
 * 449
 * LINEAR
 * +
 * -364
 * 449
 DNA
 * TGGCTCTGCAGAGCAATCTGGCACACAAAGGTGACGTTGT
 * AGTTATGGTTTCTGGTGCACTGGTACCGAGCGGCACTACT
(* end module page.6.1.6 *)
(* begin module page.6.1.7 *)
         EXAMPLE LIBRARY (CONTINUED)

 * AACACCGCATCTGTTCACGTCCTGTAATATTGCTTTTGTG
 * AATTAATTTGTATATCGGCGCTTTTTTTATTTAATCGATA
 * ACCAGAAGCAATAAAAAATCAAATCGGATTTCACTATATA
 * ATCTCACTTTATCTAAGATGAATCCGATGGAAGCATCCTG
 * TTTTCTCTCAATTTTTTTATCTAAAACCCAGCGTTCGATG
 * CTTCTTTGAGCGAACGATCAAAAATAAGTGCCTTCCCATC
 * AAAAAAATATTCTCAACATAAAAAACTTTGTGTAATACTT
 * GTAACGCTACATGGAGATTAACTCAATCTAGAGGGTATTA
 * ATAATGAAAGCTACTAAACTGGTACTGGGCGCGGTAATCC
 * TGGGTTCTACTCTGCTGGCAGGTTGCTCCAGCAACGCTAA
 * AATCGATCAGCTGTCTTCTGACGTTCAGACTCTGAACGCT
 * AAAGTTGACCAGCTGAGCAACGACGTGAACGCAATGCGTT
 * CCGACGTTCAGGCTGCTAAAGATGACGCAGCTCGTGCTAA
 * CCAGCGTCTGGACAACATGGCTACTAAATACCGCAAGTAA
 * TAGTACCTGTGAAGTGAAAAATGGCGCACATTGTGCGACA
 * TTTTTTTTGTCTGCCGTTTACCGCTACTGCGTCACGCGTA
 * ACATATTCCCTTGCTCTGGTTCACCATTCTGCGCTGACTC
 * TACTGAAGGCGCATTGCTGGCTGCGGGAGTTGCTCCACTG
 * CTCACCGAAACCGG
 DNA
 PIECE
 CHROMOSOME
 ORGANISM
(* end module page.6.1.7 *)
(* begin module page.6.2 *)
         EXAMPLE CATALOGUE FOR HUMANS

    This catalogue was made by the catalogue program.  At present
    only the names of items in the library are listed.


                               CURRENT CATALOG LISTING AS OF 80/02/18 16:03:38
         PAGE         1

 ORGANISM                    LAMBDA
      CHROMOSOME             LAMBDA
           TRANSCRIPT        PL
           TRANSCRIPT        PR
           GENE              CI
           GENE              CRO
           GENE              CII
           GENE              O
           PIECE             CIO

 ORGANISM                    ECOLI
      CHROMOSOME             ECOLI
           TRANSCRIPT        LACI
           TRANSCRIPT        LACZ
           GENE              LACI
           GENE              LACZ
           PIECE             LAC
           TRANSCRIPT        LIPO
           GENE              LIPO
           PIECE             LIPO
(* end module page.6.2 *)
(* begin module page.6.3 *)
         EXAMPLE CATALOGUE FOR THE LIBRARIAN

         The librarian uses its own catalogue to afford rapid access to items
      in the library.  In our implementation this catalogue is stored
      very efficiently in binary records.  To look at this binary, we
      use a program called loocat (look at catalogue).  The output of the
      loocat program is shown below.
         The first row contains the date of library creation.  If this is not
      identical to the date in the library, the librarian will halt.  For
      the following rows, the first column indicates the type of the item
      (organism, chromosome, etc); the second its name (Lambda, PL, etc);
      the third the length of the name; the fourth the file number the
      item is found in (see appendix 2, number 3); the fifth the line within
      that file that the item begins on.


 * 800218160338       10         1         1
 O LAMBDA              6         1         2
 C LAMBDA              6         1        12
 T PL                  2         1        17
 T PR                  2         1        29
 G CI                  2         1        42
 G CRO                 3         1        51
 G CII                 3         1        60
 G O                   1         1        69
 P CIO                 3         1        78
 O ECOLI               5         1       145
 C ECOLI               5         1       159
 T LACI                4         1       164
 T LACZ                4         1       176
 G LACI                4         1       188
 G LACZ                4         1       197
 P LAC                 3         1       209
 T LIPO                4         1       257
 G LIPO                4         1       266
 P LIPO                4         1       278
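
         A row of the loocat output, other than the first (date) row,
      can be split into the five columns described above.  The Python
      below is a sketch of reading this printed listing only, not the
      binary catalogue; CatalogueEntry and parse_catalogue_row are
      hypothetical names, and names are assumed to contain no blanks.

```python
from collections import namedtuple

CatalogueEntry = namedtuple(
    "CatalogueEntry", "kind name name_length file_number line_number"
)


def parse_catalogue_row(row):
    """Split one item row of the loocat listing into its five columns."""
    kind, name, name_length, file_number, line_number = row.split()
    return CatalogueEntry(
        kind, name, int(name_length), int(file_number), int(line_number)
    )
```

         For the row "G CRO  3  1  51" this gives a gene named CRO, with
      a name of length 3, that begins on line 51 of file 1.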
(* end module page.6.3 *)
(* begin module page.6.4 *)
         EXAMPLES OF DELILA INSTRUCTIONS AND BOOKS

         There are two steps involved in producing a book:
    1)   The user writes the Delila instructions (page 6.4.1.1).
    2)   The instructions are given to the librarian, which returns a
         listing (6.4.1.2 and 6.4.1.3) and a book (6.4.1.4 and 6.4.1.5).

         The top of each listing page has the date, time, librarian version,
    pass number (described below) and page number.  The listing has two
    parts (called passes).

         In pass 1 (6.4.1.2) the librarian checks the instructions for
    proper syntax.  If errors are found, they are flagged and the
    librarian halts.
         The first line of the instructions is a comment which is ignored.
         The first column of numbers is the line number, the second is
    the statement number (which still does not work exactly ...).
    This is very useful when there are several Delila instructions
    on one line or one Delila instruction covers several lines.

         In pass 2 (6.4.1.3) the librarian attempts to search the library
    for the items.  When an item is found, it is numbered (according
    to the numbering rules).  In this case the organism and chromosome
    were numbered.  Note that the piece on line 6 is not numbered.
    This is because numbering is associated with the book, and
    specification does not create a piece in the book (although
    ORGANISM ECOLI; does make an organism).  The GET on line 7 is numbered
    indicating that a PIECE was put into the book and numbered 3.
    The numbers on the following line indicate the actual bases FROM - TO
    (in this case it does not tell us anything new).  Line 8
    specifies a transcript.  Line 9 asks for that whole LACZ
    transcript.  Note that the BEGIN of the transcript was found (^1196)
    but that the END of the transcript was not known.  The default set
    on line 3 saved us here:  the range was reduced, so the transcript
    gotten in line 9 was identical to that gotten on line 7.

         Page 6.4.1.4 shows the book obtained by these instructions.
    Things to note:
    - The title of the book
    - The numbering of each node (# 1, etc)
    - That the two DNA segments are identical.
    - That each piece BEGIN and END correlates to the listing.
(* end module page.6.4 *)
(* begin module page.6.4.1.1 *)
         EXAMPLE 1: RAW DELILA INSTRUCTIONS

 (* EXAMPLE 1 *)
 TITLE "LAC Z TRANSCRIPT";
 DEFAULT OUT-OF-RANGE REDUCE-RANGE;
 ORGANISM ECOLI;
 CHROMOSOME ECOLI;
 PIECE LAC;
 GET FROM 1196 TO 1419;
 TRANSCRIPT LACZ;
 GET ALL TRANSCRIPT;
(* end module page.6.4.1.1 *)
(* begin module page.6.4.1.2 *)
         EXAMPLE 1: LISTING

   80/02/23 15:52:36     DELILA 1.14     PASS 1           PAGE 1

      1     1  (* EXAMPLE 1 *)
      2     1  TITLE "LAC Z TRANSCRIPT";
      3     2  DEFAULT OUT-OF-RANGE REDUCE-RANGE;
      4     3  ORGANISM ECOLI;
      5     4  CHROMOSOME ECOLI;
      6     5  PIECE LAC;
      7     6  GET FROM 1196 TO 1419;
      8     6  TRANSCRIPT LACZ;
      9     8  GET ALL TRANSCRIPT;

(* end module page.6.4.1.2 *)
(* begin module page.6.4.1.3 *)
         EXAMPLE 1: LISTING (continued)

    80/02/23 15:52:36     DELILA 1.14     PASS 2           PAGE 2

      1     1  (* EXAMPLE 1 *)
      2     1  TITLE "LAC Z TRANSCRIPT";
      3     2  DEFAULT OUT-OF-RANGE REDUCE-RANGE;
      4     3  ORGANISM ECOLI;
                            #1
      5     4  CHROMOSOME ECOLI;
                              #2
      6     5  PIECE LAC;
      7     6  GET FROM 1196 TO 1419;
                 #3
                           ^1196   ^1419
      8     6  TRANSCRIPT LACZ;
                             #4
      9     8  GET ALL TRANSCRIPT;
                 #5
                                ^1196^10000^1419
 ---ERROR(S)---------------------^206^208

 206: WE DO NOT KNOW THIS LIMIT (A WARNING)
 208: OUT OF RANGE AND DEFAULT RANGE = REDUCE (A WARNING)
(* end module page.6.4.1.3 *)
(* begin module page.6.4.1.4 *)
         EXAMPLE 1: BOOK

 * 80/02/23 15:52:36, 80/02/18 16:03:38, LAC Z TRANSCRIPT
 ORGANISM
 * ECOLI
 * ESCHERICHIA COLI
 NOTE
 * # 1
 * THEODOR ESCHERICH.  DIED 1911. GERMAN PHYSICIAN
 * AEROBIC, GRAM-NEGATIVE ROD-SHAPED BACTERIA
 * FAMILY ENTEROBACTERIACEAE
 * NORMALLY PRESENT IN VERTEBRATE INTESTINES
 *
 * BACT. REV. 40(1): 116-167 MAR 1976
 * RECALIBRATED LINKAGE MAP OF ESCHERICHIA COLI K-12
 * BACHMANN, LOW, TAYLOR
 NOTE
 * MINUTES
 CHROMOSOME
 * ECOLI
 * MAIN CHROMOSOME OF E. COLI
 NOTE
 * # 2
 NOTE
 * 0.00
 * 100.00
 PIECE
 * LAC
 * LAC I TO LAC Z
 NOTE
 * # 3
 * LAC I FROM FARABAUGH, NATURE 274,765 ('78)
 * PRE-LAC I FROM CALOS, NATURE 274,762 ('78)
 * LAC Z FROM MAXAM ET AL., UNPUBLISHED, AMINO ACID SEQUENCE
 * CONFIRMED FROM ZABIN AND FOWLER IN "THE OPERON", P89
 NOTE
 * 7.90
 * LINEAR
 * -
 * -49
 * 1419
 * LINEAR
 * +
 * 1196
 * 1419
 DNA
 * AATTGTGAGCGGATAACAATTTCACACAGGAAACAGCTATGACCATGATTACGGATTCAC
 * TGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCC
 * TTGCAGCACATCCCCCCTTCGCCAGCTGGCGTAATAGCGAAGAGGCCCGCACCGATCGCC
 * CTTCCCAACAGTTGCGTAGCCTGAATGGCGAATGGCGCTTTGCC
 DNA
 PIECE
 TRANSCRIPT
 * LACZ
 * BETA-GALACTOSIDASE ENZYME
 NOTE
(* end module page.6.4.1.4 *)
(* begin module page.6.4.1.5 *)
         EXAMPLE 1: BOOK (CONTINUED)

 * # 4
 * END OF TRANSCRIPT UNKNOWN
 NOTE
 * LAC
 * 7.90
 * +
 * 1196
 * 10000
 TRANSCRIPT
 PIECE
 * LAC
 * LAC I TO LAC Z
 NOTE
 * # 5
 * LAC I FROM FARABAUGH, NATURE 274,765 ('78)
 * PRE-LAC I FROM CALOS, NATURE 274,762 ('78)
 * LAC Z FROM MAXAM ET AL., UNPUBLISHED, AMINO ACID SEQUENCE
 * CONFIRMED FROM ZABIN AND FOWLER IN "THE OPERON", P89
 NOTE
 * 7.90
 * LINEAR
 * -
 * -49
 * 1419
 * LINEAR
 * +
 * 1196
 * 1419
 DNA
 * AATTGTGAGCGGATAACAATTTCACACAGGAAACAGCTATGACCATGATTACGGATTCAC
 * TGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCC
 * TTGCAGCACATCCCCCCTTCGCCAGCTGGCGTAATAGCGAAGAGGCCCGCACCGATCGCC
 * CTTCCCAACAGTTGCGTAGCCTGAATGGCGAATGGCGCTTTGCC
 DNA
 PIECE
 CHROMOSOME
 ORGANISM
(* end module page.6.4.1.5 *)
(* begin module page.6.4.2 *)
         EXAMPLE 2: CI AND CRO GENES

    Two entire genes are obtained.

    Note the ATG initiators and the TGA or TAA terminators.

    The raw instructions are not given in the following examples,
    since they can be seen in the listings.
(* end module page.6.4.2 *)
(* begin module page.6.4.2.1 *)
         EXAMPLE 2: LIST

   80/02/23 16:10:52     DELILA 1.14     PASS 1           PAGE 1

      1     1  (* EXAMPLE 2 *)
      2     1  TITLE "LAMBDA CONTROL GENES";
      3     2  ORGANISM LAMBDA; CHROMOSOME LAMBDA;
      4     4
      5     4  GENE CI;
      6     5  GET ALL GENE;
      7     6
      8     6  GENE CRO;
      9     7  GET ALL GENE;

(* end module page.6.4.2.1 *)
(* begin module page.6.4.2.2 *)
         EXAMPLE 2: LIST (continued)

    80/02/23 16:10:52     DELILA 1.14     PASS 2           PAGE 2

      1     1  (* EXAMPLE 2 *)
      2     1  TITLE "LAMBDA CONTROL GENES";
      3     2  ORGANISM LAMBDA; CHROMOSOME LAMBDA;
                             #1                 #2
      4     4
      5     4  GENE CI;
                     #3
      6     5  GET ALL GENE;
                 #4
                          ^-82^-795
      7     6
      8     6  GENE CRO;
                      #5
      9     7  GET ALL GENE;
                 #6
                          ^19^219
(* end module page.6.4.2.2 *)
(* begin module page.6.4.2.3 *)
         EXAMPLE 2: BOOK

 * 80/02/23 16:10:52, 80/02/18 16:03:38, LAMBDA CONTROL GENES
 ORGANISM
 * LAMBDA
 * BACTERIOPHAGE LAMBDA
 NOTE
 * # 1
 * HOST: ECOLI
 * MAP FROM PAGE 20 OF
 * THE BACTERIOPHAGE LAMBDA
 * A. D. HERSHEY ED. CSH 1971
 NOTE
 * CENTIMORGAN
 CHROMOSOME
 * LAMBDA
 * CHROMOSOME OF LAMBDA (DNA)
 NOTE
 * # 2
 NOTE
 * 0.00
 * 70.00
 GENE
 * CI
 * LAMBDA REPRESSOR
 NOTE
 * # 3
 NOTE
 * CIO
 * 50.00
 * -
 * -82
 * -795
 GENE
 PIECE
 * CIO
 * BEFORE CI TO THE END OF O
 NOTE
 * # 4
 * NAR 4: 2137 (1977) HUMAYUN, (-816 TO -799)
 * NATURE 276: 302 (1978) SAUER, (-798 TO -82)
 * JMB 112: 265 (1977) HUMAYUN, ET AL, (-81 TO -33)
 * NATURE 272: 410 (1978) SCHWARZ, ET AL, (-32 TO +960)
 * NAR 5: 3141 (1978) SCHERER (+961 TO +1668)
 * 1 IS BEGIN OF TRANSCRIPT PR.
 NOTE
 * 50.00
 * LINEAR
 * +
 * -816
 * 1668
 * LINEAR
 * -
 * -82
 * -795
 DNA
 * ATGAGCACAAAAAAGAAACCATTAACACAAGAGCAGCTTGAGGACGCACGTCGCCTTAAA
(* end module page.6.4.2.3 *)
(* begin module page.6.4.2.4 *)
         EXAMPLE 2: BOOK (CONTINUED)

 * GCAATTTATGAAAAAAAGAAAAATGAACTTGGCTTATCCCAGGAATCTGTCGCAGACAAG
 * ATGGGGATGGGGCAGTCAGGCGTTGGTGCTTTATTTAATGGCATCAATGCATTAAATGCT
 * TATAACGCCGCATTGCTTGCAAAAATTCTCAAAGTTAGCGTTGAAGAATTTAGCCCTTCA
 * ATCGCCAGAGAAATCTACGAGATGTATGAAGCGGTTAGTATGCAGCCGTCACTTAGAAGT
 * GAGTATGAGTACCCTGTTTTTTCTCATGTTCAGGCAGGGATGTTCTCACCTGAGCTTAGA
 * ACCTTTACCAAAGGTGATGCGGAGAGATGGGTAAGCACAACCAAAAAAGCCAGTGATTCT
 * GCATTCTGGCTTGAGGTTGAAGGTAATTCCATGACCGCACCAACAGGCTCCAAGCCAAGC
 * TTTCCTGACGGAATGTTAATTCTCGTTGACCCTGAGCAGGCTGTTGAGCCAGGTGATTTC
 * TGCATAGCCAGACTTGGGGGTGATGAGTTTACCTTCAAGAAACTGATCAGGGATAGCGGT
 * CAGGTGTTTTTACAACCACTAAACCCACAGTACCCAATGATCCCATGCAATGAGAGTTGT
 * TCCGTTGTGGGGAAAGTTATCGCTAGTCAGTGGCCTGAAGAGACGTTTGGCTGA
 DNA
 PIECE
 GENE
 * CRO
 * NEGATIVE CONTROL OF IMMUNITY
 NOTE
 * # 5
 NOTE
 * CIO
 * 50.00
 * +
 * 19
 * 219
 GENE
 PIECE
 * CIO
 * BEFORE CI TO THE END OF O
 NOTE
 * # 6
 * NAR 4: 2137 (1977) HUMAYUN, (-816 TO -799)
 * NATURE 276: 302 (1978) SAUER, (-798 TO -82)
 * JMB 112: 265 (1977) HUMAYUN, ET AL, (-81 TO -33)
 * NATURE 272: 410 (1978) SCHWARZ, ET AL, (-32 TO +960)
 * NAR 5: 3141 (1978) SCHERER (+961 TO +1668)
 * 1 IS BEGIN OF TRANSCRIPT PR.
 NOTE
 * 50.00
 * LINEAR
 * +
 * -816
 * 1668
 * LINEAR
 * +
 * 19
 * 219
 DNA
 * ATGGAACAACGCATAACCCTGAAAGATTATGCAATGCGCTTTGGGCAAACCAAGACAGCT
 * AAAGATCTCGGCGTATATCAAAGCGCGATCAACAAGGCCATTCATGCAGGCCGAAAGATT
 * TTTTTAACTATAAACGCTGATGGAAGCGTTTATGCGGAAGAGGTAAAGCCCTTCCCGAGT
 * AACAAAAAAACAACAGCATAA
 DNA
 PIECE
 CHROMOSOME
 ORGANISM
(* end module page.6.4.2.4 *)
(* begin module page.6.4.3 *)
         EXAMPLE 3: GENE STARTS

         With a few simple commands we get 7 translation initiation
    regions.  By setting the numbering command, only pieces are numbered,
    starting at zero.  All the library notes are suppressed, as are
    all the genes in the list.  User notes are inserted.

         Note that in PASS 2, the <RELATIVE>s are displayed along with
    the final location on the PIECE.  An example of this is on
    line 12 (instruction 10, 6.4.3.2), where the GENE began at -82, but
    the relative was 20, so the resultant PIECE began at -62.
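
    The arithmetic shown in the PASS 2 listing can be sketched as below.
    The function names are hypothetical.  The values are those of the
    listing, and the length calculation assumes that the coordinate
    system includes zero.

```python
def resolve(anchor, relative):
    """Final location on the piece: the anchor plus the <RELATIVE>."""
    return anchor + relative


def length(begin, end):
    """Number of bases from begin to end inclusive (zero is a coordinate)."""
    return abs(end - begin) + 1


# GENE CI (a reversed gene) begins at -82.
print(resolve(-82, 20), resolve(-82, -10), length(-62, -92))
# GENE CRO begins at 19.
print(resolve(19, -20), resolve(19, 10), length(-1, 29))
```

    Both requests give 31 bases, in agreement with the sizes reported by
    PARSE on page 6.5.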
(* end module page.6.4.3 *)
(* begin module page.6.4.3.1 *)
         EXAMPLE 3: LIST

   80/02/23 16:31:13     DELILA 1.14     PASS 1           PAGE 1

      1     1  (* EXAMPLE 3 *)
      2     1  TITLE "GENE STARTS";
      3     2  DEFAULT NUMBERING PIECE;
      4     3  DEFAULT NUMBERING 0;
      5     4  DEFAULT KEY NOTE OFF;
      6     5  DEFAULT KEY GENE OFF;
      7     6
      8     6  NOTE "GENE INITIATION REGIONS OF LAMBDA.";
      9     7  ORGANISM LAMBDA;
     10     8  CHROMOSOME LAMBDA;
     11     9  GENE CI; (* A REVERSED GENE *)
     12    10  GET FROM GENE BEGINNING +20 TO GENE BEGINNING -10;
     13    11  GENE CRO;
     14    12  GET FROM GENE BEGINNING -20 TO GENE BEGINNING +10;
     15    13  GENE CII;
     16    14  GET FROM GENE BEGINNING -20 TO GENE BEGINNING +10;
     17    15  GENE O;
     18    16  GET FROM GENE BEGINNING -20 TO GENE BEGINNING +10;
     19    17
     20    17  NOTE "GENE INITIATION REGIONS OF ECOLI.";
     21    18  ORGANISM ECOLI;
     22    19  CHROMOSOME ECOLI;
     23    20  GENE LACI;
     24    21  GET FROM GENE BEGINNING -20 TO GENE BEGINNING +10;
     25    22  GENE LACZ;
     26    23  GET FROM GENE BEGINNING -20 TO GENE BEGINNING +10;
     27    24  GENE LIPO;
     28    25  GET FROM GENE BEGINNING -20 TO GENE BEGINNING +10;

(* end module page.6.4.3.1 *)
(* begin module page.6.4.3.2 *)
         EXAMPLE 3: LIST (continued)

    80/02/23 16:31:13     DELILA 1.14     PASS 2           PAGE 2

      1     1  (* EXAMPLE 3 *)
      2     1  TITLE "GENE STARTS";
      3     2  DEFAULT NUMBERING PIECE;
      4     3  DEFAULT NUMBERING 0;
                                 ^0
      5     4  DEFAULT KEY NOTE OFF;
      6     5  DEFAULT KEY GENE OFF;
      7     6
      8     6  NOTE "GENE INITIATION REGIONS OF LAMBDA.";
      9     7  ORGANISM LAMBDA;
     10     8  CHROMOSOME LAMBDA;
     11     9  GENE CI; (* A REVERSED GENE *)
     12    10  GET FROM GENE BEGINNING +20 TO GENE BEGINNING -10;
                 #0
                                     ^-82^20^-62           ^-82^-10^-92
     13    11  GENE CRO;
     14    12  GET FROM GENE BEGINNING -20 TO GENE BEGINNING +10;
                 #1
                                     ^19 ^-20^-1           ^19 ^10^29
     15    13  GENE CII;
     16    14  GET FROM GENE BEGINNING -20 TO GENE BEGINNING +10;
                 #2
                                     ^338^-20^318          ^338^10^348
     17    15  GENE O;
     18    16  GET FROM GENE BEGINNING -20 TO GENE BEGINNING +10;
                 #3
                                     ^664^-20^644          ^664^10^674
     19    17
     20    17  NOTE "GENE INITIATION REGIONS OF ECOLI.";
     21    18  ORGANISM ECOLI;
     22    19  CHROMOSOME ECOLI;
     23    20  GENE LACI;
     24    21  GET FROM GENE BEGINNING -20 TO GENE BEGINNING +10;
                 #4
                                     ^29 ^-20^9            ^29 ^10^39
     25    22  GENE LACZ;
     26    23  GET FROM GENE BEGINNING -20 TO GENE BEGINNING +10;
                 #5
                                     ^1234^-20^1214        ^1234^10^1244
     27    24  GENE LIPO;
     28    25  GET FROM GENE BEGINNING -20 TO GENE BEGINNING +10;
                 #6
                                     ^39 ^-20^19           ^39 ^10^49
(* end module page.6.4.3.2 *)
(* begin module page.6.4.3.3 *)
         EXAMPLE 3: BOOK

 * 80/02/23 16:31:13, 80/02/18 16:03:38, GENE STARTS
 ORGANISM
 * LAMBDA
 * BACTERIOPHAGE LAMBDA
 NOTE
 * GENE INITIATION REGIONS OF LAMBDA.
 NOTE
 * CENTIMORGAN
 CHROMOSOME
 * LAMBDA
 * CHROMOSOME OF LAMBDA (DNA)
 * 0.00
 * 70.00
 PIECE
 * CIO
 * BEFORE CI TO THE END OF O
 NOTE
 * # 0
 NOTE
 * 50.00
 * LINEAR
 * +
 * -816
 * 1668
 * LINEAR
 * -
 * -62
 * -92
 DNA
 * TGCGGTGATAGATTTAACGTATGAGCACAAA
 DNA
 PIECE
 PIECE
 * CIO
 * BEFORE CI TO THE END OF O
 NOTE
 * # 1
 NOTE
 * 50.00
 * LINEAR
 * +
 * -816
 * 1668
 * LINEAR
 * +
 * -1
 * 29
 DNA
 * GCATGTACTAAGGAGGTTGTATGGAACAACG
 DNA
 PIECE
 PIECE
 * CIO
 * BEFORE CI TO THE END OF O
(* end module page.6.4.3.3 *)
(* begin module page.6.4.3.4 *)
         EXAMPLE 3: BOOK (continued)

 NOTE
 * # 2
 NOTE
 * 50.00
 * LINEAR
 * +
 * -816
 * 1668
 * LINEAR
 * +
 * 318
 * 348
 DNA
 * ATCTAAGGAAATACTTACATATGGTTCGTGC
 DNA
 PIECE
 PIECE
 * CIO
 * BEFORE CI TO THE END OF O
 NOTE
 * # 3
 NOTE
 * 50.00
 * LINEAR
 * +
 * -816
 * 1668
 * LINEAR
 * +
 * 644
 * 674
 DNA
 * ATCTATCAACAGGAGTCATTATGACAAATAC
 DNA
 PIECE
 CHROMOSOME
 ORGANISM
 ORGANISM
 * ECOLI
 * ESCHERICHIA COLI
 NOTE
 * GENE INITIATION REGIONS OF ECOLI.
 NOTE
 * MINUTES
 CHROMOSOME
 * ECOLI
 * MAIN CHROMOSOME OF E. COLI
 * 0.00
 * 100.00
 PIECE
 * LAC
 * LAC I TO LAC Z
 NOTE
 * # 4
 NOTE
(* end module page.6.4.3.4 *)
(* begin module page.6.4.3.5 *)
         EXAMPLE 3: BOOK (continued)

 * 7.90
 * LINEAR
 * -
 * -49
 * 1419
 * LINEAR
 * +
 * 9
 * 39
 DNA
 * GTCAATTCAGGGTGGTGAATGTGAAACCAGT
 DNA
 PIECE
 PIECE
 * LAC
 * LAC I TO LAC Z
 NOTE
 * # 5
 NOTE
 * 7.90
 * LINEAR
 * -
 * -49
 * 1419
 * LINEAR
 * +
 * 1214
 * 1244
 DNA
 * ATTTCACACAGGAAACAGCTATGACCATGAT
 DNA
 PIECE
 PIECE
 * LIPO
 * ENTIRE LIPOPROTEIN AND ITS TRANSCRIPT
 NOTE
 * # 6
 NOTE
 * 36.50
 * LINEAR
 * +
 * -364
 * 449
 * LINEAR
 * +
 * 19
 * 49
 DNA
 * CAATCTAGAGGGTATTAATAATGAAAGCTAC
 DNA
 PIECE
 CHROMOSOME
 ORGANISM
(* end module page.6.4.3.5 *)
(* begin module page.6.4.4 *)
         EXAMPLE 4: OTHER DELILA INSTRUCTIONS

 (*         EXAMPLE 4:  OTHER DELILA INSTRUCTIONS *)

    TITLE "16S RNA VS WONDEROUS BEASTS";
         (* BY THE WIZARD OF ID,
            JANUARY 1, 2001 *)

    (* SET DEFAULTS *)
    DEFAULT OUT-OF-RANGE REDUCE-RANGE;
    DEFAULT KEY TRANSCRIPT OFF;
    DEFAULT KEY NOTE OFF;
    DEFAULT KEY MARK OFF;

    (* GET THE 16S RIBOSOMAL RNA *)
    ORGANISM ECOLI;
    CHROMOSOME ECOLI;
    TRANSCRIPT 16SRRNAC;
    GET ALL TRANSCRIPT;

    (* GET THE FIRST BEAST *)
    ORGANISM PHIX174;
    CHROMOSOME PHIX174;
    PIECE PHIX174;
    (* GET THE CIRCULAR DNA AND ALL GENES
       (SINCE MARKERS AND TRANSCRIPTS ARE OFF) *)
    GET ALL PIECE CUT (TRANSCRIPT A) BEGIN +7;

    (* RESET NOTES *)
    DEFAULT KEY NOTE ON;

    (* GET THE NEXT BEAST *)
    ORGANISM SCHMOO;
    CHROMOSOME I;
    TRANSCRIPT SUMET;
    GENE SCHMIN;
    GET FROM GENE BEGIN TO TRANSCRIPT END;
    GET FROM (GENE CILL) END -3
    TO (GENE SCHMBR) BEGIN +8;

    (* GET 50 BASES OF DNA, STARTING AT THE BEGINNING OF THE
       COORDINATE SYSTEM OF THE PIECE IMPLIED BY THE PREVIOUS REQUESTS *)
    GET FROM COORDINATE BEGINNING
          TO COORDINATE BEGINNING +49;

    (* TRY A COMPLEX REQUEST *)
    IF SIZE(TRANSCRIPT BEGIN TO GENE BEGIN)>=150
         THEN GET FROM GENE BEGIN -150
              TO  GENE BEGIN +150;
         ELSE GET FROM TRANSCRIPT BEGIN
              TO  TRANSCRIPT BEGIN +300;

    (* IF WE WANTED A RESTRICTION ENZYME SITE, WE WOULD USE THESE COMMANDS: *)
    RECOGNITION-CLASS RESTRICT;
    ENZYME ECOR1;
    (* THE DEFAULT WILL GET SNIP5 EXPANDED INTO ALL COMBINATIONS;
       TO GET THE CLEAVAGE SITES, WE DO: *)
    DEFAULT SITE CLEAVE ON;
    ENZYME SNIP5;
(* end module page.6.4.4 *)
(* begin module page.6.5 *)
         EXAMPLE OF AN AUXILIARY PROGRAM: PARSE

         On this page is an example of the output of a program that
      reads a book, in this case, from example 3.  This one is called
      "parse" since it parses the book and prints out the parts.  Note
      that the ATG's in the sequences all line up, indicating that all the
      programs worked correctly (but not proving it!).


 * 80/02/23 16:31:13, 80/02/18 16:03:38, GENE STARTS
 #   0 (CIO       )  BEGIN:    -62; END:    -92 (    31BP) <TGCGGTGATAGATTTAACGT
ATGAGCACAAA>
 #   1 (CIO       )  BEGIN:     -1; END:     29 (    31BP) <GCATGTACTAAGGAGGTTGT
ATGGAACAACG>
 #   2 (CIO       )  BEGIN:    318; END:    348 (    31BP) <ATCTAAGGAAATACTTACAT
ATGGTTCGTGC>
 #   3 (CIO       )  BEGIN:    644; END:    674 (    31BP) <ATCTATCAACAGGAGTCATT
ATGACAAATAC>
 #   4 (LAC       )  BEGIN:      9; END:     39 (    31BP) <GTCAATTCAGGGTGGTGAAT
GTGAAACCAGT>
 #   5 (LAC       )  BEGIN:   1214; END:   1244 (    31BP) <ATTTCACACAGGAAACAGCT
ATGACCATGAT>
 #   6 (LIPO      )  BEGIN:     19; END:     49 (    31BP) <CAATCTAGAGGGTATTAATA
ATGAAAGCTAC>
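
      The alignment noted above can also be checked mechanically.  The
      sketch below is not part of PARSE.  It takes the seven sequences
      of example 3 and looks at the codon that follows the 20 bases
      requested upstream of each gene beginning.  The set of accepted
      initiators is an assumption (piece 4, LACI, shows GTG at that
      position).

```python
# The seven pieces of example 3, copied from the book.
PIECES = [
    "TGCGGTGATAGATTTAACGTATGAGCACAAA",
    "GCATGTACTAAGGAGGTTGTATGGAACAACG",
    "ATCTAAGGAAATACTTACATATGGTTCGTGC",
    "ATCTATCAACAGGAGTCATTATGACAAATAC",
    "GTCAATTCAGGGTGGTGAATGTGAAACCAGT",
    "ATTTCACACAGGAAACAGCTATGACCATGAT",
    "CAATCTAGAGGGTATTAATAATGAAAGCTAC",
]

UPSTREAM = 20                  # bases requested before GENE BEGINNING
INITIATORS = {"ATG", "GTG"}    # assumption: codons accepted as starts


def start_codon(sequence, upstream=UPSTREAM):
    """Return the three bases at the gene beginning."""
    return sequence[upstream:upstream + 3]


for number, sequence in enumerate(PIECES):
    print(number, len(sequence), start_codon(sequence))
```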
(* end module page.6.5 *)
(* begin module page.6.6 *)
         EXAMPLE OF AN AUXILIARY PROGRAM: LISTER

    Like PARSE, LISTER shows the DNA in a book.  However, LISTER shows
    the correct numbering.  Three modes are available in this version:
         0 - list DNA
         1 - list DNA, predicting potential peptides (as below)
         2 - list DNA, showing all possible amino acids.

    The Delila instructions were:

    TITLE "LIPO TRANSCRIPT";
    ORGANISM ECOLI; CHROMOSOME ECOLI; TRANSCRIPT LIPO; GET ALL TRANSCRIPT;

    As can be seen in the example library, the lipoprotein begins at 39.
    Note that numbering is not ambiguous, and that all stop codons are listed.
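
    The translation into three reading frames can be sketched as below.
    This is an illustration in Python, not the LISTER code.  It uses the
    standard genetic code with one-letter amino acid symbols, so the
    initiator appears as M rather than FMET, and it names the stop codons
    as LISTER does (TAA = OCHRE, TAG = AMBER, TGA = OPAL).  Unlike mode 1,
    it prints every codon of each frame.

```python
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")

# Standard genetic code, built from the usual TCAG ordering.
CODON = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOP_NAMES = {"TAA": "OCHRE", "TAG": "AMBER", "TGA": "OPAL"}


def translate(dna, frame=0):
    """Translate one reading frame (frame = 0, 1 or 2) of a DNA string."""
    result = []
    for i in range(frame, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        result.append(STOP_NAMES.get(codon, CODON[codon]))
    return result


# First 60 bases of the LIPO transcript, from the listing.
LIPO = ("GCTACATGGAGATTAACTCAATCTAGAGGGTATTAATAAT"
        "GAAAGCTACTAAACTGGTAC")
for frame in range(3):
    print(frame, " ".join(translate(LIPO, frame)))
```

    In the third frame the thirteenth codon is the ATG at coordinate 39,
    where the lipoprotein begins.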


 * 80/06/08 19:13:26, 80/02/18 16:03:38, LIPO TRANSCRIPT
 BOOK LISTER 2.18.  A '/' MEANS END OF COORDINATES.

 ORGANISM ECOLI; CHROMOSOME ECOLI;
 PIECE LIPO       #4  CONFIG: LINEAR  DIRECTION: +  BEGIN: 1,  END: 322

         *         *10       *         *20       *         *30       *         *
40       *         *50       *         *60
 G C T A C A T G G A G A T T A A C T C A A T C T A G A G G G T A T T A A T A A T
 G A A A G C T A C T A A A C T G G T A C
  ALA - THR - TRP - ARG - LEU - THR - GLN - SER - ARG - GLY - TYR -OCHRE-OCHRE-O
PAL -     -     -     -     -     -     -
    LEU - HIS - GLY - ASP -OCHRE-     -     -     -     -     -     -     -
-     -     -     -OCHRE-     -     -     -
      TYR - MET - GLU - ILE - ASN - SER - ILE -AMBER-     -     -     -     -FME
T - LYS - ALA - THR - LYS - LEU - VAL - LEU -

         *         *70       *         *80       *         *90       *         *
100      *         *110      *         *120
 T G G G C G C G G T A A T C C T G G G T T C T A C T C T G C T G G C A G G T T G
 C T C C A G C A A C G C T A A A A T C G
      -     -     -OCHRE-     -     -     -     -     -     -     -     -     -
    -     -     -     -     -     -     -
        -     -     -     -     -     -     -     -     -     -     -     -
-     -     -     -     -OCHRE-     -     -
      GLY - ALA - VAL - ILE - LEU - GLY - SER - THR - LEU - LEU - ALA - GLY - CY
S - SER - SER - ASN - ALA - LYS - ILE - ASP -

         *         *130      *         *140      *         *150      *         *
160      *         *170      *         *180
 A T C A G C T G T C T T C T G A C G T T C A G A C T C T G A A C G C T A A A G T
 T G A C C A G C T G A G C A A C G A C G
      -     -     -     -     -     -     -     -     -OPAL -     -     -     -
    -     -     -OPAL -     -     -     -
        -     -     -     -OPAL -     -     -     -     -     -     -OCHRE-
-OPAL -     -     -     -     -     -     -
      GLN - LEU - SER - SER - ASP - VAL - GLN - THR - LEU - ASN - ALA - LYS - VA
L - ASP - GLN - LEU - SER - ASN - ASP - VAL -

         *         *190      *         *200      *         *210      *         *
220      *         *230      *         *240
 T G A A C G C A A T G C G T T C C G A C G T T C A G G C T G C T A A A G A T G A
 C G C A G C T C G T G C T A A C C A G C
 OPAL -     -     -     -     -     -     -     -     -     -     -     -FMET -
THR - GLN - LEU - VAL - LEU - THR - SER -
        -     -     -     -     -     -     -     -     -     -OCHRE-     -OPAL
-     -     -     -     -OCHRE-     -     -
      ASN - ALA - MET - ARG - SER - ASP - VAL - GLN - ALA - ALA - LYS - ASP - AS
P - ALA - ALA - ARG - ALA - ASN - GLN - ARG -

         *         *250      *         *260      *         *270      *         *
280      *         *290      *         *300
 G T C T G G A C A A C A T G G C T A C T A A A T A C C G C A A G T A A T A G T A
 C C T G T G A A G T G A A A A A T G G C
  VAL - TRP - THR - THR - TRP - LEU - LEU - ASN - THR - ALA - SER - ASN - SER -
THR - CYS - GLU - VAL - LYS - ASN - GLY -
        -     -     -     -     -     -OCHRE-     -     -     -     -     -
-     -FMET - LYS -OPAL -     -FMET - ALA -
      LEU - ASP - ASN - MET - ALA - THR - LYS - TYR - ARG - LYS -OCHRE-AMBER-
  -     -OPAL -     -     -     -     -     -

         *         *310      *         *320
 G C A C A T T G T G C G A C A T T T T T T T
  ALA - HIS - CYS - ALA - THR - PHE - PHE -
    HIS - ILE - VAL - ARG - HIS - PHE - PHE -
          -     -     -     -     -     -
(* end module page.6.6 *)
(* begin module page.a.1 *)
         APPENDIX 1:  DATA BASING TECHNIQUES

    The library structure presented in the definition has a rigid order in
    the keys (but not for substructures of a structure).  The
    important mental concept (schema) for the library can be reached by:

 1) elimination of the words surrounding each structure while retaining
    the tree structure.  These words allow one to store a tree linearly, but a
    tree can be stored in other ways, for example with pointers.

 2) allowing both the structures and keys to have any order.
    In this definition, structures are in any order, while keys have a fixed
    order.  To allow keys to have any order, one must identify each key
    by name and provide a slot for each value (attribute).
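
    Point 2 can be illustrated with a short sketch.  The values are the
    piece key of example 1.  The key names are labels invented here for
    illustration and are not defined by this document.

```python
# Fixed order: the meaning of each value is given only by its position.
positional = ["7.90", "LINEAR", "-", "-49", "1419",
              "LINEAR", "+", "1196", "1419"]

# Hypothetical names for the nine slots of a piece key.
KEY_NAMES = [
    "map_beginning",
    "coordinate_configuration", "coordinate_direction",
    "coordinate_beginning", "coordinate_ending",
    "piece_configuration", "piece_direction",
    "piece_beginning", "piece_ending",
]

# Any order: each value is stored with the name of its key.
named = dict(zip(KEY_NAMES, positional))

# The same record written in reverse order still means the same thing.
reordered = dict(reversed(list(named.items())))
print(reordered == named, named["piece_beginning"])
```

    The cost of the second form is that every key must carry its name.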

       The librarian program can provide "uncoupling" so that one could change
    the format of the library and keep the same format book, or change the
    book format without the need to change the library.  (In both cases, the
    librarian would change.)

       One could also define and insert new forms of data into the library
    without changing the definition of the book.  So the books could retain
    the format given, while the entire format of the library is altered.
    (Meanwhile, new routines would create the new books.)


       From COMPUTER DATA-BASE ORGANIZATION
            by James Martin, Prentice-Hall, 1977:

    data independence.  The property of being able to change the overall
       logical or physical structure of the data without changing the
       application program's view of the data.

    logical data independence.  The property of being able to change the
       overall logical structure of the data base (schema) without changing the
       application program's view of the data.

    physical data independence.  The property of being able to change the
       physical structure of the data without changing the logical structure.
(* end module page.a.1 *)
(* begin module page.a.2 *)
         APPENDIX 2:  IMPLEMENTATION

 1) Interactive vs Compile
       Notes on Interactive Librarian (IL)
    a) the IL shows a list of organisms.  One "points" to one of them,
    by hand or by name or by pen, and the IL shows the header key and then
    the chromosomes.  One continues to specify and move up and down the
    library tree.  One can look at pieces of DNA, and save the results in
    a book.
         One can walk around and explore the tree.

    b) Files of Compile Librarian:
         LIBRARY,CATALOGUE,INSTRUCTIONS,INSTRUCTION LISTING,BOOK,OUTPUT
    output is used to flag Librarian run time errors only.  Errors in
    INSTRUCTIONS go into the INSTRUCTION LISTING.

    Files for Interactive Librarian:
         LIBRARY,CATALOGUE,INPUT(INSTRUCTIONS),OUTPUT(TERMINAL),BOOK
    errors,interactive messages on output.

    c) Can we have the interactive language the same as the instruction
       set defined so far?
       One difference: the ability to look at DNA and information
       without sending it to a book.

 2) Method to invert the order of a sequence:
    define a structure which contains a packed array of bases (allows
    2 bits/nucleotide) say, n=1000 bases long.
         use the function new(p);  (Pascal User Man. p. 105)
    to allocate as much memory as is needed for the particular
    sequence. If n=1000, and the sequence is 6408 bases long,
    one would need to allocate 7 of the arrays.
    This allows any length sequence to be inverted (up to computer limits)
    without requiring storage of one huge array all the time.

 3) The library can be divided into several files to reduce search time.
    This will not change the definition:   It will be completely
    TRANSPARENT to a user (s/he need not know about this.)
    The catalogue should take care of what data are in which file.

 4) There should be methods to generate and insert sequences into the library;
    methods to modify or correct the library.
    Specification just as in instructions, prompting for keys, ability
    to delete parts. (Library Editor)

 INSERTS ---------->:                    :-------> NEW LIBRARY
                    :    CATALOGUE       :
                    :-------- * -------->:-------> NEW CATALOGUE FOR LIBRARIAN
                    :      PROGRAM       :
 OLD LIBRARY ------>:                    :-------> NEW CATALOGUES FOR HUMANS

 5) In our librarian:
    a) The family concept is followed (see catalogue definition), so
       that all marker, transcript and gene references are to pieces of DNA
       below them in the library.
    b) Each library is a file.  The librarian accesses several files at once.
    c) No organisms or recognition classes are split between files.
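
    The method of item 2 (inverting a sequence held in fixed-size packed
    arrays) can be sketched as follows.  This is a Python illustration of
    the idea, not the Pascal of the librarian.  The 2-bit code assigned
    to each base and all the names are assumptions.  Only the order is
    inverted here; complementing the bases would be a separate step.

```python
import math

BASES = "ACGT"   # 2-bit codes 0..3 (this assignment is an assumption)
CHUNK = 1000     # bases per array, the n=1000 of item 2


def chunks_needed(length, chunk=CHUNK):
    """Number of arrays to allocate for a sequence of this length."""
    return math.ceil(length / chunk)


def pack(sequence, chunk=CHUNK):
    """Pack a sequence into arrays of up to `chunk` bases, 4 bases per byte."""
    blocks = []
    for start in range(0, len(sequence), chunk):
        part = sequence[start:start + chunk]
        block = bytearray((len(part) + 3) // 4)
        for i, base in enumerate(part):
            block[i // 4] |= BASES.index(base) << (2 * (i % 4))
        blocks.append(block)
    return blocks


def base_at(blocks, position, chunk=CHUNK):
    """Read one base (0-based position) back out of the packed arrays."""
    block = blocks[position // chunk]
    i = position % chunk
    return BASES[(block[i // 4] >> (2 * (i % 4))) & 3]


def inverted(blocks, length, chunk=CHUNK):
    """Return the sequence in inverted order by reading from the far end."""
    return "".join(base_at(blocks, p, chunk)
                   for p in range(length - 1, -1, -1))


sequence = "ACGTTGCA" * 801            # 6408 bases, as in the text
blocks = pack(sequence)
print(len(sequence), chunks_needed(len(sequence)), len(blocks))
```

    Only as many arrays are allocated as the sequence requires, so short
    sequences do not pay for the longest one.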
(* end module page.a.2 *)
(* begin module page.a.3 *)
         APPENDIX 3: LIBRARY DESIGN PHILOSOPHY

    Our librarian code was written to allow easy expansion and development.
    1)   A single document (this one) defines the library.
         This allows design of a system similar to ours de novo (without
         our code).  Most of this document was written prior to ANY of the
         library code.  It is updated continuously.
         The code was written in a form that mirrors this definition,
         so it was quite easy to write, even though it is over 4400 lines long.

    2)   The code itself is heavily commented (around 900 lines of comment
         in 4000 lines of code, or 20%) to allow others to understand it.

    3)   The code was intentionally written for clarity and ease of
         transport.  Top-down structuring is emphasized: the librarian
         contains around 180 procedures and functions.

    4)   The language used was PASCAL.


    REQUIREMENTS MET IN THIS DNA SEQUENCE LIBRARIAN
    1)   General applicability to diverse problems - flexibility.
    2)   The sequence request language must be close to common terminology.
    3)   Sequences are numbered as in the original paper when possible.
         (The major exceptions are numbering systems that do not include a
         zero: linear fragments of circles can have complex numbering, so
         why make things more complex than they must be already?  See the
         NOTES ON LIBRARY DEFINITION.)
    4)   Ease of use, simple commands.
    5)   Reasonably efficient access to large data base.
    6)   Reduction of library redundancy to increase consistency of
         results:  only one sequence is stored, complements and
         subsections of the sequence are CALCULATED:  The librarian must
         handle all possible DNA fragments: both linear and circular,
         in both orientations.
    7)   Portability to other computers.


    SOME PROPERTIES
         The library system handles sets of DNA sequence fragments (books).
    This is its source of power.  With this system, one has the ability
    to create a special format library from the original library since
    a book is identical to a library.  We have used this feature to create
    a library that contains only the transcripts of the original library.
    (That assures that books contain only information coded in RNA.)
    This also allows a recursive use of books as libraries for searches.
    Alternatively, the librarian code is flexible enough that we
    could easily change it to generate books of a completely different
    format than that of the library.  Obviously all we needed to do was
    write the proper book reading/writing routines, and we got the
    best of both worlds.
(* end module page.a.3 *)
(* begin module page.to.do *)
         THINGS TO DO AND QUESTIONS TO ANSWER

 1) Make an alphabetized BNF of the Delila instructions.

 2) Make a pure BNF for the library definition.


 Tab sets for this file (LIBDEF) are:
 5,10,15,20,25,30,35,40,45,50
(* end module page.to.do *)