(* begin module version *)
version = 1.04 of libdef 1987 February 10
Last major changes: 1980 June 9

LIBDEF
(* end module version *)

(* begin module page.0 *)
ORGANISM and RECOGNITION CLASS LIBRARY DEFINITION:
a DNA sequence data-base

THOMAS SCHNEIDER, MCD Biology
University of Colorado, Boulder Colorado
COPYRIGHT (C) 1987

This document defines a way of organizing and storing the DNA sequence
information from many organisms.  The library of sequences is structured
so that a person can easily pull out sets of sequences for study.  The
characteristics of these sequence sets, called "books", are also
described.  One can obtain a book using a language called Delila:
DEoxyribonucleic acid LIbrary LAnguage.  The structure of the library
also allows easy updating and expansion as new sequences are obtained.
Feel free to refer to the examples (pages 6.1 to 6.6).

page 1.1      LIBRARY DEFINITION
page 1.2      NOTES ON THE LIBRARY DEFINITION
page 2.1      BOOK DEFINITION
page 2.2      NOTES ON THE BOOK DEFINITION
page 3        CATALOGUE DEFINITION
page 4        LIBRARIAN
page 5.1      DELILA INSTRUCTIONS
page 5.2      NOTES ON THE DELILA INSTRUCTION DEFINITION
page 6.1      EXAMPLE LIBRARY
page 6.2      EXAMPLE CATALOGUE FOR HUMANS
page 6.3      EXAMPLE CATALOGUE FOR THE LIBRARIAN
page 6.4      EXAMPLES OF DELILA INSTRUCTIONS AND BOOKS
page 6.4.1    EXAMPLE 1: lacZ TRANSCRIPT
page 6.4.2    EXAMPLE 2: cI AND cro GENES
page 6.4.3    EXAMPLE 3: GENE STARTS
page 6.4.4    EXAMPLE 4: OTHER DELILA INSTRUCTIONS
page 6.5      EXAMPLE OF AN AUXILIARY PROGRAM: PARSE
page 6.6      EXAMPLE OF AN AUXILIARY PROGRAM: LISTER
page a.1      APPENDIX 1: DATA-BASING TECHNIQUES
page a.2      APPENDIX 2: IMPLEMENTATION
page a.3      APPENDIX 3: LIBRARY DESIGN PHILOSOPHY
page to do 1  THINGS TO DO AND QUESTIONS TO ANSWER
(* end module page.0 *)

(* begin module page.1.1.1 *)
LIBRARY DEFINITION

SCHEMA

Below is an overview of the structure of the library.
A-->>--B means A has one or more of B.
C--->--D means C has one of D.

   LIBRARY            -->>--  ORGANISM
   LIBRARY            -->>--  RECOGNITION-CLASS
   ORGANISM           -->>--  CHROMOSOME
   CHROMOSOME         -->>--  MARKER
   CHROMOSOME         -->>--  TRANSCRIPT
   CHROMOSOME         -->>--  GENE
   CHROMOSOME         -->>--  PIECE
   MARKER             --->--  DNA
   MARKER             --->--  PIECE  (by reference)
   TRANSCRIPT         --->--  PIECE  (by reference)
   GENE               --->--  PIECE  (by reference)
   PIECE              --->--  DNA
   RECOGNITION-CLASS  -->>--  ENZYME
   ENZYME             -->>--  RECOGNITION-SITE

On the next three pages the structure of the library is defined.  This
is followed by explanatory notes.  Words surrounded by pointed brackets
(greater than and less than) are things to be defined.  We want to
define the <library>.
   ::=    means "is defined to be".
   . . .  means there can be none or more.
Lines starting with an * (e.g. "* KEY NAME OF STRUCTURE") represent
information stored in the library.  The definition is in a modified
Backus-Naur Form (BNF).
One could read the definition as: "A library consists of a line of
information which gives its dates of creation and title, and a set of
organisms and recognition classes.  An organism consists of a line with
the capitalized word ORGANISM, an organism key, and a set of
chromosomes . . ."
(* end module page.1.1.1 *)

(* begin module page.1.1.2 *)
LIBRARY DEFINITION (continued)

LIBRARY STRUCTURE

<library> ::= [ * DATE OF CREATION, DATE OF SOURCE LIBRARY, TITLE
              [
              [ <organism>
              [   . . .
              [
              [
              [ <recognition-class>
              [   . . .
              [

<organism> ::= [ ORGANISM
               [ <organism key>
               [
               [ <chromosome>
               [   . . .
               [
               [ ORGANISM

<chromosome> ::= [ CHROMOSOME
                 [ <chromosome key>
                 [
                 [ <marker>
                 [   . . .
                 [
                 [
                 [ <transcript>
                 [   . . .
                 [
                 [
                 [ <gene>
                 [   . . .
                 [
                 [
                 [ <piece>
                 [   . . .
                 [
                 [ CHROMOSOME

<marker> ::= [ MARKER
             [ <marker key>
             [ <DNA>
             [ MARKER

<transcript> ::= [ TRANSCRIPT
                 [ <transcript key>
                 [ TRANSCRIPT

<gene> ::= [ GENE
           [ <gene key>
           [ GENE
(* end module page.1.1.2 *)

(* begin module page.1.1.3 *)
LIBRARY DEFINITION (continued)

<piece> ::= [ PIECE
            [ <piece key>
            [ <DNA>
            [ PIECE

<DNA> ::= [ DNA
          [ * a string of A, C, G, or T, 5' to 3'
          [ * . . .
          [ DNA

<recognition-class> ::= [ RECOGNITION-CLASS
                        [ <recognition-class key>
                        [
                        [ <enzyme>
                        [   . . .
                        [
                        [ RECOGNITION-CLASS

<enzyme> ::= [ ENZYME
             [ <enzyme key>
             [
             [ <site>
             [   . . .
             [
             [ ENZYME

<site> ::= [ SITE
           [ * a string of A, C, G, T, PU, PY or N, 5' to 3'
           [ * with special characters (,/,),V,*,?
           [ * . . .
           [ SITE
(* end module page.1.1.3 *)

(* begin module page.1.1.4 *)
LIBRARY DEFINITION (continued)

LIBRARY KEYS

<header> ::= [ * KEY NAME OF STRUCTURE
             [ * FULL NAME OF STRUCTURE
             [ <note>

<note> ::= [ NOTE
           [ * NOTES AND SPECIAL INFORMATION
           [ * . . .
           [ NOTE

<organism key> ::= [ <header>
                   [ * GENETIC MAP UNITS

<chromosome key> ::= [ <header>
                     [ * GENETIC MAP BEGINNING (REAL NUMBER)
                     [ * GENETIC MAP ENDING (REAL NUMBER)

<reference> ::= [ * KEY NAME OF PIECE WHERE FOUND
                [ * GENETIC MAP BEGINNING ON COORDINATES (REAL NUMBER)
                [ * DIRECTION (+/-) RELATIVE TO COORDINATES
                [ * BEGINNING NUCLEOTIDE (INTEGER)
                [ * ENDING NUCLEOTIDE (INTEGER)

<marker key> ::= [ <header>
                 [ <reference>
                 [ * STATE (ON/OFF)
                 [ * PHENOTYPE

<transcript key> ::= [ <header>
                     [ <reference>

<gene key> ::= [ <header>
               [ <reference>

<piece key> ::= [ <header>
                [ * GENETIC MAP BEGINNING OF COORDINATES (REAL NUMBER)
                [ * COORDINATE: CONFIGURATION (CIRCULAR/LINEAR)
                [ * COORDINATE: DIRECTION (+/-) RELATIVE TO GENETIC MAP
                [ * COORDINATE: BEGINNING NUCLEOTIDE (INTEGER)
                [ * COORDINATE: ENDING NUCLEOTIDE (INTEGER)
                [ * PIECE     : CONFIGURATION (CIRCULAR/LINEAR)
                [ * PIECE     : DIRECTION (+/-) RELATIVE TO COORDINATES
                [ * PIECE     : BEGINNING NUCLEOTIDE (INTEGER)
                [ * PIECE     : ENDING NUCLEOTIDE (INTEGER)

<recognition-class key> ::= [ <header>

<enzyme key> ::= [ <header>
(* end module page.1.1.4 *)

(* begin module page.1.2.1 *)
NOTES ON THE LIBRARY DEFINITION
(refer to EXAMPLE LIBRARY)

1) The dates and title allow one to know what version of the library
   one is using.  The first date is the date the library was created or
   modified.  The second date is the date of creation of the previous
   version of this library.  The title is a short descriptive name of
   the library.

2) Organisms are organized by taxonomy.  At present the organisms are
   simply ordered.  Later, they could be subdivided by kingdom, phylum,
   etc.

3) If the entire DNA sequence of a chromosome is known, then that
   chromosome will have one piece.  If only parts are known, then each
   separate known piece is recorded.

4) Under all circumstances, only one strand of DNA is stored, and it is
   always 5' to 3'.  All RNA is stored like DNA, using T's (even RNA
   viruses).  DNA could be stored in binary format at 2 bits per
   nucleotide.

5) The various structures have a header with two names.  The KEY NAME
   (e.g. "ECOLI") is provided for rapid access to the structure.  The
   FULL NAME (e.g. "ESCHERICHIA COLI K12") provides more information
   than the KEY NAME.

6) The note keys are optional.  They could contain:
   a) the source of the information in the literature,
   b) locations of special modifications, etc.

7) The GENETIC MAP keys allow one to relate the DNA to the genetic map.
   They also allow one to store markers that have not yet been
   sequenced.
   GENETIC MAP UNITS: The units with which one records genetic
      distances.  These units can be real numbers, so all genetic
      positions are real numbers.
   GENETIC MAP BEGINNING/ENDING: The genetic range of this chromosome.
   GENETIC MAP BEGINNING: This number corresponds to the beginning of
      the coordinate system or the beginning of a structure such as a
      gene (see below).  This means that one need not know the exact
      relationship between base pair and genetic distances.

8) Each piece has its own numbering system, a consecutive set of integers.
   These numbers form a coordinate system relative to the genetic map.
   Four pieces of information orient the coordinate system relative to
   the genetic map:
   a) The CONFIGURATION of the coordinate system refers to its
      topological shape.  It is either CIRCULAR or LINEAR.  A fragment
      of a chromosome is LINEAR, while a CIRCULAR piece is stored as a
      linear series of letters.
   b) The DIRECTION defines the orientation of the numbering system
      with respect to the genetic map.  + means "in the same direction
      as".
   c,d) The BEGINNING and ENDING are integers which specify the limits
      of the coordinate system.  The value of ENDING is always greater
      than that of BEGINNING.
   The coordinate system provides a framework for stating the exact
   numbering of the bases in the DNA of the piece.  This also requires
   four pieces of information: configuration, direction, beginning and
   ending, all relative to the coordinate system.
(* end module page.1.2.1 *)

(* begin module page.1.2.2 *)
NOTES ON THE LIBRARY DEFINITION (continued)

8) (continued)
   Structures, such as genes, which refer to a piece using a reference
   also use the coordinate system.  In this way, these structures can
   point to stretches of DNA within a piece.
   One consequence of this system is that a circular piece may be
   stored in the library "rotated", so that the cut in the sequence (to
   allow linear storage) may be anywhere on the circle.
   Since the numbering is consecutive, numbering systems which have no
   zero are modified by adding 1 to all negative numbers.  This creates
   a zero and allows insertion into the library.

9) The marker, when applied to a piece (see the instructions for how to
   do it), deletes the bases between (but not including) its BEGINNING
   and ENDING, and then inserts its DNA in between.  Reference numbers
   one base outside the coordinate system may be used to modify DNA
   just at the edge of the coordinates.  Markers will handle all forms
   of deletion, insertion, mutation and splicing!!
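
The marker rule just described can be sketched in Python as follows.
This is only an illustration, not part of the library definition: the
name apply_marker and its arguments are hypothetical, the piece is
assumed to be LINEAR and to span its whole coordinate system, and
BEGINNING is assumed to be less than ENDING.

```python
def apply_marker(dna, first_base, marker_begin, marker_end, marker_dna):
    """Apply a marker to a LINEAR piece (hypothetical helper).

    dna          -- bases of the piece, 5' to 3'
    first_base   -- coordinate number of dna[0]
    marker_begin -- marker BEGINNING (this base is kept)
    marker_end   -- marker ENDING (this base is kept)
    marker_dna   -- bases inserted between BEGINNING and ENDING
    """
    last_base = first_base + len(dna) - 1
    # Assumption: limits may lie one base outside the numbering, which
    # is how DNA at the very edge can be modified.
    if not (first_base - 1 <= marker_begin < marker_end <= last_base + 1):
        raise ValueError("marker limits are outside the piece")
    left = dna[: marker_begin - first_base + 1]   # up to and including BEGINNING
    right = dna[marker_end - first_base:]         # from ENDING onward
    return left + marker_dna + right


piece = "ACGTACGT"                               # bases numbered 1 to 8
deletion = apply_marker(piece, 1, 3, 6, "")      # removes bases 4 and 5
insertion = apply_marker(piece, 1, 4, 5, "GG")   # inserts GG between 4 and 5
mutation = apply_marker(piece, 1, 2, 4, "A")     # replaces base 3 by A
edge = apply_marker(piece, 1, 0, 2, "")          # removes base 1 at the edge
```

Deletion, insertion and point mutation all reduce to the same
operation; only the limits and the inserted DNA differ.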
   In the library, all markers are OFF, meaning that the marker was not
   applied to the DNA in the piece.  When a marker is applied to the
   DNA its state will be ON.
   If a marker does not refer to any piece, then its NAME OF PIECE
   WHERE FOUND will be NONE.
   The marker PHENOTYPE allows one to store the effects of the marker.
   The PHENOTYPE is like a key name: it is short to allow rapid
   searches.

10) For the purposes of this library, a gene is considered to be the
    DNA limits of a translated region of a transcript (message, mRNA,
    or "gene product").  Every gene has a transcript and is translated
    to protein.  As a result, tRNA is considered to be a modified
    transcript, not a gene product.  A "transcript" refers to DNA which
    codes for RNA, and a "gene" refers to DNA which codes for protein.

11) The gene limits (BEGINNING to ENDING) go from the A of the ATG (or
    G of GTG) to the third base of the stop codon.

12) The recognition class allows storage of restriction enzyme
    recognition sites (e.g. ECORI, HAEIII), insertion sequence
    recognition sites or other sites recognized by enzymes.  Each
    enzyme has a set of sites that it can recognize.
       PU    = purine = A or G
       PY    = pyrimidine = C or T
       N     = any base = A or C or G or T
       *     = next base is modified
       V     = cleavage point of enzyme
       ?     = site sequence is unknown
       (X/Y) = base X or base Y is recognized

13) The first letter of each line in the library will be one of
       "*", "O", "C", "M", "T", "G", "P", "D", "R", "E", "S", "N".
    These letters make the library into a tree.  Note that the
    nucleotides are completely segregated from these letters or words.
    If the DNA is stored 2 bits/NT, then DNA sections of the library
    would not have lines, although other parts of the library still
    could.
(* end module page.1.2.2 *)

(* begin module page.1.2.3 *)
NOTES ON THE LIBRARY DEFINITION (continued)

14) The tree structure of the library allows automatic checking for proper library format.
    The structures which are surrounded by letters (see the previous
    note, not including "*") can each be read by individual
    subroutines.  The first letter triggers the subroutine call, and
    the last letter triggers return from the subroutine.  With this
    convention, access to any of the data is simple.  (See appendix 1.)

15) Structures have an order from highest (largest) to lowest
    (smallest).  For example, in order from superstructure to
    substructure one has:

          SUPERSTRUCTURE
                A
                :      ORGANISM
                :         :
                :     CHROMOSOME
                :         :
                :       MARKER
                :         :
                :        DNA
                V
           SUBSTRUCTURE

    Organism is a superstructure to (is larger than) chromosome, and so
    on.  The entire order is best seen in the schema.

16) There are three ways to store information about DNA:
    a) Storage of a DNA sequence (piece's DNA).
    b) Storage of a change to a DNA sequence (marker's DNA).
    c) Storage of the recognition of a DNA sequence (enzyme's site).
    These are represented by the three lowest leaves of the schema.
(* end module page.1.2.3 *)

(* begin module page.2.1 *)
BOOK DEFINITION

A book is a subset of the information contained in the library.  A book
is requested by a user who then writes programs to analyze the data
contained in the book.

<book> ::= [ * DATE OF WITHDRAWAL, DATE OF LIBRARY CREATION, BOOK TITLE
           [
           [ <organism>
           [   . . .
           [
           [
           [ <recognition-class>
           [   . . .
           [
(* end module page.2.1 *)

(* begin module page.2.2 *)
NOTES ON THE BOOK DEFINITION
(refer to EXAMPLE BOOK)

1) The first line of the book allows one to identify the book, and can
   be used as a header for programs that use books.  The dates are
   supplied by the librarian automatically, while the BOOK TITLE is
   written by the user.  This line has exactly the same form as the
   first line of a library.  (See appendix 3.)

2) The DATE OF LIBRARY CREATION allows one to keep track of the source
   library, as new versions of the library are made.  The DATE OF
   WITHDRAWAL allows one to distinguish books with the same title made
   a few moments apart.

3) Organism and recognition class have the same definition in both the library and a book.
   This means that
   a) any part of the library can be put into a book,
   b) programs that access books can have the same structure as those
      that access the library.  (However, see Appendix 1.)

4) A book will only contain the requested DNA sequence; all other
   information from the library will be suppressed.  This means that if
   one requests only a fragment of a piece from the library, then in
   the piece key of the book, the configuration will always be linear,
   and the nucleotide limits of the piece will NOT be the same as those
   in the library, since they will be the limits requested.  The
   coordinate system, however, will be copied faithfully from the
   library to the book.

5) A linear fragment of a circular piece will have a discontinuity in
   its numbering system if the fragment lies over the boundary of the
   coordinate system.

6) A book is formally identical to a library.  This allows one to
   create "sublibraries".  The only restriction to this is that
   duplicate names at the same level in the tree are not allowed.
   (See page 3.1 2b, 7 and also appendix 3.)
(* end module page.2.2 *)

(* begin module page.3.1 *)
CATALOGUE DEFINITION
(refer to EXAMPLE CATALOGUES)

The purpose of the catalogue program is to check the library, integrate
new structures into a new library made from the old, and to record the
structure of the new library in catalogues for both people and the
librarian program (see LIBRARIAN).

                                :-------> NEW LIBRARY
                CATALOGUE       :
OLD LIBRARY -------- * -------->:-------> NEW CATALOGUE FOR LIBRARIAN
                 PROGRAM        :
                                :-------> NEW CATALOGUES FOR HUMANS

The catalogue program:

1) Reads an "old" library, which has new insertions and modifications
   in it.

2) Checks:
   a) the library format: that tree structure and variable types are
      correct.  (See Library Definition Notes.)
   b) for duplicated key names among the direct substructures of each
      structure, and flags duplicates as incorrect (or changes their
      names to avoid duplications, see item 7 below).
   c) that genetic markers are within the range of the chromosome.
   d) that marker BEGINNING is not equal to its ENDING.
   e) that transcript direction is consistent with the transcript
      BEGINNING and ENDING nucleotides in linear pieces of DNA.
   f) that gene limits match the DNA code: BEGINNING must be ATG or
      GTG, ENDING must be TAG, TGA, or TAA.  (A check that the number
      of bases is a multiple of three can be done as long as splicing
      by markers in eukaryotes is taken into account.)
   g) that a piece exists when it is referred to by markers, genes or
      transcripts.
   h) that each coordinate system BEGINNING has a numerical value less
      than the coordinate system ENDING.
   i) that the number of bases implied by the coordinate system and the
      piece is exactly the number found.
   j) that LINEAR coordinate systems have only LINEAR pieces of DNA.
   k) that DNA contains only A, C, G, T.
   l) that sites contain only A, C, G, T, PU, PY, N, V, ?, * (and the *
      is followed by at least one more base), and (X/Y) where X and Y
      are members of A, C, G, T.

3) Produces a new library with the DNA and site data compactly
   reformatted.  This feature integrates newly inserted sections of the
   library with the old sections.  It may also rearrange the storage of
   items for efficient retrieval.  The dates are adjusted to show the
   time of creation of the new library, and the source library.
(* end module page.3.1 *)

(* begin module page.3.2 *)
CATALOGUE DEFINITION (continued)

4) Produces a new catalogue for the librarian (see later) which
   a) is structured in the "outline" form of the library,
   b) contains only the key name, since it is needed to identify a
      structure,
   c) contains the locations of those structures in the new library
      (e.g. file and lines from top of file if the organisms are spread
      over several computer files),
   d) is condensed so that each line (record) contains one "item": the
      opening key symbol (e.g. O for ORGANISM), name, file and file
      line.  This allows rapid scanning of the catalogue.
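
Checks 2f and 2k above could be sketched in Python as follows.  This is
a minimal illustration only: the function names are hypothetical, and
the gene's bases are assumed to have been extracted already, 5' to 3'
in the direction of the gene, from BEGINNING to ENDING.

```python
START_CODONS = {"ATG", "GTG"}
STOP_CODONS = {"TAG", "TGA", "TAA"}


def dna_is_valid(dna):
    """Check 2k: the DNA contains only A, C, G, T."""
    return set(dna) <= set("ACGT")


def gene_limits_are_valid(gene_dna):
    """Check 2f: the gene begins with ATG or GTG and ends with a stop codon.

    The multiple-of-three test is left out on purpose, since splicing by
    markers would have to be taken into account first.
    """
    if len(gene_dna) < 6 or not dna_is_valid(gene_dna):
        return False
    return gene_dna[:3] in START_CODONS and gene_dna[-3:] in STOP_CODONS


good_gene = gene_limits_are_valid("ATGAAATAA")
bad_start = gene_limits_are_valid("CTGAAATAA")
bad_stop = gene_limits_are_valid("ATGAAATAC")
```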

5) Produces a new catalogue for humans which
   a) is structured in the "outline" form of the library,
   b) may optionally eliminate some sections of information.  If
      listed, DNA should have tick marks and locations noted.
   c) may be indented and paged for easy reading.

6) Could produce an alternative catalogue in which the structures are
   grouped by their kind (all organism references together, all
   chromosome references together, etc.).  This could be alphabetized
   so that one could easily find items by hand.  A reference to the
   superstructure name would be useful.

7) A special option of the catalogue program allows one to turn books
   into libraries:
   a) the dates are not modified, since the book was presumably just
      created.
   b) duplicate names are eliminated.  This requires a set called a
      'family': those items that refer to a piece and the piece itself.
      The catalogue program considers all items within a chromosome
      prior (in the book) to each piece (and referring to the piece) as
      the family of that piece.  In this way it can decide how to
      rename a duplicate family name.
(* end module page.3.2 *)

(* begin module page.4.1 *)
LIBRARIAN

The librarian program takes a set of instructions and creates a book
from the library.  The librarian uses the catalogue to quickly find the
location of the requested keys.  The actual DNA nucleotides are then
determined from the keys in the library.

LIBRARY ----------->:                    :-------> BOOK
                    :     LIBRARIAN      :
CATALOGUE --------->:-------- * -------->:
                    :      PROGRAM       :
INSTRUCTIONS ------>:                    :-------> INSTRUCTION LISTING

INSTRUCTIONS TO THE LIBRARIAN

1) Give a title to the book.  (See the definition of book.)
   Default: none.  Date of withdrawal is still given.

2) Have a provision to suppress copying of note keys (and other keys)
   to the book.

3) Specify the various structure keys such as organism, chromosome,
   marker, etc. to be placed in the book.

4) Specify the pieces of DNA, fragments of pieces or enzyme sites to be
   placed in the book.

5) Provide a method for numbering the items in the book.

The INSTRUCTION LISTING has two parts:
   The first part lists the instructions in the form they are found in
   the INSTRUCTIONS, and indicates any errors found.
   If there are no errors, then the second part lists the instructions,
   but also indicates the numerical values of all the keys and values
   used in the instructions.  This allows one to check that one got
   what one wanted, or to find out the location of a structure.  It
   also will list any errors that arise while the parts of the book are
   being found, and the book is being printed.
(* end module page.4.1 *)

(* begin module page.5.1.1 *)
DELILA INSTRUCTIONS
(refer to EXAMPLE DELILA INSTRUCTIONS)

The DEoxyribonucleic acid LIbrary LAnguage

                 HUMAN
CATALOGUE ------- * -------> DELILA INSTRUCTIONS

As indicated by this diagram, the human is an important link in the
creation of a book (but that's only natch!).  One looks in the
catalogue to find the items one is interested in, then one writes a set
of Delila instructions and runs the librarian to obtain a book
containing them.

The BNF for instructions to the library is hard to read, but precise.
   A word in pointed brackets is a part of the instruction language
      whose definition can be found elsewhere in the BNF.
   ::= means "is defined to be".
   |   means "or".
   Words without brackets are part of the instruction language, taken
      literally.
All parts of the instructions are separated by blanks, although this is
not shown explicitly.  Anywhere there may be a blank, one could have a
comment.
DELILA INSTRUCTION SET DEFINITION (BNF) is a blank character or a comment is an option to leave out the part ::=0|1|2|3|4|5|6|7|8|9 ::=A|B|C|D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z ::=+|- ::=.|,|:|?|@|#|&|$|*|/|=|(|) ::=|||| ::= | ::= ::=| ::=| ::=| ::=| ::=""|'' ::= ::=(* *) (* end module page.5.1.1 *) (* begin module page.5.1.2 *) DELILA INSTRUCTION SET (continued) ::= ::=|TITLE; ::=| ::=|| || ::=; ::=| ::=ORGANISM|CHROMOSOME|MARKER|TRANSCRIPT|GENE|PIECE ::=RECOGNITION-CLASS|ENZYME ::=GET; ::=|| ::=FROMTO ::= ::=| ::= ::=COORDINATE|| ::=MARKER|TRANSCRIPT|GENE|PIECE ::=() ::=MARKER|TRANSCRIPT|GENE ::=BEGINNING|ENDING ::=| ::=ALL ::=| ::=CUT ::=EVERY ::=| ::=DIRECTION ::=|COMPLEMENT|HOMOLOGOUS| ::=|WITH() ::=| ::= (* end module page.5.1.2 *) (* begin module page.5.1.3 *) DELILA INSTRUCTION SET (continued) ::=IFTHENELSE ::= ::=<|<=|=|>=|>|<> ::= ::=SIZE() ::=| ::=DEFAULT; ::=| || ::=ON|OFF ::=KEY ::=NOTE|MARKER|TRANSCRIPT|GENE ::=SITE ::=EXPAND|MODIFY|CLEAVE ::=OUT-OF-RANGE ::=REDUCE-RANGE|CONTINUE|HALT ::=NUMBERING ::=|| ::=| ::=| ::=NOTE (* end module page.5.1.3 *) (* begin module page.5.2.1 *) NOTES ON THE DELILA INSTRUCTION DEFINITION (refer to EXAMPLE DELILA INSTRUCTIONS) 1) A consists of a series of instructions which tell the librarian how to make a book. Two types of instructions play a major role in Delila: a) Specification: this instruction allows one to move around the library tree. One starts at the top ("LIBRARY" in the schema), and moves by stating the type of structure one wants and its name. Example: ORGANISM ECOLI; One can only move one step down at a time, although one can move up any amount at one time. This means that some movements are not allowed, these are called Illegal Tree Traversals. b) Request: this instruction is a request for an item or piece. There are many ways to do this, so requests are complicated. There are three types: GET FROM TO ; GET ALL ; GET EVERY ; Where is somewhere on the DNA. 
   Three instructions of lesser importance are:
   c) IF: which allows Delila to make choices.
   d) Default resets: There are a number of variables which can be set
      by the user.  They all start out with values that I like, but if
      you don't like that, you can change them.
   e) Note insertion: The user may add notes to any node of a book with
      this command.

2) The title instruction, which is optional, defines the book title.
   The title must be in quotes.  Only the first typed line of the title
   is inserted into the book.  See the Book Definition.  If one does
   not use this instruction, then there will be no book title.

3) Each instruction is separated from other instructions by semicolons;
   this allows a flexible format.
(* end module page.5.2.1 *)

(* begin module page.5.2.2 *)
NOTES ON THE DELILA INSTRUCTION DEFINITION (continued)

4) The rules of specification are:
   (refer to notes on Library Definition, structures.)
   a) Specifying a structure allows access to the keys of that
      structure, for the purpose of requesting a piece or enzyme.
   b) Specifying a structure makes its substructures unspecified.
   c) Since a structure is specified by key name, all superstructures
      must have been previously specified, or there is an illegal tree
      traversal error.
   d) If a chromosome has been specified, then all those structures
      pointing to the piece can be specified independently of each
      other.  This allows one to use several structures to refer
      simultaneously to the same piece.
   e) When there can only be a single substructure to a structure, then
      that substructure is automatically specified.  Examples: DNA of
      marker and piece of marker.
   An exception to item c: one does not always need to specify the
   piece of DNA of markers, genes, or transcripts, since specification
   of their names will identify the piece through the reference to DNA.
   In the same way, one never specifies the DNA of markers and pieces
   or the sites of enzymes, since these are intimately linked together.
So specification of a recognition class and then several enzymes is all that is needed to get these structures, since each enzyme may have several recognition sites, all of which must be copied to the book. Specification of a marker will not cause its DNA changes to be made, and the STATE will be recorded as OFF in the book. See in the instruction for ON markers. WHAT KEYS ARE PRINTED IN THE BOOK AND WHEN: The organism, chromosome, marker, transcript, gene, recognition class and enzyme keys are printed in the book when a instruction is executed. (Respecification, which is specification within a request, will not cause any keys to be printed.) The only exceptions to this are that the piece key is printed by the command (GET, see next section), along with the DNA, and that ON markers are always printed, since the marker key contains information on what happened to the numbering system of the piece when the marker was applied. Marker, transcript or gene keys could also be printed as a result of a GET command, when a piece has been specified. In this case, all markers, transcripts and genes which intersect the fragment of the piece requested will be printed in the book (unless turned off by the default key, see later) followed by the fragment requested. In the same way, a request of a transcript will print all genes and markers which are in the same direction as the transcript unless they are turned off. A GET ALL command will print all substructure keys in the book (unless turned off by the default key). A GET EVERY command will print every example of at the current level of specification (unless turned off). (* end module page.5.2.2 *) (* begin module page.5.2.3 *) NOTES ON THE DELILA INSTRUCTION DEFINITION (continued) 5) Once a structure has been specified, requests for fragments of DNA or structures (such as an organism) can be made. Each request will print in the book a piece (fragment), enzyme or structure as defined in the library. 
   The DNA is always given 5' to 3'.  This means that complementary
   sequences can be obtained simply by switching the first and last
   limits requested: the librarian will provide the complement 5' to
   3'.  (For a circular chromosome, one would also switch the
   direction.)

   There are three kinds of requests:
   a) GET FROM TO allows one to choose the fragment of DNA.  Each end
      of the fragment is an absolute fixed location on the DNA, to
      which an offset may be added.  The piece will be LINEAR.  One can
      ask for four kinds of absolute locations.
        i. One may specify the number of a base.
      The other three allow reference by beginning or ending (a limit)
      of an object:
       ii. the coordinate system;
      iii. a marker, transcript, gene or piece limit, previously
           specified;
       iv. a marker, transcript or gene not previously specified.  This
           is a RESPECIFICATION, and it will not change any of the
           current specifications.
   b) GET ALL prints the entire structure in the book.  All
      substructure keys are also printed in the book (unless turned off by