PLNT4610/PLNT7690 Bioinformatics

DATABASE PROJECT - SCHEMA


A sample database schema

A database schema is a model of data. It defines

Brief description:

This schema describes the various transgenic plant lines in a laboratory collection. Each transgenic line is denoted by a Genotype object, which tells which Species the plant belongs to, which Plasmid was used to create the transgenic line, and which Seed_stocks are available for that line.

Each Experiment object documents the Seed_stocks produced, as well as the person (Author) who did the work. Publications for each Author can also be included.

IMPORTANT: Each class in the schema is an abstraction. It is NOT an example of an object. Thus, it would be incorrect for the schema to show the Seed_stock class with specific values in each field:

Seed_stock   AM312.Westar43
Experiment   AM312
Line         Westar
Container    Box12


The above is an object, not a class.
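
The same distinction can be pictured in a programming language. The following is only a Python analogy, not ACeDB syntax: the class name SeedStock and the lowercase field names are made up for illustration, and all fields are plain strings for simplicity (in a real schema, a field such as Experiment might instead be a pointer to another object).

```python
from dataclasses import dataclass, fields


@dataclass
class SeedStock:
    """The class: field names and types only, no values."""
    name: str        # the Seed_stock identifier
    experiment: str  # kept as text here for simplicity
    line: str
    container: str


# The object: one particular seed stock, with a value in every field.
stock = SeedStock(
    name="AM312.Westar43",
    experiment="AM312",
    line="Westar",
    container="Box12",
)

# The class knows only its field names; the object carries the values.
class_fields = [f.name for f in fields(SeedStock)]
print(class_fields)
print(stock)
```

A schema diagram corresponds to the class definition only. The values appear later, when objects are entered into the database.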



Fundamental Data Types

A database is built from a small number of fundamental data types. The data types available in ACeDB are listed in the table.

Type: Tag
Description: The name for a data field, as shown in the display. Tags cannot contain blanks.
Example: Common_name

Type: Text
Description: Short free text, often one word but never more than a single line.
Example: oilseed rape

Type: LongText
Description: Intended for long comments or descriptions.
Example: The library was sent by surface mail and therefore spent several days at room temperature. Some decrease in titre may have occurred.

Type: Int
Description: Positive or negative integer.
Examples: 17, -123, 0

Type: Float
Description: A real number in floating-point notation.
Examples: 0.58, 1.2e+09

Type: UNIQUE
Description: An enumerated choice. The data field can be set to only one of a small list of choices.
Examples:
          Linear
          Circular

Type: DateType
Description: Date and time. This type is an exception to normal ACeDB syntax. In the models.wrm file, a date field is declared without the usual ? preceding DateType, and the field must be declared UNIQUE. Example:

StartDate   UNIQUE DateType
FinishDate  UNIQUE DateType

Examples: 04-12-10, 2004-12-10, 04-12, 04, 04-12-10_13:23:00, 04-12_18:00, NOW, TODAY

Type: pointer to a Class
Description: Pointer to an object of another class. A class name cannot contain blanks. Pointers work like hypertext links.
Example: pBI121

See  ACEDB Manual (797k, PDF)   for full documentation.
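
For readers who know a programming language, a rough Python analogy may help in picturing how these types fit together. This is not ACeDB syntax. The class names, the size_bp field, and the genotype name "Westar43" are hypothetical, and the topology given for pBI121 is only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Topology(Enum):
    """Analogue of an enumerated choice: only these values are allowed."""
    LINEAR = "Linear"
    CIRCULAR = "Circular"


@dataclass
class Plasmid:
    name: str                      # Text: short, single-line free text
    topology: Topology             # enumerated choice
    size_bp: Optional[int] = None  # Int (hypothetical field)


@dataclass
class Genotype:
    name: str
    plasmid: Plasmid               # pointer to an object of another class


pbi121 = Plasmid(name="pBI121", topology=Topology.CIRCULAR)
line = Genotype(name="Westar43", plasmid=pbi121)

# Following the pointer leads to the Plasmid object itself,
# much like clicking a hypertext link.
print(line.plasmid.name)
```

Note that the Genotype stores a reference to the Plasmid rather than a copy of its details, so the plasmid information exists in only one place.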

Calling external programs

One unique capability of ACeDB is the ability to call external programs. If a class contains the tag 'Pick_me_to_call' followed by one or more 'Text' items, a Unix command will be created by stringing together the 'Text' items.  For example, an object might contain a line like

Pick_me_to_call   eog   pBR322.gif

Double-clicking on Pick_me_to_call would tell ACeDB to issue the command 'eog pBR322.gif', which starts the eog image viewer, displaying the graphic image file pBR322.gif. The only problem with this approach is that you are embedding the name of the eog program into the database. This means that every entry using eog would have to be changed if you wanted to switch to a new graphics program. To get around this, the ace4db script sets the environment variable

ACE_FILE_LAUNCHER=chooseviewer

The BIRCH chooseviewer script selects a program to launch, depending on the file extension. If the file extension is not known to chooseviewer, the file will be opened using a text editor. So for most kinds of file, simply use $ACE_FILE_LAUNCHER as the program for viewing files. 

File type: Text
Model:  Pick_me_to_call Text Text
Object: Pick_me_to_call $ACE_FILE_LAUNCHER $ACEDB/externalFiles/exp75.txt

File type: Graphic image
Model:  Pick_me_to_call Text Text
Object: Pick_me_to_call $ACE_FILE_LAUNCHER $ACEDB/externalFiles/pBR322.gif

File type: PDF
Model:  Pick_me_to_call Text Text
Object: Pick_me_to_call $ACE_FILE_LAUNCHER $ACEDB/externalFiles/plasmidprotocol.pdf

File type: HTML (external)
Model:  Pick_me_to_call Text Text
Object: Pick_me_to_call $ACE_FILE_LAUNCHER http://www.ncbi.nlm.nih.gov/projects/genome/guide/cow

Note that all of the examples above specify a complete path to the file to be read. The environment variable $ACEDB points to the main directory in which your ACeDB database is stored. These examples assume you have a subdirectory called 'externalFiles' containing files that belong to the database, but are not part of the binary database itself. In all cases, when one of these files is read, it is copied to a temporary file in the directory from which you launched ACeDB. If you want to save a copy of the file, you need to use 'Save As' to give it a new name, because the temporary file will be automatically deleted when you quit the file viewer.
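
The idea of choosing a viewer from the file extension can be sketched in a few lines of Python. This is not the actual chooseviewer script. The table of programs is hypothetical (only eog is mentioned on this page), the fallback editor gedit is an assumed name, and the special handling of web addresses is an assumption about how a launcher might treat the HTML example.

```python
import os

# Hypothetical extension-to-program table; the real chooseviewer
# script may use different programs and rules.
VIEWERS = {
    ".gif": "eog",
    ".png": "eog",
    ".pdf": "evince",
    ".html": "firefox",
}
TEXT_EDITOR = "gedit"  # assumed fallback for unknown extensions
BROWSER = "firefox"    # assumed program for web addresses


def choose_viewer(target):
    """Pick a program name for a file path or web address."""
    if target.startswith(("http://", "https://")):
        return BROWSER
    extension = os.path.splitext(target)[1].lower()
    # Unknown extensions fall back to the text editor.
    return VIEWERS.get(extension, TEXT_EDITOR)


def build_command(target):
    """String the program and its argument together, as a list."""
    return [choose_viewer(target), target]


print(build_command("externalFiles/pBR322.gif"))
print(build_command("externalFiles/exp75.txt"))
```

Because the choice of program lives in one table, switching to a different image viewer means changing one entry rather than every object in the database.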

Guidelines for schema design


1. The database is a model of a biological or experimental system. Make it as close to the real system as possible.

2. Keep each class simple. The fewer fields, the better.

3. Avoid having numerous links to a single class.

4. Do not duplicate the same piece of information in more than one object.

5. Wherever practical, avoid free text. Use links or enumerated choices.

6. In the schema, every field must be one of the eight data types shown in the Fundamental Data Types table.

Think of a database the same way you think of your home. You put cooking items in kitchen drawers, not in the bathroom cabinet. Towels would go into the hall closet, not in the garage. Apply this same logic to your database, and things will fall into place.

Assignment: Create your own database schema

(Note: If you are registered for PLNT7690, just substitute "PLNT7690" for "PLNT4610" in the steps listed below.)

The ultimate goal of this project is to create a database using information of your choosing.  Examples of topics might include:

The best topic for a database is some topic in biology that you know a lot about. It can be in any field of biology that you wish.

First, spend some time just thinking about how the data related to your topic is structured. What are the most important things that you want to represent as objects? What are the relationships between objects in your data? After you have a pretty good understanding of your data, then decide how you will represent those things in your schema.

What is NOT acceptable for a database topic:

  • A trivial modification of a database used in class, e.g. pathace or acedemo.
  • A database encoding taxonomic ranks (e.g. Domain-Kingdom-Phylum etc.), unless it
    demonstrates a creative, original representation of taxonomic relationships.

However, you may re-use some of the pre-existing classes. In particular, the File class could probably be of use for almost any database.



1. Create the following directories within your public_html/PLNT4610 directory:

public_html/PLNT4610/as3/
public_html/PLNT4610/as3/schema

Make sure these directories are world-readable and world-searchable.
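
As an illustration, here is one possible way to create the directories and set their permissions from Python. It is only a sketch. The function name make_public_dirs is made up, and mode 755 (owner may write; everyone may read and search) is one common choice that satisfies "world-readable and world-searchable". The demonstration runs in a temporary directory so that it does not touch your real account.

```python
import stat
import tempfile
from pathlib import Path


def make_public_dirs(home):
    """Create as3/ and as3/schema under home/public_html/PLNT4610."""
    as3 = Path(home) / "public_html" / "PLNT4610" / "as3"
    created = [as3, as3 / "schema"]
    for directory in created:
        directory.mkdir(parents=True, exist_ok=True)
        # 0o755: read and search permission for everyone.
        directory.chmod(0o755)
    return created


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    dirs = make_public_dirs(tmp)
    modes = [stat.S_IMODE(d.stat().st_mode) for d in dirs]
    print([oct(m) for m in modes])
```

To use it on a real account, you would call make_public_dirs(Path.home()). Only the two as3 directories have their permissions set here. The parent directories public_html and PLNT4610 are assumed to exist already with suitable permissions.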


2.  Save the file schema_template.odg to your public_html/PLNT4610/as3/schema directory.  As a safety precaution, rename this file schema.odg. That way, you can't accidentally download another copy of the file and overwrite all your work.

3.  You can edit this file using Office -->  LibreOffice Draw.

4. Remove the comments (blue, green and pink text and arrows).

5. Create your own data objects, using existing data objects as templates. Depending on the organization of the database, it may be useful to change the page orientation from 'portrait' to 'landscape'.

Note: Because the next part of the project will be to implement the database using real data, plan your database so that you do NOT need to include any information that is proprietary or might be considered intellectual property.

6. When you are finished with your schema, export the schema to a graphic file in .png format (e.g. schema.png) as follows: In LibreOffice Draw, choose Edit --> Select All, then File --> Export. Make sure the Selection box is checked. Set File Format to png. Click on Save. You should be able to view the graphic by clicking on it in the file manager.

7. Create a web page named public_html/PLNT4610/as3/as3.html. This will be the web page for the entire database assignment. On this page, make a link to public_html/PLNT4610/as3/schema/schema.html.

8. Create a web page named public_html/PLNT4610/as3/schema/schema.html. This will be the page describing your schema. It should contain the image saved in step 6 (eg. schema.png), along with a brief description of what the overall schema is intended to represent.  (An example of what the brief description might look like is shown at the top of this page.) This may be all you need to put onto the web page. However, you may also wish to comment on why you did things a certain way if the reason is not obvious, or which data classes were difficult to represent, and how you decided on your choice.

At the end, also make a link to schema.odg so that I can take a look at the original file.

TEST your Assignment 3 page to make sure that you can see it from the web server, i.e. http://home.cc.umanitoba.ca......

9. At this point, send me an email indicating that your Assignment 3 page is ready. I will go to your Assignment 3 page and look at the files online. Remember, all files must be world-readable, and all directories world-readable and world-executable.
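
Before sending the email, it can be worth checking the permissions systematically. The following Python sketch lists anything under a directory that lacks the required permissions. The function name permission_problems is made up for illustration, and the demonstration uses a temporary directory rather than a real public_html.

```python
import os
import stat
import tempfile
from pathlib import Path


def permission_problems(root):
    """Return paths under root that are not open to everyone."""
    problems = []
    for path in [Path(root), *Path(root).rglob("*")]:
        mode = path.stat().st_mode
        if path.is_dir():
            # Directories need world read and execute (search).
            needed = stat.S_IROTH | stat.S_IXOTH
        else:
            # Files need world read.
            needed = stat.S_IROTH
        if mode & needed != needed:
            problems.append(path)
    return problems


# Demonstration: a page that is private at first, then made readable.
with tempfile.TemporaryDirectory() as tmp:
    os.chmod(tmp, 0o755)
    page = Path(tmp) / "as3.html"
    page.write_text("<html></html>")
    page.chmod(0o600)
    before = [p.name for p in permission_problems(tmp)]
    page.chmod(0o644)
    after = permission_problems(tmp)
    print(before, after)
```

On a real account, the function would be pointed at the as3 directory. An empty list means every file and directory beneath it passes this check, although viewing the page in a browser remains the definitive test.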

10. I will send you feedback on how you might improve your schema. When you get the feedback, you can go on to the next step of implementing the database using ACeDB.