PLNT4610/PLNT7690 Bioinformatics

DATABASE PROJECT - SCHEMA


A sample database schema

A database schema is a model of data. It defines


Fundamental Data Types

A database is built from a small number of fundamental data types. The data types available in ACeDB are listed in the table below.

Type: Tag
Description: The name of a data field, as shown in the display. Tags cannot contain blanks.
Examples: Common_name

Type: Text
Description: Short free text, often one word and never more than a single line.
Examples: oilseed rape

Type: LongText
Description: Intended for long comments or descriptions.
Examples: The library was sent by surface mail and therefore spent several days at room temperature. Some decrease in titre may have occurred.

Type: Int
Description: A positive or negative integer.
Examples: 17, -123, 0

Type: Float
Description: A real number in floating-point notation.
Examples: 0.58, 1.2e+09

Type: UNIQUE
Description: An enumerated choice: the data field can be set to only one of a small list of choices.
Examples: Linear, Circular

Type: Date
Description: Date and time.
Examples: 04-12-10, 04-12, 04, 04-12-10_13:23:00, 04-12_18:00, NOW, TODAY

Type: pointer to a Class
Description: A pointer to an object of another class. A class name cannot contain blanks. Pointers work like hypertext links.
Examples: pBI121
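
To make the rules in the table concrete, here is a minimal Python sketch that checks candidate values against a few of the types. It is not part of ACeDB; the function names are hypothetical, and the checks only encode the rules stated in the table (no blanks in tags, single-line Text, and so on).

```python
import re

# Hypothetical checkers; the rules come from the table above,
# the function names are illustrative only.

def is_valid_tag(name: str) -> bool:
    """A tag names a data field and must not contain blanks."""
    return bool(name) and not any(ch.isspace() for ch in name)

def is_valid_text(value: str) -> bool:
    """Text is short free text that never spans more than one line."""
    return "\n" not in value

def is_valid_int(value: str) -> bool:
    """Int is a positive or negative integer, e.g. 17, -123, 0."""
    return re.fullmatch(r"[+-]?\d+", value) is not None

def is_valid_float(value: str) -> bool:
    """Float is a real number in floating-point notation, e.g. 0.58, 1.2e+09."""
    pattern = r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?"
    return re.fullmatch(pattern, value) is not None

def is_valid_choice(value: str, choices: set) -> bool:
    """UNIQUE: the field may hold only one of a small list of choices."""
    return value in choices
```

For example, is_valid_tag("Common_name") is true, while is_valid_tag("Common name") is false because of the blank.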

Calling external programs

One distinctive capability of ACeDB is that it can call external programs. If a class contains the tag 'Pick_me_to_call' followed by one or more 'Text' items, a Unix command is built by stringing together the 'Text' items. For example, an object might contain a line like

Pick_me_to_call   eog   pBR322.gif

Double-clicking on Pick_me_to_call tells ACeDB to issue the command 'eog pBR322.gif', which starts the eog image viewer and displays the graphic image file pBR322.gif. The drawback of this approach is that the name of the eog program is embedded in the database, so every entry that uses eog would have to be changed if you wanted to switch to a different graphics program. To get around this, the ace4db script sets the environment variable

ACE_FILE_LAUNCHER=chooseviewer

The BIRCH chooseviewer script selects a program to launch, depending on the file extension. If the file extension is not known to chooseviewer, the file will be opened using a text editor. So for most kinds of file, simply use $ACE_FILE_LAUNCHER as the program for viewing files. 
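
The idea of choosing a viewer by file extension can be sketched in a few lines of Python. This is not the BIRCH chooseviewer script, whose actual table of programs is not given here; apart from eog, the program names below are placeholders, and the sketch does not attempt to handle URLs.

```python
import os

# Hypothetical extension-to-program table. Only "eog" is taken from the
# text; the other program names are placeholders for illustration.
VIEWERS = {
    ".gif": "eog",
    ".png": "eog",
    ".pdf": "pdf_viewer",
    ".html": "web_browser",
}

# Fallback for unknown extensions: open the file in a text editor.
TEXT_EDITOR = "text_editor"

def choose_viewer(filename: str) -> str:
    """Return the program to launch, based on the file extension."""
    extension = os.path.splitext(filename)[1].lower()
    return VIEWERS.get(extension, TEXT_EDITOR)
```

The point of the design is that the database only ever names the launcher; which program actually opens a .gif or .pdf file can change without touching any database entry.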

File type: Text
Model: Pick_me_to_call Text Text
Object: Pick_me_to_call $ACE_FILE_LAUNCHER $ACEDB/externalFiles/exp75.txt

File type: Graphic image
Model: Pick_me_to_call Text Text
Object: Pick_me_to_call $ACE_FILE_LAUNCHER $ACEDB/externalFiles/pBR322.gif

File type: PDF
Model: Pick_me_to_call Text Text
Object: Pick_me_to_call $ACE_FILE_LAUNCHER $ACEDB/externalFiles/plasmidprotocol.pdf

File type: HTML (external)
Model: Pick_me_to_call Text Text
Object: Pick_me_to_call $ACE_FILE_LAUNCHER http://www.ncbi.nlm.nih.gov/projects/genome/guide/cow

Note that each of the examples above specifies a complete path to the file to be read (or, for the external HTML page, a complete URL). The environment variable $ACEDB points to the main directory in which your ACeDB database is stored. These examples assume you have a subdirectory called 'externalFiles' containing files that belong to the database but are not part of the binary database itself. In all cases, when one of these files is read, it is copied to a temporary file in the directory from which you launched ACeDB. If you want to save a copy of the file, you need to use 'Save As' to give it a new name, because the temporary file is automatically deleted when you quit the file viewer.
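
As an illustration of how the Text items become a command, the following Python sketch joins the items and expands any $VARIABLE references. How ACeDB itself performs the expansion is not described here, so treat this as an assumption; the function name and the example directory /home/user/acedb are hypothetical.

```python
import os
from string import Template

def build_command(text_items, env=None):
    """Join the Text items that follow Pick_me_to_call into one command,
    expanding $VARIABLE references from env (defaults to os.environ).
    Unknown variables are left as written."""
    env = os.environ if env is None else env
    expanded = [Template(item).safe_substitute(env) for item in text_items]
    return " ".join(expanded)

# Example with a made-up environment; the $ACEDB value is hypothetical.
example_env = {
    "ACE_FILE_LAUNCHER": "chooseviewer",
    "ACEDB": "/home/user/acedb",
}
command = build_command(
    ["$ACE_FILE_LAUNCHER", "$ACEDB/externalFiles/pBR322.gif"],
    example_env,
)
print(command)
```

Because the object stores only $ACE_FILE_LAUNCHER and $ACEDB, the same database entry keeps working if the database directory moves or the launcher is replaced.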

Guidelines for schema design


1. The database is a model of a biological or experimental system. Make it as close to the real system as possible.

2. Keep each class simple. The fewer fields, the better.

3. Avoid having numerous links to a single class.

4. Do not duplicate the same piece of information in more than one object.

5. Wherever practical, avoid free text. Use links or enumerated choices.

6. In the schema, each field must be exactly one of the eight data types shown in the Fundamental Data Types table.

Think of a database the same way you think of your home. You put cooking items in kitchen drawers, not in the bathroom cabinet. Towels would go into the hall closet, not in the garage. Apply this same logic to your database, and things will fall into place.
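
Guideline 6 can be checked mechanically. The Python sketch below assumes a schema written down as a dictionary of classes and fields; this representation, the class names and the field names are all hypothetical and are not ACeDB model syntax. "Pointer" stands in for "pointer to a Class".

```python
# The eight fundamental types from the table; "Pointer" stands in for
# "pointer to a Class".
FUNDAMENTAL_TYPES = {
    "Tag", "Text", "LongText", "Int", "Float", "UNIQUE", "Date", "Pointer",
}

# Hypothetical schema: class name -> {field tag: data type}.
example_schema = {
    "Plasmid": {
        "Common_name": "Text",
        "Length": "Int",
        "Topology": "UNIQUE",   # enumerated choice instead of free text
        "Host": "Pointer",      # link to a Strain object, not a copied name
        "Remark": "String",     # deliberately invalid: not a fundamental type
    },
    "Strain": {
        "Common_name": "Text",
    },
}

def invalid_fields(schema):
    """Return (class, field, type) for every field whose type is not one
    of the eight fundamental data types."""
    problems = []
    for class_name, fields in schema.items():
        for field_name, field_type in fields.items():
            if field_type not in FUNDAMENTAL_TYPES:
                problems.append((class_name, field_name, field_type))
    return problems

print(invalid_fields(example_schema))
```

The Host field also illustrates guidelines 4 and 5: the plasmid points to a Strain object rather than repeating the strain's name as free text.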

Assignment: Create your own database schema

(Note: If you are registered for PLNT4610, just substitute "PLNT4610" for "PLNT7690" in the steps listed below.)

The ultimate goal of this project is to create a database using information of your choosing.  Examples of topics might include:


1. Create the following directories within your public_html/PLNT7690 directory:

public_html/PLNT7690/as4/
public_html/PLNT7690/as4/schema

Make sure these directories are world-readable and world-searchable.


2.  Save the file schema_template.odg to your public_html/PLNT7690/as4/schema directory.  As a safety precaution, rename this file schema.odg. That way, you can't accidentally download another copy of the file and overwrite all your work.

3.  You can edit this file using Office -->  LibreOffice Draw.

4. Remove the comments (blue, green and pink text and arrows).

5. Create your own data objects, using existing data objects as templates. Depending on the organization of the database, it may be useful to change the page orientation from 'portrait' to 'landscape'.

Note: Because the next part of the project will be to implement the database using real data, plan your database so that you do NOT need to include any information that is proprietary or might be considered intellectual property.

6. When you are finished with your schema, export it to a graphic file in .png format (e.g. schema.png) as follows: File --> Export. Set File type to png. You should be able to view the graphic by clicking on it in the file manager.

7. Create a web page named public_html/PLNT7690/as4/as4.html. This will be the web page for the entire database assignment. On this page, make a link to public_html/PLNT7690/as4/schema/schema.html.

8. Create a web page named public_html/PLNT7690/as4/schema/schema.html. This will be the page describing your schema. It should contain the image saved in step 6 (e.g. schema.png), along with a brief description of what the overall schema is intended to represent. This may be all you need to put on the web page. However, you may also wish to explain why you did things a certain way if the reason is not obvious, or which data classes were difficult to represent and how you settled on your choice.

At the end, also make a link to schema.odg so that I can take a look at the original file.

9. You will 'hand in' the assignment simply by having all files ready by the due date. I will go to each directory and look at the files online. Remember, all files must be world-readable, and all directories world-readable and world-executable. Test all links before the due date!

10. I will send you feedback on how you might improve your schema. When you get the feedback, you can go on to the next step of implementing the database using ACeDB.
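
The directory setup in step 1 and the permission check in step 9 can be sketched in Python as follows. This is only one way to do it, assuming a Unix file system; the function names are hypothetical, and the same result can be had with mkdir and chmod at the command line.

```python
import os
import stat

def make_web_dirs(base, course="PLNT7690"):
    """Create <base>/<course>/as4/schema and make each directory
    world-readable and world-searchable (mode rwxr-xr-x)."""
    course_dir = os.path.join(base, course)
    as4_dir = os.path.join(course_dir, "as4")
    schema_dir = os.path.join(as4_dir, "schema")
    os.makedirs(schema_dir, exist_ok=True)
    for directory in (course_dir, as4_dir, schema_dir):
        os.chmod(directory, 0o755)
    return schema_dir

def is_world_accessible(path):
    """Files must be world-readable; directories must also be
    world-executable (searchable)."""
    mode = os.stat(path).st_mode
    needed = stat.S_IROTH
    if stat.S_ISDIR(mode):
        needed |= stat.S_IXOTH
    return mode & needed == needed
```

Here base would be your public_html directory. Calling is_world_accessible on each file and directory before the due date is a quick way to catch a page that would be unreadable online.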