BIOINFORMATICS WILL



enhance the impact of data

conservation of data

structuring of data into knowledge



change the way we approach problems



open new lines of inquiry



DATABASES ARE TO DATA AS CONSERVATION IS TO ENERGY



HOW WE LOSE DATA:

too many journals, too little library

"obscure" journals

papers not machine-readable

raw data not available

only abstracts are indexed

inconsistent nomenclature

HOW WE CONSERVE DATA

electronic libraries; pay per article

no electronic journal is obscure

electronic papers machine readable

raw data available

entire paper can be indexed

nomenclature standards; links clarify relationships

DATA VS. KNOWLEDGE



Knowledge is an informational model consisting of data items and the relationships between them.



[Figure: "knowledge" shown as a group of related data items]
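The definition above can be made concrete with a small sketch. It stores a few facts from the EC 4.3.1.5 enzyme entry reproduced in these slides as data items joined by named relationships. The names Relation, KNOWLEDGE and related, and the relationship labels such as "has_product", are hypothetical and chosen only for illustration.

```python
from collections import namedtuple

# One relationship between two data items (hypothetical structure).
Relation = namedtuple("Relation", "subject predicate object")

# Facts copied from the EC 4.3.1.5 entry; the predicate labels are invented.
KNOWLEDGE = [
    Relation("EC 4.3.1.5", "has_name", "Phenylalanine ammonia-lyase"),
    Relation("EC 4.3.1.5", "has_substrate", "L-Phenylalanine"),
    Relation("EC 4.3.1.5", "has_product", "trans-Cinnamate"),
    Relation("EC 4.3.1.5", "has_product", "NH3"),
    Relation("EC 4.3.1.5", "in_pathway", "MAP00360"),
]

def related(relations, subject, predicate):
    """Return every data item linked to `subject` by `predicate`."""
    return [r.object for r in relations
            if r.subject == subject and r.predicate == predicate]

products = related(KNOWLEDGE, "EC 4.3.1.5", "has_product")
print(products)  # ['trans-Cinnamate', 'NH3']
```

The isolated strings are data; the question "what does this enzyme produce?" can only be answered once the relationships are recorded as well.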







Bioinformatics is the art of encoding biological data into knowledge.



[Figure: data -> bioinformatics -> knowledge]







URL:



http://www.genome.ad.jp/htbin/bget_ligand?4.3.1.5
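As a rough illustration of how such a URL addresses a single database entry, the sketch below splits it into its parts with Python's standard urllib.parse module. Reading the script name "bget_ligand" as a retrieval command plus a database name is an assumption made here, not something the slide states.

```python
from urllib.parse import urlsplit

url = "http://www.genome.ad.jp/htbin/bget_ligand?4.3.1.5"

parts = urlsplit(url)
script = parts.path.rsplit("/", 1)[-1]   # last path component
entry_id = parts.query                    # everything after the '?'

# Assumption: the script name encodes "<command>_<database>".
command, _, database = script.partition("_")

print(parts.netloc, command, database, entry_id)
# www.genome.ad.jp bget ligand 4.3.1.5
```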



WEB PAGE:





ENTRY       EC 4.3.1.5
NAME        Phenylalanine ammonia-lyase
CLASS       Lyases
            Carbon-nitrogen lyases
            Ammonia-lyases
SYSNAME     L-Phenylalanine ammonia-lyase
REACTION    L-Phenylalanine = trans-Cinnamate + NH3
SUBSTRATE   L-Phenylalanine
            L-Tyrosine
PRODUCT     trans-Cinnamate
            NH3
            trans-4-Hydroxycinnamate
COMMENT     This enzyme may also act on L-tyrosine.
PATHWAY     PATH: MAP00360  Phenylalanine and tyrosine metabolism (path 2)
            PATH: MAP00370  Phenylalanine and tyrosine metabolism (path 3)
MOTIF       PS: PS00488  G-[STG]-[LIVM]-[STG]-[AC]-S-G-[DH]-L-x-P-L-[SA]-x(2)-
                         [SA]
DBLINKS     University of Geneva ENZYME DATA BANK: 4.3.1.5
            PIR: A24727 A29607 A44133 A56628 JQ1070 JQ2265 PQ0140
                 S01999 S04127 S04128 S04129 S04463 S06475 S17444
                 S18352 S20005 S21174 S22991 S25303 S25538 S29029
                 S48726 S52990 S52991 S52992 S56033
///
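A record like the one above is easy to turn into structured data. The following is a minimal sketch, assuming the original fixed-column layout in which a field label starts at the beginning of a line, continuation lines are indented, and "///" ends the entry. The function name parse_flat_entry is hypothetical, and the sample text is an abridged copy of the entry with that indentation.

```python
def parse_flat_entry(text):
    """Parse one flat-file entry into {field label: [value lines]}."""
    fields = {}
    current = None
    for line in text.splitlines():
        if line.startswith("///"):          # end-of-entry marker
            break
        if not line.strip():                # skip blank lines
            continue
        if not line[0].isspace():           # a label starts in column 0
            current, _, value = line.partition(" ")
            fields.setdefault(current, [])
        else:                               # indented: continues last field
            value = line
        if current is not None and value.strip():
            fields[current].append(value.strip())
    return fields

ENTRY = """\
ENTRY       EC 4.3.1.5
NAME        Phenylalanine ammonia-lyase
CLASS       Lyases
            Carbon-nitrogen lyases
            Ammonia-lyases
REACTION    L-Phenylalanine = trans-Cinnamate + NH3
SUBSTRATE   L-Phenylalanine
            L-Tyrosine
///
"""

record = parse_flat_entry(ENTRY)
print(record["CLASS"])
# ['Lyases', 'Carbon-nitrogen lyases', 'Ammonia-lyases']
```

Keeping every field as a list of lines avoids guessing which fields are single-valued; a caller can join or split the values further as needed.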



----------------------------------------------------------------------------

DBGET integrated database retrieval system, GenomeNet (Kyoto center)



FEATURE TABLE:



CDS join(M38619:160..256,M38620:11..307,

M38621:11..179,M38622:11..176,

M38623:11..250,11..103)

/product="green visual pigment"

/gene="G101"

/codon_start=1
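The CDS location above lists the exon spans that are spliced into the coding sequence. As a hedged sketch of that step, the code below parses a join(...) location into segments and concatenates them. It assumes 1-based, inclusive coordinates and that a span without an accession refers to the entry being annotated; the function names parse_join and splice and the toy sequences are hypothetical.

```python
import re

# Optional "ACCESSION:" prefix followed by "start..end".
SEGMENT = re.compile(r"(?:(\w+):)?(\d+)\.\.(\d+)")

def parse_join(location):
    """Return [(accession or None, start, end), ...] for a join(...) location."""
    compact = "".join(location.split())        # drop line breaks and spaces
    if not (compact.startswith("join(") and compact.endswith(")")):
        raise ValueError("expected a join(...) location")
    return [(acc or None, int(start), int(end))
            for acc, start, end in SEGMENT.findall(compact)]

def splice(segments, sequences, local):
    """Concatenate the 1-based, inclusive segments into one sequence."""
    parts = []
    for acc, start, end in segments:
        source = local if acc is None else sequences[acc]
        parts.append(source[start - 1:end])
    return "".join(parts)

LOCATION = """join(M38619:160..256,M38620:11..307,
                  M38621:11..179,M38622:11..176,
                  M38623:11..250,11..103)"""

segments = parse_join(LOCATION)
cds_length = sum(end - start + 1 for _, start, end in segments)
print(len(segments), cds_length)  # 6 1062

# Toy data only: two made-up sequences to show the splicing itself.
toy = splice(parse_join("join(X1:2..4,3..5)"), {"X1": "aATGcc"}, "ttGTAAgg")
print(toy)  # ATGGTA
```

The six spans add up to 1062 bases, a multiple of three, as expected for a complete coding sequence.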





RESULTANT SEQUENCE:



>ASYPIGG6:CDS1

atggccgcacacgagcctgtgttcgccgcccggcgccacaatgaagacac

cacaagggagtctgcatttgtctacacaaatgctaataatacaagag

atccttttgaaggaccaaactatcacattgcccctcgatgggtctacaac

gtatcatccttatggatgatctttgttgtcattgcatcagtcttcactaa

tggtttggtaattgtagcaacagcaaagttcaagaagctgcgacaccctc

taaactggattctggtaaacctggctatagccgatctcggggagacagtt

cttgccagcacaatcagtgtcatcaaccagatcttcggctacttcatcct

tggacacccaatgtgcgtttttgaggggtggacggtgtctgtctgtg

gtatcacagctctgtggtctctgactataatctcctgggagcgctgggtg

gttgtgtgcaagccatttggaaatgttaaattcgatggcaaatgggcagc

aggtggcatcatcttctcctgggtttgggccatcatctggtgcacccctc

caatctttggctggagcag

gtactggccccatggtctgaagacatcctgtggccctgatgtgttcagtg

gcagtgaggatccaggagtggcctcctacatgatcaccctaatgcttacc

tgctgtattcttcctctgtccatcattatcatttgctacatttttgtctg

gagtgccatccaccag

gtcgcccagcagcagaaagactcagagtccactcagaaggcagagaagga

agtgtccaggatggtggtagtgatgatccttgcctttattgtgtgctggg

gaccatatgcctcctttgccaccttctctgcagtgaacccaggttatgcc

tggcacccactggcagccgctatgcccgcttacttcgccaagagtgccac

catctacaatcccatcatttacgtcttcatgaaccgccag

ttccggagctgtatcatgcagctgtttggaaagaaggtggaggatgcatc

agaggtttccggctctaccacagaagtttctacagcctcgtaa



CPGN: FEATURE TABLE:



Arabidopsis thaliana

Lhcb1.Ara.tha:1 LHCP AB 165, cab2

Lhcb1.Ara.tha:2 Lhcp2, cab2

Lhcb1.Ara.tha:3 LHCP AB 140, CAB1A, cab1

Lhcb1.Ara.tha:4 Lhb1B1, CAB6P

Lhcb1.Ara.tha:5 Lhb1B2, CAB6P



Cucumis sativus (cucumber)

Lhcb1.Cuc.sat:1 LHCPB

Lhcb1.Cuc.sat:2 LHCPA



Zea mays (maize)

Lhcb1.Zea.may:1 LhCp

Lhcb1.Zea.may:2 Cab-1

Lhcb1.Zea.may:3 LhcabB

Lhcb1.Zea.may:4 cab-ml

Lhcb1.Zea.may:5 zmcab48
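Read as a mapping from a standardized designation (left) to the names used in the literature (right), the table above can be inverted to resolve a published name. The sketch below does this for the Arabidopsis rows only; that reading of the table, and the names CPGN_NAMES and build_synonym_index, are assumptions made for illustration.

```python
from collections import defaultdict

# Arabidopsis thaliana rows copied from the table above.
CPGN_NAMES = {
    "Lhcb1.Ara.tha:1": ["LHCP AB 165", "cab2"],
    "Lhcb1.Ara.tha:2": ["Lhcp2", "cab2"],
    "Lhcb1.Ara.tha:3": ["LHCP AB 140", "CAB1A", "cab1"],
    "Lhcb1.Ara.tha:4": ["Lhb1B1", "CAB6P"],
    "Lhcb1.Ara.tha:5": ["Lhb1B2", "CAB6P"],
}

def build_synonym_index(names):
    """Map each lower-cased synonym to the set of standardized names using it."""
    index = defaultdict(set)
    for standard, synonyms in names.items():
        for synonym in synonyms:
            index[synonym.lower()].add(standard)
    return index

index = build_synonym_index(CPGN_NAMES)
print(sorted(index["cab2"]))
# ['Lhcb1.Ara.tha:1', 'Lhcb1.Ara.tha:2']
```

The lookup shows why a nomenclature standard matters: the published name "cab2" alone does not identify a single gene.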





SolGenes - Tomato introgression line






In_introgression_line  Z4-,2
Description            Lp/Le Introgression Set
Position               Map Tomato-4   Ends   Left 32.9
                                             Right 40
Drawing                Colour BLUE

Label              Locus    Enzyme   Buffer
Contains           Pgm-2    None     L-TRIS 7.7
                   TG182    EcoRV
                   TG146A   EcoRV
                   TG146C   EcoRV
                   TG483    DraI
                   TG146B   EcoRV
                   TG123    DraI
                   TG413B   EcoRV
Does_not_Contain   TG15     EcoRV
                   TG208    EcoRV



1. Search for ESTs from Brassica campestris:

























2. Search for ESTs from flowers:























3. AND the set.



comm -12 flower@est.nam campestris@est.nam > both.nam



4. Further refinement as above
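Step 3 uses comm -12, which prints only the lines common to two sorted files. The same AND operation can be sketched in Python as a set intersection, which does not require the inputs to be sorted. The function names and the EST names written to the temporary files are hypothetical stand-ins for the real search results.

```python
import os
import tempfile

def read_names(path):
    """Read one name per line into a set, ignoring blank lines."""
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}

def and_name_files(path_a, path_b):
    """Names present in both files, sorted (the analogue of `comm -12`)."""
    return sorted(read_names(path_a) & read_names(path_b))

# Made-up name lists standing in for the two search results.
with tempfile.TemporaryDirectory() as workdir:
    flower = os.path.join(workdir, "flower@est.nam")
    campestris = os.path.join(workdir, "campestris@est.nam")
    with open(flower, "w") as handle:
        handle.write("EST001\nEST004\nEST007\n")
    with open(campestris, "w") as handle:
        handle.write("EST004\nEST005\nEST007\n")
    both = and_name_files(flower, campestris)

print(both)  # ['EST004', 'EST007']
```

Unlike comm, this version strips surrounding whitespace before comparing, so names that differ only in trailing spaces are treated as equal. Further refinement (step 4) amounts to intersecting the result with another set.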