MEASURING GENOME SIZE AND COMPLEXITY USING C0T CURVES

Rate of reassociation

Protocol: DEMO:  kinetic_class_demo.obj

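The reassociation of DNA molecules in solution is described by the second-order rate relation:

    dC/dt = -kC^2

where
C ::= the [ssDNA] at time t.
k ::= the reassociation rate constant.

This equation can be transformed, by integrating from time 0 to time t, into the more convenient expression,

    C/C0 = 1 / (1 + k C0t)

where
C0 ::= the initial [ssDNA] at time 0.
C0t1/2 ::= the value of C0t at which annealing has proceeded to half completion (C/C0 = 0.5), so that C0t1/2 = 1/k.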


Redrawn from Russell, P.J. (1986) Genetics, Figure 7.24. Ideal time course for the renaturation of DNA as seen in a Cot (initial DNA concentration x time) plot. In the initial state the DNA is single-stranded, and in the final state it is all double-stranded. Note that 80 percent of the renaturation occurs over a 2 log Cot interval.
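
The ideal curve can be checked numerically. The sketch below assumes the standard second-order form C/C0 = 1 / (1 + k C0t); the function names and the value of k are hypothetical and chosen only for illustration. It computes C0t1/2 and the width, in log units of C0t, of the interval over which C/C0 falls from 0.9 to 0.1 (the central 80 percent of renaturation).

```python
import math

def fraction_single_stranded(c0t, k):
    """Return C/C0 for ideal second-order reassociation at a given C0t."""
    return 1.0 / (1.0 + k * c0t)

def c0t_at_fraction(fraction, k):
    """Invert the relation: the C0t at which C/C0 equals `fraction`."""
    return (1.0 / fraction - 1.0) / k

k = 1.0  # hypothetical rate constant, so that C0t1/2 = 1/k = 1

c0t_half = c0t_at_fraction(0.5, k)

# C0t values bracketing the central 80% of renaturation
# (C/C0 falling from 0.9 to 0.1).
c0t_start = c0t_at_fraction(0.9, k)
c0t_end = c0t_at_fraction(0.1, k)
log_span = math.log10(c0t_end / c0t_start)

print(f"C0t1/2 = {c0t_half:.2f}")
print(f"central 80% spans {log_span:.2f} log units of C0t")
```

The ratio of the two bracketing C0t values is 81 whatever value of k is chosen, so the span is a little under 2 log units, in line with the figure caption.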




Semi-log plots
Remember semi-log plots? They're used for lots of things, including pH.
http://www.tiem.utk.edu/~gross/bioed/webmodules/phbuffers.html


Complexity

The complexity of a sequence is defined as the length of the longest non-repetitive sequence that can be derived from that sequence.
 
sequence      complexity

AAAAAAAAAA    1
TTTTTTTTTT

ATATATATAT    2
TATATATATA

ATGATGATG     3
TACTACTAC

ATGCATGC      4
TACGTACG

ATGCCATGCC    5
TACGGTACGG
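
For perfect tandem repeats like the ones above, the complexity is just the length of the shortest repeating unit. The following is a toy sketch under that assumption (the function name is hypothetical, and only one strand is examined); it is not how complexity is measured for real genomes, which is done kinetically.

```python
def sequence_complexity(seq):
    """Length of the shortest unit that, repeated, reproduces `seq`."""
    n = len(seq)
    for unit_len in range(1, n + 1):
        unit = seq[:unit_len]
        # Repeat the unit enough times to cover seq, then truncate,
        # so that a trailing partial repeat is still accepted.
        repeated = (unit * (n // unit_len + 1))[:n]
        if repeated == seq:
            return unit_len
    return n

examples = ["AAAAAAAAAA", "ATATATATAT", "ATGATGATG", "ATGCATGC", "ATGCCATGCC"]
for s in examples:
    print(s, sequence_complexity(s))
```

Run on the top strands of the five examples, this reproduces the complexities 1 through 5 listed above.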

Demo: kinetic_class_demo.obj

The complexity (X) of a population of uniformly-sized DNA molecules can be measured as follows:

    X = K C0t1/2

where K is a constant that depends on the reassociation conditions. Rearranging the equation, we see that C0t1/2 = X/K. Therefore, C0t1/2 increases with the complexity of the DNA. We can use this relation to measure genome size, as shown in the figure below:
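
One way to use this proportionality is to calibrate K with a DNA of known complexity reassociated under the same conditions. The sketch below assumes X = K C0t1/2 and takes the E. coli complexity (4.2 x 10^3 kb) from Table 1 on this page; both C0t1/2 values are hypothetical numbers for illustration, not measurements, and the function name is likewise made up.

```python
def complexity_from_reference(c0t_half_sample, c0t_half_ref, complexity_ref):
    """Estimate X for a sample from X = K * C0t1/2, with K calibrated
    from a reference DNA measured under the same conditions."""
    k_const = complexity_ref / c0t_half_ref  # K = X_ref / C0t1/2_ref
    return k_const * c0t_half_sample

ECOLI_COMPLEXITY_KB = 4.2e3  # E. coli value from Table 1 (kb)
ecoli_c0t_half = 4.0         # hypothetical C0t1/2 for the reference
sample_c0t_half = 0.4        # hypothetical C0t1/2 for the unknown DNA

estimate = complexity_from_reference(
    sample_c0t_half, ecoli_c0t_half, ECOLI_COMPLEXITY_KB
)
print(f"estimated complexity: {estimate:.3g} kb")
```

Because only the ratio of the two C0t1/2 values enters the calculation, the units of C0t cancel as long as both DNAs are measured in the same way.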


Redrawn from Russell, P.J. (1986) Genetics, Fig. 7-25a. Cot plots showing the renaturation of DNAs from organisms with small genomes: the bacterium E. coli, the bacterial viruses T2 and Lambda, and the animal virus SV40.








Redrawn from Russell, P.J. (1986) Genetics, Fig. 7-25b. Kinetics of renaturation of DNA from calf thymus and E. coli as seen in a Cot plot. The E. coli DNA consists almost entirely of unique sequences. However, the shape of the Cot curve for the calf DNA is very different from that of E. coli and indicates that there are some sequences (toward the right of the curve) that renature much more slowly and some (toward the left of the curve) that renature much more quickly than the bacterial DNA sequences.



 

The C0t curves for calf thymus and E. coli DNA indicate that, while E. coli DNA anneals over a narrow C0t range with a single, relatively sharp inflection point, calf DNA contains three major kinetic classes: highly repetitive, which reanneals very early in the reaction; middle repetitive, which anneals over more than 3 log C0t; and single copy, which anneals at very high C0t values.

The term "single-copy" is a bit misleading, in that it refers to 1 to 10 copies per haploid genome. In fact, any distinction between single copy and middle repetitive forces you to draw an arbitrary line. These classes are useful concepts because they bring out something of the content of genomes, but the definitions of highly repetitive, middle repetitive and single-copy shouldn't be pushed too far.

Another point to mention is that although most protein coding genes are found in the single copy fraction, not all of the single copy fraction is protein coding genes. There appears to be a lot of non-coding single copy "junk" in many eukaryotic genomes.

Calculation of complexity of a single kinetic class

Let f represent the genome fraction of a kinetic class of DNA. For example, the genome fraction f for the highly repetitive fraction of calf DNA is about 0.15, or 15% of the total genome. We can calculate the C0t1/2 for a genome fraction that is part of a mixture (i.e. a complex genome) by the following equation:

    C0t1/2 (pure) = f x C0t1/2 (mixture)

This value can be plugged into the equation for complexity:

    X = K C0t1/2 (pure)

By combining data for the different kinetic classes, the complexity and size for a wide range of genomes have been measured, as shown in the table:
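
The two steps can be sketched as follows, assuming the usual relations C0t1/2 (pure) = f x C0t1/2 (mixture) and X = K C0t1/2 (pure). Only f = 0.15 comes from the text; the observed C0t1/2 and the constant K are hypothetical values, and the function names are invented for this example.

```python
def pure_c0t_half(fraction, c0t_half_in_mixture):
    """C0t1/2 the class would show if it were isolated from the genome."""
    return fraction * c0t_half_in_mixture

def class_complexity(fraction, c0t_half_in_mixture, k_const):
    """Complexity of one kinetic class: X = K * C0t1/2 (pure)."""
    return k_const * pure_c0t_half(fraction, c0t_half_in_mixture)

f_highly_repetitive = 0.15  # genome fraction quoted in the text
observed_c0t_half = 0.02    # hypothetical C0t1/2 seen in whole-genome DNA
K_KB_PER_C0T = 1050.0       # hypothetical K (kb per unit C0t1/2)

corrected = pure_c0t_half(f_highly_repetitive, observed_c0t_half)
complexity_kb = class_complexity(
    f_highly_repetitive, observed_c0t_half, K_KB_PER_C0T
)
print(f"C0t1/2 (pure) = {corrected:.4g}")
print(f"class complexity = {complexity_kb:.3g} kb")
```

The correction by f reflects the fact that, in a mixture, only a fraction f of the total DNA concentration belongs to the class in question, so the class reassociates more slowly than it would on its own at the same total concentration.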

Table 1 from Okamuro & Goldberg p8 in  The Biochemistry of Plants Vol. 15

TABLE 1. Plant and animal genome size and genome complexity.

Species                                  Genome size (kb)          Genome complexity (kb)
Arabidopsis thaliana                     7.0 x 10^4                5.5 x 10^4
Cotton (Gossypium hirsutum)              7.2 x 10^5                5.1 x 10^5
Flax (Linum usitatissimum)               1.5 x 10^5                6.8 x 10^4
Maize (Zea mays)                         5.7 x 10^6                2.3 x 10^6
Mung bean (Vigna radiata)                4.7 x 10^5                2.6 x 10^5
Parsley (Petroselinum sativum)           3.8 x 10^6                1.3 x 10^6
Pea (Pisum sativum)                      4.5 x 10^6                1.3 x 10^6
Pearl millet (Pennisetum americanum)     3.8 x 10^5                1.0 x 10^5
Soybean (Glycine max)                    1.3 x 10^6, 1.8 x 10^6    6.9 x 10^5, 7.3 x 10^5
Tobacco (Nicotiana tabacum)              1.5 x 10^6, 2.4 x 10^6    6.4 x 10^5, 1.0 x 10^6
Wheat (Triticum aestivum)                5.2 x 10^6                6.2 x 10^5
Man (Homo sapiens)                       3.0 x 10^6                1.0 x 10^6
Mouse (Mus musculus)                     1.6 x 10^6                9.1 x 10^5
Fruit fly (Drosophila melanogaster)      1.5 x 10^5                1.1 x 10^5
Nematode worm (Caenorhabditis elegans)   8.0 x 10^4                7.0 x 10^4
Water mold (Achlya bisexualis)           4.2 x 10^4                3.4 x 10^4
Escherichia coli                         4.2 x 10^3                4.2 x 10^3

Unless otherwise cited or referenced, all content on this page is licensed under the Creative Commons License Attribution Share-Alike 2.5 Canada
