Database bias - Experimental plan

Procedures:

Identification of biases in representative databases (e.g., GenBank, SwissProt)
Development of metrics for measurement of bias
Creation of datasets:

Real datasets (reflect real data)

Simulated datasets (allow biases to be controlled)

Testing of the effects of biases on real and simulated data
Improvement of existing methods
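
As an illustration of how a simulated dataset and a bias metric might fit together, here is a minimal sketch. The plan does not specify which biases or metrics will be used, so everything here is an assumption: the bias is base composition (GC content), the metric is the Kullback-Leibler divergence of observed base frequencies from a uniform composition, and the names simulate_sequences and composition_bias are hypothetical.

```python
import math
import random
from collections import Counter


def simulate_sequences(n_seqs, length, gc_content, seed=0):
    """Generate random DNA sequences with a chosen GC fraction.

    gc_content is the controlled bias (an assumption for illustration):
    0.5 gives a uniform base composition, other values skew it.
    """
    rng = random.Random(seed)
    at = (1.0 - gc_content) / 2.0
    gc = gc_content / 2.0
    weights = [at, gc, gc, at]  # order matches the alphabet "ACGT"
    return [
        "".join(rng.choices("ACGT", weights=weights, k=length))
        for _ in range(n_seqs)
    ]


def composition_bias(seqs):
    """KL divergence (bits) of observed base frequencies from uniform (0.25 each)."""
    counts = Counter("".join(seqs))
    total = sum(counts.values())
    return sum(
        (c / total) * math.log2((c / total) / 0.25)
        for c in counts.values()
        if c > 0
    )


# One dataset with no compositional bias, one with a known, strong bias.
unbiased = simulate_sequences(200, 500, gc_content=0.5)
biased = simulate_sequences(200, 500, gc_content=0.8)

print("unbiased:", composition_bias(unbiased))
print("biased:  ", composition_bias(biased))
```

Because the bias in the simulated data is set by hand, the metric can be checked against a known answer before it is applied to real data, where the true level of bias is unknown.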

What we hope to learn:

Which kinds of biases exist?
Which ones are important and which can we ignore?
How do we make better datasets?
How do we improve analytical methods?

FRISTENSKY LAB