Database similarity
Pairwise comparisons of a query sequence with every sequence in a database
Objective: to identify homologous genes/proteins, i.e. genes or proteins related through common ancestry
Efficiency (dynamic programming): O(mn)
m - length of query sequence
n - length of database (total length of all database sequences)
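The O(mn) cost comes from filling one table cell per (query position, database position) pair. A minimal sketch of this, assuming a Smith-Waterman style local alignment with a linear gap penalty (the slide does not name a specific algorithm; the function names and the match/mismatch/gap values are illustrative assumptions):

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of query vs. target.

    Time is O(len(query) * len(target)); only two rows are kept in memory.
    Scoring values are illustrative, not taken from the source.
    """
    prev = [0] * (len(target) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            if score > best:
                best = score
        prev = curr
    return best


def search_database(query, database):
    """Score the query against every database sequence, best hits first."""
    hits = [(name, smith_waterman_score(query, seq))
            for name, seq in database.items()]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Calling `search_database("ACGT", {"a": "TTACGTTT", "b": "GGGG"})` compares the query with each entry in turn, so the total work grows with the summed length of the database.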
Efficiency (dynamic programming w/lookup table): O(mn/k)
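The slide does not say how the lookup table is built. One common approach in heuristic search tools is to index every short word in the database once, then look up the query's words to decide which sequences are worth aligning at all. The sketch below assumes that reading; `build_word_index`, `candidate_sequences`, and the `word_len` parameter are hypothetical names introduced for illustration.

```python
from collections import defaultdict


def build_word_index(database, word_len=3):
    """Map each word of length word_len to the (name, position) pairs where it occurs."""
    index = defaultdict(list)
    for name, seq in database.items():
        for pos in range(len(seq) - word_len + 1):
            index[seq[pos:pos + word_len]].append((name, pos))
    return index


def candidate_sequences(query, index, word_len=3):
    """Count, per database sequence, how many query-word occurrences it contains."""
    counts = defaultdict(int)
    for pos in range(len(query) - word_len + 1):
        word = query[pos:pos + word_len]
        for name, _ in index.get(word, ()):
            counts[name] += 1
    return dict(counts)
```

Only sequences that share at least one word with the query appear in the result, so the expensive alignment step can be restricted to those candidates.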
Evaluating the statistical significance of hits
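The slide does not specify a significance model. One widely used choice for local alignment scores is the Karlin-Altschul expectation value, E = K * m * n * exp(-lambda * S), which estimates how many hits with score at least S would be expected by chance. A minimal sketch, assuming that model; `K` and `lam` depend on the scoring system and must be supplied by the caller:

```python
import math


def expected_hits(score, m, n, K, lam):
    """Karlin-Altschul E-value: chance hits expected with score >= `score`.

    m   - length of query sequence
    n   - length of database
    K, lam - statistical parameters of the scoring system (caller-supplied)
    """
    return K * m * n * math.exp(-lam * score)
```

A smaller E-value means the hit is less likely to be a chance match; for fixed parameters it falls exponentially as the score rises and grows linearly with database size.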
Dynamic programming methods have been sped up using
- threading on multi-CPU systems
- distributed computing
- supercomputers with large numbers of processors specialized for similarity comparison
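All three approaches rely on the fact that each query-versus-database-sequence comparison is independent, so the comparisons can be handed to separate workers. A minimal sketch of that idea using a thread pool; `parallel_search` is a hypothetical name, and `shared_word_count` is a deliberately simple stand-in for a real dynamic programming scorer:

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def shared_word_count(query, target, word_len=3):
    """Stand-in scorer (an assumption): number of distinct words the two share."""
    def words(seq):
        return {seq[i:i + word_len] for i in range(len(seq) - word_len + 1)}
    return len(words(query) & words(target))


def parallel_search(query, database, score_fn=shared_word_count, workers=4):
    """Score the query against every database sequence using a pool of workers."""
    names = list(database)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each comparison is independent, so map() can distribute them freely.
        scores = list(pool.map(partial(score_fn, query),
                               (database[name] for name in names)))
    return sorted(zip(names, scores), key=lambda hit: hit[1], reverse=True)
```

In CPython, threads do not speed up CPU-bound pure-Python code because of the global interpreter lock; swapping in `concurrent.futures.ProcessPoolExecutor` (same `map` interface) or a compiled scorer would be needed for a real gain on a multi-CPU system.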
http://home.cc.umanitoba.ca/~frist/Seminars/iims02/iims02.html