What Kinds Of Efficient Algorithms Can Be Used To Correct Read Errors
1
1
Entering edit mode
11.0 years ago
Snowball ▴ 10

This error correction problem caught my eye. Is there an O(n) algorithm for this, where n is the number of reads? There's the relatively straightforward O(n2) method of checking all the pairs, but I'm wondering if there's something better out there.

sequencing error read • 1.8k views
0
Entering edit mode

Forgive my ignorance, but I am not aware of the "O" term in the algorithm you mention, can you give us some more information on this method?

5
Entering edit mode

big O notation is just a way to describe how an algorithm performs given an input. You can read more about it here: http://en.wikipedia.org/wiki/Big_O_notation

For example, looking for an element in a dictionary/hash structure would be O(1) because you can directly access it in one operation. Whereas a primitive search algorithm where you iterate through an entire list of elements to find a specific element would be O(n) where n is the number of members in the list. A binary search tree where you effectively reduce your search space by 1/2 every node would give you O(logN) processes.

It's basically a way to describe the performance of an algorithm

1
Entering edit mode

+1 and thanks Damian! I come from a more strictly biological background so I need a little help with things sometimes.

5
Entering edit mode
11.0 years ago
jts ▴ 240

k-mer based error correction is much faster than overlap based methods. The complexity is effectively O(N) where N is the total number of bases. See the quake paper for a representative approach. Many other programs use a similar approach.