Estimating Assembly Error Rate For Given Sequencing Depth
10.5 years ago
Leszek 4.2k

According to Illumina, the error rate (calling the wrong base in a single read) for the GAIIx is ~1%.
Could you help me estimate what the error rate (probability of calling the wrong base) would be in a genome assembly at, let's say, 100X coverage? Is it simply 1%/100?

illumina • 4.0k views

Wouldn't it be 1%^100? Of course, the error rate actually changes as a function of base position along the read, and then there are the Phred scores to think of, so I suspect the proper equation would be quite messy.

Edit: Err, 1%^100 would be the naive probability that all of the reads covering a base contain an error. Of course, you don't actually need all of them to contain an error, and they wouldn't all contain the same error anyway. Mea culpa!

10.5 years ago

You can't evaluate assemblies this way because the errors in reads can cause mis-assemblies where the effect cannot be described with classical probabilities.

The theoretical probability of most reads ending up with the same error at the exact same position by sheer chance is so small that it is not worth accounting for.

This is not to say that this event does not happen; it is just that when it does, it won't be due to random chance but to a systematic problem, in which case probabilistic estimation does not help.
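
To put a rough number on the random-chance scenario, here is a minimal sketch in Python. It rests on assumptions that are not from Illumina or from any assembler: errors are independent between reads, the per-read error rate is the same at every position (real error rates vary along the read and with Phred score), a wrong call is equally likely to be any of the 3 other bases, and the consensus is a simple strict-majority vote. The function name `majority_error_probability` is made up for this illustration.

```python
from math import comb


def majority_error_probability(depth: int, read_error: float) -> float:
    """Probability that a strict majority of the reads covering one
    position show the same wrong base.

    Assumes independent errors, a constant per-read error rate, and
    that each error is equally likely to yield any of the 3 wrong bases.
    """
    p = read_error / 3.0        # chance a read shows one specific wrong base
    threshold = depth // 2 + 1  # smallest strict majority
    tail = sum(
        comb(depth, k) * p**k * (1.0 - p) ** (depth - k)
        for k in range(threshold, depth + 1)
    )
    # At most one base can hold a strict majority, so the three
    # wrong-base events are mutually exclusive and their probabilities add.
    return 3.0 * tail


if __name__ == "__main__":
    for depth in (5, 10, 30, 100):
        print(depth, majority_error_probability(depth, 0.01))
```

Under these simplifications the value at 100X with a 1% read error is vanishingly small (well below 1e-90), which is neither 1%/100 nor 1%^100. It says nothing about systematic errors or mis-assemblies, which are the cases that matter in practice.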


That is right, there is only a weak relationship between minor sequencing errors and errors in the assembly; even with hundreds of X coverage, a "simple" problem such as repeated sequence can significantly affect the quality of the resulting assembly. Thus, I would also guess that the minor sequencing error (100X coverage and 1% error) is negligible.
