Question: Does Illumina read length affect PacBio error correction?
2.6 years ago
JstRoRR60 wrote:


We have a set of eukaryotic insect clones that were sequenced with Illumina 250 bp and 100 bp paired-end (PE) reads. Now we want to further sequence some of the clones using PacBio at 15-20X coverage. My question is: which clone should I choose for PacBio sequencing, the one sequenced with 100 bp PE or the one sequenced with 250 bp PE? One has higher coverage, which should help error correction, and the other has longer reads.

Any suggestions?

written 2.6 years ago by JstRoRR60
2.6 years ago
WouterDeCoster39k wrote:

I would think that the 100PE and 250PE reads have comparable mappability, mapping quality, and chance of finding a unique alignment in your contig, and as such I would go for higher coverage. Essentially, your Illumina reads will not help to bridge large repeats but will help you correct single-nucleotide errors, and in that respect read length might be of lesser importance. I assume the difference in coverage is big enough for you to wonder about this. If the 100PE has 41M reads and the 250PE 40M, then you should go for the 250PE :p (hypothetical numbers)

written 2.6 years ago by WouterDeCoster39k

Hi WouterDeCoster, thanks for your comment. I agree that good coverage is absolutely essential, but then I came across this PLOS paper, where the authors propose a term, SCD (short read covered depth), that describes the region of a long read covered by short reads, and it does have an effect on the overall accuracy (error correction) of long reads. So if I have longer 250 PE reads, I will be covering more of the long reads (though one can argue this is also possible with high-coverage 100 PE reads). Additionally, I have a minimum of 20X PacBio coverage (it should go up to 30X).

written 2.6 years ago by JstRoRR60
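
To make the "covering more of the long read" argument concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not the SCD definition from the paper: the function name `covered_fraction` and the toy alignment coordinates are hypothetical, and alignments are assumed to be given as half-open (start, end) intervals on the long read.

```python
def covered_fraction(long_read_length, alignments):
    """Fraction of a long read spanned by at least one short-read alignment.

    alignments: iterable of (start, end) half-open intervals on the long read.
    This is an illustrative measure, not the paper's SCD formula.
    """
    covered = 0
    cur_start, cur_end = None, None
    for start, end in sorted(alignments):
        # Clip each alignment to the boundaries of the long read.
        start, end = max(start, 0), min(end, long_read_length)
        if start >= end:
            continue
        if cur_end is None or start > cur_end:
            # Gap before this alignment: close the previous merged block.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / long_read_length


# Toy example: the same three alignment start positions on a 1,000 bp long read,
# once with 250 bp reads and once with 100 bp reads (hypothetical coordinates).
starts = [0, 200, 700]
frac_250 = covered_fraction(1000, [(s, s + 250) for s in starts])
frac_100 = covered_fraction(1000, [(s, s + 100) for s in starts])
print(frac_250, frac_100)
```

With the same number of alignments, the longer reads span more of the long read; more 100 bp alignments would close that gap, which is the coverage argument again.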

Indeed, it will depend on the coverage. If you have 25M * 250PE then you 'cover' more than with 35M * 100PE...

written 2.6 years ago by WouterDeCoster39k
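
The arithmetic behind that last comparison can be sketched in a few lines of Python. This is a rough sketch under stated assumptions: "M" is taken to mean million read pairs (two reads per pair), and the genome size is a hypothetical placeholder, since the thread does not give one. If "M" counts individual reads instead, the totals halve but the ranking stays the same.

```python
def total_bases(n_pairs, read_length):
    """Total sequenced bases for a paired-end run (two reads per pair)."""
    return 2 * n_pairs * read_length


def fold_coverage(n_pairs, read_length, genome_size):
    """Average depth of coverage over a genome of the given size."""
    return total_bases(n_pairs, read_length) / genome_size


# Numbers from the comment above; both are hypothetical run sizes.
runs = {"250PE": (25_000_000, 250), "100PE": (35_000_000, 100)}

# Hypothetical genome size (assumption, not stated in the thread).
GENOME_SIZE = 500_000_000

for name, (pairs, length) in runs.items():
    bases = total_bases(pairs, length)
    depth = fold_coverage(pairs, length, GENOME_SIZE)
    print(f"{name}: {bases / 1e9:.1f} Gbp, about {depth:.0f}X")
```

Under these assumptions the 250PE run yields 12.5 Gbp against 7.0 Gbp for the 100PE run, so the run with fewer reads still sequences more bases.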
