Choosing the right contig
9.6 years ago
ptsourkas ▴ 10

I am trying to do de novo assembly of phage genomes. I have the reads from the sequencer, but the collaborator who did the sequencing could not pin down the distance between paired-end reads; I was told only that it ranges between 200 and 1000 bp. So I vary the expected distance when I assemble the reads, using values of 300, 400, 500, 600, and 700 bp. The problem is that in some instances the contigs I get vary wildly with the expected distance. I know that my contigs should be in the 40-45 kbp range, so I use that as a guide. But in some instances I get 40-45 kbp contigs that are totally dissimilar depending on the expected distance, and I can't decide which one to choose.

For one phage, I get a 45 kbp contig with expected distance 300 and a different 45 kbp contig with expected distances 400, 500, 600, and 700 (the same contig for all four values).

For another phage, I get one 40 kbp contig with expected distances 300, 400, and 700, and a totally different 40 kbp contig with 500 and 600.

For yet another phage, I get one contig with expected distances 300, 400, 600, and 700, and a totally different but no less plausible contig with expected distance 500.

So the question is, how do I choose the right contig?

Assembly • 1.9k views

Another option worth trying is to ignore the pairing, treat the data as single-end reads, and see what you get that way, without biasing the result one way or another. Those contigs may match one of the existing ones, which would provide some support for it. Or they could again be completely different.
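To check whether two contigs "match", a quick k-mer comparison may help before reaching for a full aligner. The sketch below is only an illustration, not part of any assembler: the function names and the choice of k = 21 are assumptions made here. It uses canonical k-mers (the lexicographically smaller of a k-mer and its reverse complement), so a contig reported on the opposite strand still scores as identical, and a contig that starts at a different position of a circular genome still scores close to 1.

```python
# Minimal sketch: compare two contigs by shared canonical k-mers.
# Function names and k=21 are illustrative assumptions, not a standard tool.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmers(seq, k=21):
    """Return the set of canonical k-mers in seq.

    A canonical k-mer is the smaller (alphabetically) of the k-mer and its
    reverse complement, so both strands map to the same key.
    K-mers containing N are skipped.
    """
    seq = seq.upper()
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if "N" in kmer:
            continue
        kmers.add(min(kmer, reverse_complement(kmer)))
    return kmers


def jaccard(contig_a, contig_b, k=21):
    """Jaccard similarity of the canonical k-mer sets of two contigs.

    Near 1.0: essentially the same sequence (possibly on the other strand
    or rotated). Near 0.0: little or no shared sequence.
    """
    kmers_a = canonical_kmers(contig_a, k)
    kmers_b = canonical_kmers(contig_b, k)
    if not kmers_a and not kmers_b:
        return 0.0
    return len(kmers_a & kmers_b) / len(kmers_a | kmers_b)
```

Calling `jaccard(contig_a, contig_b)` on the sequences read from two assemblies gives a rough first answer. Intermediate values suggest the contigs share some regions but differ in others, which would be worth inspecting with a proper alignment or dot plot.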


Excellent suggestion. I will try that.
