Question: De Bruijn graph for paired-end reads
Star70 wrote (3.1 years ago):

I have a set of paired-end reads from an unknown species, and I want to assemble them to recover the genome. My current program is based on the basic de Bruijn graph algorithm, which is designed for single reads. Is there another classic algorithm (or another variant of the de Bruijn graph algorithm) that works better for paired-end reads? I also have three versions of these reads, from three samples of the same species. Could they help me improve my assembler?

rna-seq • 920 views
Philipp Bayer wrote (3.1 years ago):

There are heaps of De Bruijn graph implementations for paired-end reads:

Velvet https://www.ebi.ac.uk/~zerbino/velvet/

ABySS http://www.bcgsc.ca/platform/bioinfo/software/abyss

MaSuRCA http://www.genome.umd.edu/masurca.html

DISCOVAR (needs 250 bp paired reads, PCR-free) https://software.broadinstitute.org/software/discovar/blog/?page_id=23

SOAPdenovo http://soap.genomics.org.cn/soapdenovo.html

ALLPATHS-LG (needs mate-pair data) http://software.broadinstitute.org/allpaths-lg/blog/

and many many many more

As for your three versions from three samples, do you expect the samples to be different? If they're replicates (identical), then you can treat them as three libraries in the assembler and assemble them all together.
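
To illustrate why pooling replicate libraries can help a de Bruijn assembler, here is a minimal Python sketch. It is not taken from any of the assemblers listed above; the function names and the `min_count` threshold are hypothetical. It counts k-mers across all libraries together, so a genuine k-mer seen only once or twice per sample can still pass a coverage filter, while one-off sequencing errors tend to stay rare.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in a list of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def pooled_solid_kmers(libraries, k, min_count):
    """Pool k-mer counts over all libraries and keep the well-supported ones.

    `libraries` is a list of read lists, one per replicate sample.
    `min_count` is an illustrative threshold, not a recommended value.
    """
    pooled = Counter()
    for reads in libraries:
        pooled.update(count_kmers(reads, k))  # Counter.update adds counts
    return {kmer for kmer, count in pooled.items() if count >= min_count}
```

This only makes sense if the samples really are replicates; pooling divergent samples would mix their variants into one graph.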


I'd also like to mention SPAdes, which is the only De Bruijn graph assembler I'm aware of that actually builds the graph using paired k-mers (k-bimers) from paired reads.

written 3.1 years ago by Brian Bushnell
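
As a rough illustration of the paired k-mer idea, here is a minimal Python sketch of the textbook paired de Bruijn graph construction. It is not SPAdes's actual implementation, and the function names are hypothetical. It assumes both reads of a pair are already on the same strand and have the same length, so the two k-mers in each pair are always the same distance apart.

```python
def paired_kmers(read1, read2, k):
    """Pair the k-mer at offset i in read1 with the k-mer at offset i in read2.

    Assumption: read2 has already been reverse-complemented if necessary,
    so both reads run in the same direction along the genome.
    """
    n = min(len(read1), len(read2))
    return [(read1[i:i + k], read2[i:i + k]) for i in range(n - k + 1)]


def paired_de_bruijn_edges(pairs):
    """Turn each paired k-mer into an edge between paired (k-1)-mers.

    The source node is the pair of prefixes and the target node is the
    pair of suffixes, mirroring the single-read de Bruijn graph.
    """
    edges = []
    for a, b in pairs:
        source = (a[:-1], b[:-1])
        target = (a[1:], b[1:])
        edges.append((source, target))
    return edges
```

Two nodes only match when both halves agree, which is how the pairing information can separate repeats that share a single k-mer but differ a fixed distance away.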

Thanks, I missed that! SPAdes has given me really good (best?) results with smaller (non-plant) genomes.

written 3.1 years ago by Philipp Bayer

It consistently gives us the best results with bacteria/archaea. We also routinely use it for metagenomes, but its resource requirements are so much higher than MEGAHIT's that it is only usable on low-complexity metagenomes, or on metagenomes that have been heavily processed (normalized, error-corrected, and with low-depth reads removed). I've never tried it on a eukaryote, but it would not surprise me if it did a good job, particularly on haploid euks like some of the fungi we assemble.

written 3.1 years ago by Brian Bushnell