Question: Best tools for genome assembly with paired-end, mate pair and linked read (10x) data
18 months ago, hirad.alipanah wrote:

Hi everyone. We have the following table for our data: Data Table

We want to assemble the genome of the planarian Dugesia japonica. Its genome is approximately 1.5 Gbp, diploid, and very repetitive (similar to that of Schmidtea mediterranea, "Smed"). Which assembler would you suggest for assembling all of this data?

assembly genome • 672 views
modified 18 months ago • written 18 months ago by hirad.alipanah

Could you clarify which species you want to assemble?

written 18 months ago by Nicolas Rosewick

Yes, the planarian Dugesia japonica.

written 18 months ago by hirad.alipanah

Could you edit your question to add this information, plus the expected genome size, ploidy, etc.? Thanks.

written 18 months ago by Nicolas Rosewick

The 10x data may need to be handled separately; Supernova is what you would want to use there. I see some data from a GAII (Illumina Genome Analyzer II), which leads me to believe that you have collected different datasets over time. Are all these datasets from the exact same sample/organism?
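
When sizing a Supernova run, it can help to estimate how many reads a given raw coverage corresponds to. The sketch below is plain arithmetic, assuming the ~1.5 Gbp genome size from the question, 150 bp reads, and a 56x raw coverage target (the figure I recall from the 10x Genomics documentation; please verify it against the current docs). The function name is made up for illustration.

```python
def reads_for_coverage(genome_size_bp: int, coverage: int, read_length_bp: int) -> int:
    """Number of reads needed to reach a raw coverage on a genome.

    coverage = (reads * read_length) / genome_size, solved for reads
    and rounded up to a whole read.
    """
    total_bases = genome_size_bp * coverage
    # Integer ceiling division avoids floating-point rounding on large counts.
    return -(-total_bases // read_length_bp)


# Assumed values: 1.5 Gbp genome (from the question), 56x target, 150 bp reads.
n_reads = reads_for_coverage(1_500_000_000, 56, 150)
print(f"{n_reads:,} reads")
```

A number like this is what one might pass as a read cap (for example, Supernova's `--maxreads` option) if the 10x library was sequenced more deeply than needed.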

written 18 months ago by genomax

Yeah, that was our first option. But how do we use the Supernova output together with our other data? And yes, the datasets are all from the same organism, but from different samples.

written 17 months ago by hirad.alipanah

I can recommend ABySS: it is very versatile, makes excellent use of a cluster, and performs well, though it might require some parameter tweaking (as with most assembly software). The same developers also provide tools for incorporating the 10x data.
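
To give an idea of how paired-end and mate-pair libraries are declared to ABySS, here is a minimal sketch that assembles an `abyss-pe` command line. It follows the `lib=`/`mp=` convention from the ABySS README; the assembly name, the library names, the file names, and `k=96` are all hypothetical placeholders, and the helper function itself is only for illustration.

```python
import shlex


def abyss_pe_command(name, k, paired, mate_pairs):
    """Build an abyss-pe argument list.

    `paired` and `mate_pairs` map a library name to its list of read files.
    Paired-end libraries go under lib= (used for contigs and scaffolding),
    mate-pair libraries under mp= (used for scaffolding only).
    """
    args = ["abyss-pe", f"name={name}", f"k={k}"]
    if paired:
        args.append("lib=" + " ".join(paired))
    if mate_pairs:
        args.append("mp=" + " ".join(mate_pairs))
    # Each library name is then bound to its files, e.g. pe1='a_1.fq a_2.fq'.
    for lib, files in {**paired, **mate_pairs}.items():
        args.append(f"{lib}=" + " ".join(files))
    return args


# Hypothetical inputs; k usually needs tuning per dataset.
cmd = abyss_pe_command(
    name="djap",
    k=96,
    paired={"pe1": ["pe1_1.fq.gz", "pe1_2.fq.gz"]},
    mate_pairs={"mp1": ["mp1_1.fq.gz", "mp1_2.fq.gz"]},
)
print(shlex.join(cmd))  # copy-pasteable shell command
```

Printing the command rather than running it keeps the sketch safe to try; the printed line can be reviewed and then submitted to a cluster scheduler.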

written 18 months ago by lieven.sterck

Yes, we've considered that too. But it needs a lot of memory (around 1 TB, and we only have 500 GB). Do you know of other assemblers that require less memory?

written 17 months ago by hirad.alipanah

Perhaps SOAPdenovo is an option? (I have no experience with it myself, though.) MaSuRCA will likely also be too memory-intensive. I think in most cases you will still need to figure out how to include the 10x data, as very little (if any) software can process all of your data at once.

written 17 months ago by lieven.sterck