Question: De novo plant genome assembly
2.3 years ago by alslonik (Israel):

Hello clever community!

I need your advice. I am working on a de novo assembly of a ~400 Mb plant genome. I have Chromium 10x data, which was assembled with Supernova. I also have Illumina paired-end reads. I now also have PacBio reads at roughly 120x coverage. The genome is diploid and I am thinking about using Falcon.

What do you think should be the best strategy:

  1. Assembling PacBio reads and then using a tool to integrate the two assemblies? Is there anything like this? Which tool would you use?

  2. Using a tool that can assemble the genome from both the Chromium and the PacBio reads? Is there anything like it?

  3. Assembling the PacBio reads and using the Chromium 10x and Illumina data for polishing? If I assemble with Falcon, what tool should I use for polishing?

  4. Is there anything else I am missing to get the most out of the data I have?

Thank you very much in advance! Alex


Take a look at BioNano optical maps and how they are used to get an assembled genome.

written 2.3 years ago by Antonio R. Franco

What do the results of the chromium assembly look like? What about the Illumina PE reads? Have you tried to assemble them? It would be useful to see some stats of what those two assemblies look like.

Here is a nice tutorial about how to polish PacBio assemblies: "Polish PacBio assembly with latest PacBio tools: an affordable solution for everyone".

written 2.3 years ago by genomax
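
The kind of summary stats asked for here (number of sequences, total size, longest sequence, N50) can be computed from any assembly FASTA with a few lines of Python. This is only a minimal sketch using the standard library; the function names are made up for illustration, and dedicated tools report many more metrics.

```python
from typing import Dict, Iterable, List


def read_fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the length of every sequence found in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # stays None until the first '>' header is seen
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def assembly_stats(lengths: List[int]) -> Dict[str, int]:
    """Summarise sequence lengths: count, total size, longest, and N50."""
    if not lengths:
        return {"sequences": 0, "total_bp": 0, "longest_bp": 0, "n50_bp": 0}
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    n50 = 0
    # N50: length of the sequence at which the running sum of the
    # longest-first lengths first reaches half of the total size.
    for length in ordered:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {
        "sequences": len(ordered),
        "total_bp": total,
        "longest_bp": ordered[0],
        "n50_bp": n50,
    }
```

Passing an open file handle, e.g. `assembly_stats(read_fasta_lengths(open("assembly.fasta")))` with a hypothetical file name, gives the numbers for one assembly, so the 10x, Illumina, and PacBio assemblies can be compared side by side.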

Thanks. Re: 10x assembly: it is ~200 Mb in size after ordering with ALLMAPS, which is 2/3 of the expected size. BUSCO shows 88% complete plant BUSCOs. Illumina repeats were never successfully assembled. I am actually thinking of doing it now.

written 2.3 years ago by alslonik

DBG2OLC is a hybrid assembler.

written 2.2 years ago by Ric

Thanks, Ric. Will check it out.

written 2.2 years ago by alslonik
2.3 years ago by lieven.sterck (VIB, Ghent, Belgium):

Falcon is not a bad choice. An alternative might be Canu, if you have the computational resources for it.

1) MEDUSA, like quickmerge, is one of those tools for integrating/scaffolding assemblies.

3) Pilon or Arrow; there are probably others as well.

4) Canu, but with the same remark Carambakaracho made for MaSuRCA.
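
For point 3, one round of Pilon polishing with the Illumina paired-end reads could look roughly like the sketch below. It only builds the shell commands and does not run anything. It assumes that `bwa`, `samtools`, and a `pilon` wrapper script are on the PATH (with a plain jar you would call `java -jar pilon.jar` instead); the function name, file names, and thread count are placeholders, not settings taken from this thread.

```python
import shlex
from typing import List


def pilon_round_commands(
    assembly: str,
    reads_1: str,
    reads_2: str,
    prefix: str,
    threads: int = 8,
    diploid: bool = True,
) -> List[str]:
    """Build the shell commands for one Pilon polishing round (not executed)."""
    asm, r1, r2 = (shlex.quote(p) for p in (assembly, reads_1, reads_2))
    bam = shlex.quote(f"{prefix}.sorted.bam")

    # Pilon takes the draft assembly plus a sorted, indexed BAM of the reads.
    pilon = (
        f"pilon --genome {asm} --frags {bam} "
        f"--output {shlex.quote(prefix)} --changes"
    )
    if diploid:
        # Tell Pilon the sample is diploid.
        pilon += " --diploid"

    return [
        f"bwa index {asm}",
        f"bwa mem -t {threads} {asm} {r1} {r2} | "
        f"samtools sort -@ {threads} -o {bam} -",
        f"samtools index {bam}",
        pilon,
    ]
```

Printing the result of `pilon_round_commands("falcon_assembly.fasta", "R1.fastq.gz", "R2.fastq.gz", "round1")` gives four commands to review before running them. Arrow would instead work from the PacBio reads aligned back to the assembly.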

2.3 years ago by Rox (France / Toulouse / GeT-Plage):

Hello again alslonik!

I'll add my two cents here and recommend having a look at this great manual: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5100563/ . Of course, it was not tested on plant genomes, but it can help guide your choice of assembly strategy depending on the technology you used and on your sequencing depth.

I saw that you wanted to give quickmerge a try, so you may have already seen that manual. I tried quickmerge myself on two different PacBio-only assemblies, and I was really satisfied with the result in terms of contiguity and completeness. As you did a Falcon assembly, you could try merging it with a Canu assembly; it may give some improvement as well, if you have the time to try it, of course!

Cheers,

Roxane

2.3 years ago by Carambakaracho (Germany/Cologne):

I don't have any experience with Falcon, so I can't help on that, but this is my advice on your other questions. In any case, your PacBio coverage is quite decent, so you might expect relatively good results from the Falcon assembly.

  1. Integration of assemblies is usually not trivial, though I might just lack a good reference.
  2. SPAdes can integrate all that data afaik. You can use the Chromium contigs as "trusted contigs".
  3. No experience, but my guess is you risk polishing out any heterozygosity.
  4. MaSuRCA can handle both PacBio and Illumina; however, you won't be able to use the Chromium data directly.
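
To make point 2 concrete, here is a small sketch of how such a SPAdes run might be put together: Illumina paired-end reads as the main input, PacBio reads via `--pacbio`, and the Supernova contigs via `--trusted-contigs`. The helper only builds the argument list; its name, the file names, and the thread count are placeholders. Whether SPAdes copes well with a ~400 Mb diploid plant genome is not something this thread establishes.

```python
from typing import List, Optional


def spades_command(
    reads_1: str,
    reads_2: str,
    out_dir: str,
    pacbio_reads: Optional[str] = None,
    trusted_contigs: Optional[str] = None,
    threads: int = 8,
) -> List[str]:
    """Build a spades.py argument list for a hybrid run (not executed)."""
    cmd = [
        "spades.py",
        "-1", reads_1,   # Illumina forward reads
        "-2", reads_2,   # Illumina reverse reads
        "-o", out_dir,   # output directory
        "-t", str(threads),
    ]
    if pacbio_reads:
        # Long reads are used by SPAdes for gap closure and repeat resolution.
        cmd += ["--pacbio", pacbio_reads]
    if trusted_contigs:
        # Contigs from another assembly of the same genome, e.g. the 10x one.
        cmd += ["--trusted-contigs", trusted_contigs]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.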