Question: After assembly with Falcon
2.4 years ago
France/Bordeaux/University of Bordeaux
alexis.groppi wrote:


I'm working on a diploid plant genome (around 200 Mb), trying to obtain the best assembly with PacBio RSII P6-C4 reads (70X coverage).

I have used Falcon and Falcon Unzip (0.7+git.2059148090374ac08a494d842dc1def105aeee50). Then I ran fc_quiver from this Falcon 0.7.

I now have a "phased" assembly consisting of two files: cns_p_ctg.fasta (primary contigs) and cns_h_ctg.fasta (haplotigs). After this step, I think it would be a good choice to do an additional round of polishing with Arrow from (smrtlink/ Is this OK?

But then I'm a bit confused. Is the final assembly ONLY cns_p_ctg.fasta (primary contigs), or should I merge cns_p_ctg.fasta (primary contigs) and cns_h_ctg.fasta (haplotigs)?

I have read and heard so many different things about this strategy...

My opinion is that the final assembly should be only the primary contigs, and that the haplotigs (cns_h_ctg) are useful for determining the different alleles in the diploid genome.

Am I right or completely wrong?


I would say you're right, given that you are trying to assemble a heterozygous diploid genome.

The final assembly would consist only of the primary contigs. For a homozygous diploid genome, the assembly would likewise yield only a single haplotype. The haplotigs are indeed there to determine the allelic variation.
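If it helps to see how the two files compare while keeping them separate, here is a minimal sketch in plain Python (no Falcon or SMRT Link tooling involved). It assumes standard FASTA input; the helper names `fasta_lengths`, `n50`, `summarize` and `summarize_file` are made up for illustration. The expectation that the primary contigs total roughly the haploid genome size while the haplotigs cover only the phased regions is a rule of thumb, not a guarantee.

```python
from typing import Dict, Iterable, List


def fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the length of every sequence found in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read; None before the first header
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length of the contig at which the running sum (longest first) reaches half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # no sequences at all


def summarize(lines: Iterable[str]) -> Dict[str, int]:
    """Contig count, total bases, and N50 for one FASTA stream."""
    lengths = fasta_lengths(lines)
    return {"contigs": len(lengths), "total_bp": sum(lengths), "n50": n50(lengths)}


def summarize_file(path: str) -> Dict[str, int]:
    """Same as summarize(), reading from a FASTA file on disk."""
    with open(path) as handle:
        return summarize(handle)
```

Calling `summarize_file("cns_p_ctg.fasta")` and `summarize_file("cns_h_ctg.fasta")` separately gives one summary per file, which matches the idea of reporting the primary contigs as the assembly and treating the haplotigs as a companion set of alternate alleles.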

Thank you for confirming my opinion and removing my doubts ;)

