Question: inputs to the GenomicConsensus arrow algorithm
Asked 13 months ago by peri.tobias10:

Firstly, apologies: this is a cross-post, as I was not sure I had sent it to the correct forum. If I get a useful answer I will make sure it appears on both platforms.

I have assembled a de novo genome (1.98 Gb) with canu v1.6 using PacBio reads. I am in the polishing stage and have aligned the raw subreads.bam to the assembly with blasr in batches, because the job was running out of its allocated walltime when all reads were submitted at once. I therefore have six large alignment BAM files: 342G, 284G, 240G, 117G, 154G, and 78G.

I was trying to merge the BAM files using pbmerge; however, this too was a very long process and failed at one stage. Is it possible to run these individual alignment BAM files as separate inputs to the arrow algorithm and get six FASTA outputs? My thinking is that these would be much smaller files to merge, but I am not sure whether this is valid.
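If the single pbmerge run is the bottleneck, one common workaround (not specific to pbmerge, so treat this as a sketch) is to merge hierarchically: merge the BAMs in pairs, in parallel, within separate walltime-limited jobs, then merge the intermediates. The helper below only plans the pbmerge commands round by round; the file names are placeholders for your six alignment BAMs.

```python
# Sketch: plan a hierarchical (pairwise) pbmerge of several large BAMs,
# so each merge round can run as its own parallel, walltime-bounded job.
# File names are hypothetical; substitute your actual alignment BAMs.

def merge_plan(bams, prefix="merged"):
    """Return (rounds, final): rounds is a list of lists of pbmerge
    command strings; final is the name of the last merged BAM."""
    rounds = []
    current = list(bams)
    level = 0
    while len(current) > 1:
        cmds, nxt = [], []
        for i in range(0, len(current), 2):
            pair = current[i:i + 2]
            if len(pair) == 1:          # odd file out: carry to next round
                nxt.append(pair[0])
                continue
            out = f"{prefix}.{level}.{i // 2}.bam"
            cmds.append(f"pbmerge -o {out} {pair[0]} {pair[1]}")
            nxt.append(out)
        rounds.append(cmds)
        current = nxt
        level += 1
    return rounds, current[0]

if __name__ == "__main__":
    rounds, final = merge_plan([f"aln.{i}.bam" for i in range(6)])
    for r, cmds in enumerate(rounds):
        for c in cmds:
            print(f"round {r}: {c}")
    print("final merged BAM:", final)
```

Each round's commands are independent of one another, so they can be submitted as separate cluster jobs; only the rounds themselves must run in sequence.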

Alternatively, is there a more efficient method to generate the genomic consensus? My starting data came from both RS II and Sequel instruments.

Tags: pacbio arrow assembly polish
Answered 8 months ago by harish240:


Why not slice the BAM files so that each contig is submitted as its own reference to run arrow? Afterwards you can merge the per-contig FASTA files back into the genome.
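One way to realize the per-contig approach without physically splitting the BAM is arrow's reference-window option (`-w`/`--referenceWindows` in GenomicConsensus), which restricts a run to a single contig. The sketch below, under the assumption that the assembly has been indexed with `samtools faidx assembly.fasta`, reads contig names from the resulting `.fai` index and emits one arrow command per contig; all file names are placeholders.

```python
# Sketch: generate one arrow job per contig from a samtools faidx index.
# Assumes GenomicConsensus arrow with -w/--referenceWindows; file names
# (merged.bam, assembly.fasta) are hypothetical placeholders.

def arrow_commands(fai_text, bam="merged.bam", ref="assembly.fasta"):
    """Given the text of assembly.fasta.fai, return a list of arrow
    command strings, one restricted to each contig."""
    cmds = []
    for line in fai_text.splitlines():
        if not line.strip():
            continue
        contig = line.split("\t")[0]   # first .fai column: sequence name
        cmds.append(
            f"arrow {bam} -r {ref} -w '{contig}' "
            f"-o polished.{contig}.fasta"
        )
    return cmds

if __name__ == "__main__":
    with open("assembly.fasta.fai") as fh:
        for cmd in arrow_commands(fh.read()):
            print(cmd)
```

The per-contig outputs can then be concatenated (e.g. `cat polished.*.fasta > polished.fasta`) to rebuild the genome, as the answer suggests.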



Powered by Biostar version 2.3.0