Question: inputs to the GenomicConsensus arrow algorithm
peri.tobias10 wrote, 13 months ago:

Firstly, apologies: this is a cross-post, as I was not sure whether I had posted to the correct forum. If I get a useful answer I will make sure it appears on both platforms.

https://github.com/PacificBiosciences/pbcore/issues/118

I have assembled a de novo genome (1.98 Gb) with canu v1.6 using PacBio reads. I am at the polishing stage and have aligned the raw subreads.bam to the assembly with blasr in batches, because the process ran out of allocated walltime when all reads were submitted at once. I therefore have six large alignment BAM files: 342 GB, 284 GB, 240 GB, 117 GB, 154 GB, and 78 GB.

I was trying to merge the BAM files with pbmerge, but this too was a very long process and at one stage it failed. Is it possible to run these individual alignment BAM files as inputs to the arrow algorithm and get six fasta outputs? My thinking is that those would be much smaller files to merge, but I am not sure whether this is valid.

Alternatively, is there a more efficient way to run the genomic consensus? My starting data came from both RSII and Sequel instruments.
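For reference, the staged merge I have been attempting looks roughly like this. File names are placeholders, and this sketch only prints the intended commands rather than executing them, since each step runs for hours:

```shell
# Hedged sketch: merge the six alignment BAMs in two stages so each
# pbmerge job handles fewer inputs, then index the result for arrow.
# All file names are assumed; this prints the commands instead of
# running them.
staged_merge_cmds() {
  echo "pbmerge -o merged.A.bam aligned.1.bam aligned.2.bam aligned.3.bam"
  echo "pbmerge -o merged.B.bam aligned.4.bam aligned.5.bam aligned.6.bam"
  echo "pbmerge -o merged.all.bam merged.A.bam merged.B.bam"
  echo "pbindex merged.all.bam"
}
staged_merge_cmds
```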

pacbio arrow assembly polish • 603 views
harish240 wrote, 8 months ago:

Hi,

Why not slice the BAM files and submit each contig as its own reference to run arrow? Afterwards you can merge the per-contig fastas back into the genome fasta.
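A minimal sketch of that idea, assuming a merged and pbindex-ed BAM, and assuming GenomicConsensus' `-w/--referenceWindows` option accepts a bare contig name (verify with `arrow --help`). Contig and file names are placeholders, and the sketch prints the commands so each one can be submitted as its own job:

```shell
# Hedged sketch: emit one arrow command per contig, restricting each run
# to that contig with -w. The per-contig fastas can then be concatenated.
# Contig names, file names, and the -w syntax are assumptions.
per_contig_cmds() {
  ref=$1; bam=$2; shift 2
  for ctg in "$@"; do
    echo "arrow -j8 $bam -r $ref -w $ctg -o ${ctg}.polished.fasta"
  done
}
per_contig_cmds assembly.fasta merged.bam tig00000001 tig00000002
# afterwards: cat tig*.polished.fasta > assembly.polished.fasta
```

In practice the contig list would come from the assembly's `.fai` index (e.g. `cut -f1 assembly.fasta.fai`) rather than being typed by hand.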


Powered by Biostar version 2.3.0