Question: Help understanding the bam file generated from BLASR alignment
3.5 years ago, JstRoRR (Germany) wrote:

Hi, I used the BLASR aligner to align a PacBio dataset against a reference, with *.bax.h5 files as input. Since there were three *.bax.h5 files for a single sample, I used the following command:

blasr m160128_145513_42215_c100901562550000001823208404291605_s1_p0.1.bax.h5 m160128_145513_42215_c100901562550000001823208404291605_s1_p0.1.bax.h5 m160128_145513_42215_c100901562550000001823208404291605_s1_p0.1.bax.h5 query.fasta -sam  -unaligned unaligned.reads -nproc 8 -out output.sam

I converted the SAM to BAM and sorted it. Now, when I count the mapped reads in the BAM file, I get this:

samtools view -c -F 4 output.sorted.bam 

12978

But when I count the reads a different way, I get three values:

samtools view -F 4 output.sorted.bam | wc

12978  298494 9212184

I can't tell whether these three values correspond to the individual *.bax.h5 files I originally used in the mapping, or whether they are something else.

Also, is there another way to use three .bax.h5 files together in the same mapping step?

Thanks

mapping pacbio blasr bam • 1.5k views

For the second part, make a file that lists the files to align, one path per line, and give it the suffix .fofn.

-mark
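A minimal sketch of building such a .fofn ("file of file names") is shown here. The movie.N.bax.h5 names and input.fofn are hypothetical placeholders, not the files from the question, and the blasr call is left as a comment that simply reuses the options from the original command.

```shell
# Hypothetical file names; substitute the real .bax.h5 files for the movie.
# A .fofn is plain text with one input path per line.
printf '%s\n' \
    movie.1.bax.h5 \
    movie.2.bax.h5 \
    movie.3.bax.h5 > input.fofn

# Check what was written.
cat input.fofn

# The .fofn then takes the place of the individual read files on the
# command line (not run here, since it needs blasr and real data):
#   blasr input.fofn query.fasta -sam -unaligned unaligned.reads -nproc 8 -out output.sam
```

Listing the inputs in a file keeps the command line short and makes it easy to see exactly which files went into a run.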

written 3.5 years ago by mchaisso

3.5 years ago, Devon Ryan (Freiburg, Germany) wrote:

man wc

12978 is the number of lines and exactly what samtools view -c ... reported.

298494 is the number of words. This is useless.

9212184 is the number of characters. This is useless.

You probably wanted wc -l rather than just wc.
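To see the difference on something small, here is a sketch using a made-up three-line text file (demo.txt, a stand-in for the output of samtools view) instead of a real BAM:

```shell
# Three tab-separated lines standing in for alignment records (made-up data).
printf 'read1\t0\tchr1\nread2\t0\tchr1\nread3\t16\tchr2\n' > demo.txt

# Plain wc prints three numbers: lines, words, and bytes.
wc demo.txt

# wc -l prints only the line count, i.e. one per record.
lines=$(wc -l < demo.txt)
echo "$lines"
```

With real data the equivalent would be samtools view -F 4 output.sorted.bam | wc -l, which should agree with samtools view -c -F 4.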


Ahh silly me. Thanks Devon.

For the second question, can you tell me whether using .bax.h5 files this way is correct? I have checked the docs but couldn't find anything on how to use multiple .bax.h5 files in one mapping. The other option I can think of is to map each .bax.h5 file individually and merge the resulting BAM files.

written 3.5 years ago by JstRoRR

Unfortunately, I have no experience with PacBio .bax.h5 files, so I can't give a useful reply. If you create a new question about just that, I expect someone more familiar with PacBio will reply.

written 3.5 years ago by Devon Ryan