Question: samstat and samtools differ in the number of MAPQ=0 reads/alignments.
1
gravatar for user230613
3.8 years ago by
user230613280
Europe
user230613280 wrote:

I'm trying to retrieve the number of reads and the number of alignments with MAPQ=0. To address the question I'm using samtools in the following way:

samtools view -@ 4 BAM | awk '{if($5==0) {print $0}}' > mapq0.sam

cut -f1 mapq0.sam | sort -u | wc -l

1606759 # Number of reads

wc -l mapq0.sam

2952315 #Number of alignments

However, I'm getting different numbers when running Samstat:

MAPQ < 3 30814.0

I think I'm missing something very basic. Moreover, I'd expect to have a greater number in Samstat MAPQ<3 since obviously it also considers, MAPQ=1,2...

Any hint?

Thank you

bwa samtools samstat • 1.4k views
ADD COMMENTlink modified 3.8 years ago • written 3.8 years ago by user230613280
0
gravatar for James Ashmore
3.8 years ago by
James Ashmore3.0k
UK/Edinburgh/MRC Centre for Regenerative Medicine
James Ashmore3.0k wrote:

Maybe try using bioawk:

   samtools view input.bam | bioawk -c sam '$mapq == 0 {sum += 1} END {print sum}'
ADD COMMENTlink modified 3.8 years ago • written 3.8 years ago by James Ashmore3.0k

Thank you for your answer, but actually I'm looking for an explanation in the difference between samstat and samtools counts :).

ADD REPLYlink modified 3.8 years ago • written 3.8 years ago by user230613280

MAPQ=0 indicates multi-mappers. Have you considered/accounted for those when comparing those two numbers?

ADD REPLYlink written 3.8 years ago by GenoMax94k

Yes, I have consider that, that's why I'm counting read names using cut -f1... command

ADD REPLYlink modified 3.8 years ago • written 3.8 years ago by user230613280

Please could you move your answer to a comment? I want to keep the question opened and unanswered :)

ADD REPLYlink written 3.8 years ago by user230613280

Question never really closes. Even for questions where an answer has been "accepted" one can add new answers.

If you are referring this samstat program then I would be scpetical. It seems to not have been updated in 1.5+ years. If you need an alternate give Qualimap or Bamstat a try.

ADD REPLYlink modified 3.8 years ago • written 3.8 years ago by GenoMax94k

Yes, I'm referring to that samstat, and although it has not been updated in 1.5 years, we are not talking about a very up to date (changing ) issue, is only counting the number of reads with MAPQ=0. Is not a big deal :/

I've used qualimap, but as I said, I'm not looking for other tools to do the same, I'm looking for an explanation of the difference between samstat and samtools.

ADD REPLYlink modified 3.8 years ago • written 3.8 years ago by user230613280
1

There seems to be some sort of a bug in samstat. I tried your method above on a small bam file and got different counts with samstats. You could report this to the developer.

ADD REPLYlink written 3.8 years ago by GenoMax94k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2379 users visited in the last hour
_