Question: difference between mapq=0 and samtools flag 0x900
gravatar for bitpir
2.1 years ago by
bitpir180 wrote:


I am comparing the # of uniquely mapped reads from the same fastq (illumina paired end) to 2 reference sequences using BWA-MEM. I expect mapping to one of the reference sequences (let's call it ref A) to be have many multi-mappers and mapping the other (ref B) to be more unique. Ref A has many duplicate sequences.

I'm using the same bwa-mem parameter for alignment:

bwa mem -M -t 16 -B 1 -T 40

I calculated uniquely mapped reads using samtools view:

unique_alignment=$(samtools view -c -F 0x904) # for excluding secondary, supplementary, unmapped reads
mapq30_unique=$(samtools view -c -q 10 -F 0x904)

The results are below:

    # mapped reads  # uniquely mapped reads % of total seq  # of unmapped   # of MAPQ=10    % MAPQ=10
refA    15758104    15008036    95.24011264 0   10384098    65.7639396211625
refB    16352354    15632108    95.595459834101 0   14264615    87.9141156068417

As you can see, there isn't any difference in % of uniquely mapped reads (~95% for both) but a difference is seen in MAPQ=10.

From what I read, MAPQ=0 means reads are mapped to many places. I was wondering why using samtools view alone fail to capture uniquely mapped reads, and additional MAPQ filtering is necessary to get the unique mapped reads?

Thanks for the help!

mapq samtools • 1.2k views
ADD COMMENTlink modified 2.1 years ago by Devon Ryan96k • written 2.1 years ago by bitpir180
gravatar for Devon Ryan
2.1 years ago by
Devon Ryan96k
Freiburg, Germany
Devon Ryan96k wrote:

All mapped reads have a primary alignment, which is what you end up counting with -F 0x904. You're not just excluding "multimappers" (that's not a terribly useful term) with the addition of -q 10, but also more generically uncertain alignments.

If you really want to see how well your dataset aligns to the two genomes then you need to instead concatenate the genomes together and align against the resulting merged genome. Then split the results by genome.

ADD COMMENTlink written 2.1 years ago by Devon Ryan96k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 807 users visited in the last hour