Final Solution For "Mapq Should Be 0 For Unmapped Read."
6
6
Entering edit mode
10.1 years ago

I repeatedly come accross "MAPQ should be 0 for unmapped read." issue. Yes, I can set stringency as lenient in Picard and thus avoid evaluating this as an error, but still - I use another program and it is here again.

Is anybody aware of some tool for fixing this problem for bam files generated using BWA?

I have found things like CleanSam.jar, FixMateInformation and others, but none helped. I do not want to supress error, I would like to fix it:)

Thanks a lot.

bwa • 17k views
7
Entering edit mode
10.1 years ago
matted 7.6k

I put a comment in the other answer here. The issue is that there are unmapped reads (that is, with the unmapped read flag set) that have non-zero mapping qualities.

You could fix it by removing all unmapped reads, but that's heavy-handed and no good if you want to use them later for other analyses. It will fix the error, though. The command to do that is:

samtools view -bF 4 raw.bam > filtered.bam


The most correct way to fix it would be a short script that goes through the SAM/BAM and sets the MAPQ to 0 if the unmapped flag is set. I'll leave that as an exercise...

0
Entering edit mode

just to clarify this point, I see on the samtools docs that "-F INT Skip alignments with bits present in INT [0]", so if we use "-F 4" this will be removing the unmapped reads? also, I've always considered MAPQ=0 as mapping errors due to multiple possible genome locations. if so, would it be acceptable to use the following command to cover all possibilities?

samtools view -bF 4 -q 1 raw.bam > filtered.bam

0
Entering edit mode

Yes, that would do what you want (exclude multi-mappers and unmapped reads). Excluding unmapped reads is sort of a strange thing, in that most tools should do the right thing when you give them unmapped reads. But filtering them can help if you have strict tools like Picard that will complain otherwise.

To me, filtering multi-mapping reads is for another purpose (but equally valid and necessary in some situations). That is, specific downstream analyses, not just avoiding Picard complaints.

1
Entering edit mode

sure, that's the first step we do before starting the whole "best practices" downstream analysis suggested by the GATK team. in our alignment BAMs we weren't getting the unmapped reads, so that's why we never came across this "-F 4" option, but since our final goal is to detect variants we thing that this initial step to remove anything not useful (multi-mapping and unmapped reads fit that description in our opinion) does help to slightly reduce the computer resources needed (at least it reduces the size of the BAM file).

2
Entering edit mode
10.1 years ago
Marvin ▴ 870

Just ignore the warning. The actual fix would be to repair Picard, because that warning is simply wrong. Nothing in the SAM spec says that unmapped reads should have a MAPQ of 0. If a read maps unambigously to a nonsensical position, it should have non-vanishing MAPQ and the 'unmapped' flag set.

(On the other hand, I totally understand why Picard complains, it's perfectly reasonable. It still doesn't match the spec... therefore the SAM spec, especially the "best practices" section, is unreasonable. But that's a different question...)

1
Entering edit mode
10.1 years ago

MAPQ=0 reads can be removed by simply filtering them. for instance, you could use samtools to do that:

samtools view -hbq 1 raw.bam > filtered.bam


option 'h' includes header, option 'b' outputs binary, and option 'q 1' filters mapping qualities below 1

3
Entering edit mode

If you want to filter (possibly) offending reads, you could just filter unmapped reads with samtools view -bF 4 raw.bam > filtered.bam.

Also, -h doesn't matter when you're writing bam (-b); it will always put the header in the bam. It just matters for sam output.

0
Entering edit mode

+1 for letting me know that -h is not needed when binary format is on. thanks a lot for the correction. I'll write my coment directly on your question then.

1
Entering edit mode
10.1 years ago
toni ★ 2.2k

I chose to fix this myself. Just like you, I did not find any tool that fixes all the Picard complaints.

So what I did is to write a wrapper around the bwa sampe call and I intercept all the pairs coming out and fix flags and tags on the fly.

The only way you can know whether your read is unmapped or not is to check the corresponding flag bit. Once you caught the info, updates parts of the SAM alignment that need to be updated. And moreover, if you are dealing with paired-end experiments, as you are holding the alignment of the pair at the very same time, you can even update the mate information as well in the mate alignment & vice-versa.

Depending on the number of checks you want to do and your piece of code, it can increase (double sometimes in my case) the computation time, but at the end of the process, you have a clean file.

T.

1
Entering edit mode
7.1 years ago

I was in the same case with a bamset files mapped with bwa. Lot of "MAPQ should be 0 for unmapped read." warnings come during GATK process. I solve this problem by using picard CleanSam tool. As it is write on the CleanSam manual:

\$ picard-tools CleanSam
ERROR: Option 'INPUT' is required.

USAGE: CleanSam [options]

Documentation: http://picard.sourceforge.net/command-line-overview.shtml#CleanSam

Read SAM and perform various fix-ups.  Currently, the only fix-ups are 1: to soft-clip an alignment that hangs off the end of its reference sequence; and 2: to set MAPQ to 0 if a read is unmapped.
Version: 1.105()


I notice you have tried this solution but it not resolve your problem. This is curious, maybe a different version.

0
Entering edit mode

I had this problem 3 years ago so yes, I guess this functionality is now part of CleanSam as you say. Good to know.

1
Entering edit mode
6.6 years ago
Tabea ▴ 30

Since many people having this problem with "Picard" might land here from google, I am adding a solution that is working for me.

In Picard, you can now set the option VALIDATION_STRINGENCY=LENIENT, to tell Picard to be more forgiving about an invalid SAM file.

0
Entering edit mode

If you read carefully, this is already said in the first line of the question itself.