Question: Exclude Reads With Xa Tag
gravatar for fo3c
8.0 years ago by
fo3c430 wrote:

Hello there,

Does anyone know of a tool to remove reads that have alternative alignments? Similarly to filtering for specific flags, I would like to filter out all reads that have more than one alignment.

I am aware that depending on the aligner the XA tag might not be set. I used bwa, which by default does not set an XA tag if there are more than 3 alternative alignments. However, this should not be a problem since multiple alignments would result in a lower mapping quality (and can thus be removed on this basis).

A simple grep would work, but that could leave me with broken flags (filter out one read in a pair and not the other without adjusting the flag).

Any suggestions?

bam samtools sam filtering bwa • 5.5k views
ADD COMMENTlink modified 8.0 years ago by Pierre Lindenbaum134k • written 8.0 years ago by fo3c430

I don't remember if BWA reports it, but the NH tag reports the total hits, so NH:i:1 means unique hits.

ADD REPLYlink written 8.0 years ago by JC12k
gravatar for matted
8.0 years ago by
Boston, United States
matted7.4k wrote:

It's a bit ugly as well, but for these types of tasks I usually identify the read names I want to filter via grep, then use grep again to remove both read pairs that match the name. This assumes that paired reads have the same name, which I think is usually the case.

For example:

samtools view in.bam | fgrep XA | cut -f 1 > bad_names.txt
samtools view -h in.bam | fgrep -vf bad_names.txt | samtools view -Sb - > out.bam

This would filter out a complete pair if either (or both) read has an XA tag, and leave the BAM consistent. If you'd prefer to keep a read whose mate has an XA tag but fix the flags, I think you'd have to do it yourself with something like pysam or Bio-Samtools (or just hack the flags and text on your own).

ADD COMMENTlink written 8.0 years ago by matted7.4k
gravatar for Pierre Lindenbaum
8.0 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum134k wrote:

The following java program will filter a BAM using javascript (rhino) and the picard library. It reads a BAM file/stdin and applies the javascript script to accept/reject each SamRecord. The samRecord is injected as "record" and the SamFileHeader is injected as "header" in the js context. To answer your question you can use the script:


The java program, how to compile and a test file :

ADD COMMENTlink modified 2.2 years ago • written 8.0 years ago by Pierre Lindenbaum134k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1502 users visited in the last hour