How to filter/view reads mapped to multiple locations using samtools or pysam
2
3
Entering edit mode
7.1 years ago
AlchemistMoz ▴ 40

I've looked through the documentation for samtools and pysam but haven't found what I'm looking for. Pysam can check if read.is_secondary which returns true if the read is not primary alignment. Whilst in samtools, I think the corresponding flag is:

256 0x100 secondary alignment

But what I'm looking for is a way to view/filter reads that are mapped to multiple regions, is there a way to do this with samtools or pysam?

python samtools pysam • 14k views
ADD COMMENT
4
Entering edit mode
7.1 years ago
Tom_L ▴ 350

According to the SAM format specification, I believe you are looking for the NH tag.

NH: Number of reported alignments that contains the query in the current record

Page 6: http://chagall.med.cornell.edu/galaxy/references/SAM_BAM_Specification.pdf

To filter them:

samtools view -h myBAM.bam | grep -P "(NH:i:1|^@)" | samtools view -Sb - > myBAM_filtered.bam

This will basically only grep SAM header (@ lines given by the -h argument) and single mapping reads (NH:i:1). It will also generate a BAM file (-b argument) based on SAM input (-S argument).

Cheers.

ADD COMMENT
1
Entering edit mode

Note that this is an optional tag, so it may or may not be present, depending on the aligner and command. I suggest filtering reads based on the required field mapq; a read with mapq 3 or lower is not uniquely mapped, and a read with mapq >3 is uniquely mapped.

ADD REPLY
0
Entering edit mode

Hi Brian, i would like to extract only the reads with mapping score 0, but as far as i understand the parameter "-q" only filters out what is equal or smaller that the argument. But i would actually isolate and study on only mapq 0 reads. Can you help me with that?

ADD REPLY
0
Entering edit mode

If you have bamtools,

bamtools filter -in pre-filtered.bam -out post-filtered.bam -mapQuality "0"
ADD REPLY
0
Entering edit mode

To support use of MAPq, see reference here: https://monashdatafluency.github.io/rnaseq-intro/qc.html

ADD REPLY
0
Entering edit mode

With your command,there may be some multi alignments containing such as NH:i:12 in NH tag.Maybe the pattern "(NH:i:1\b|^@)" is better.

ADD REPLY
3
Entering edit mode
4.0 years ago

If you have bamtools, you can do this in a little more human-friendly way:

bamtools filter -in pre-filtered.bam -out post-filtered.bam -tag "NH:>1"

Above will create a new bam file called post-filtered.bam. If you want to test by outputting the first 300 lines to stdout (instead of writing to a file):

bamtools filter -in pre-filtered.bam -tag "NH:>1" | bamtools convert -format sam | head -n 300
ADD COMMENT

Login before adding your answer.

Traffic: 2785 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6