How to get reads, which I filtrate
2
1
Entering edit mode
9.9 years ago
filipzembol ▴ 180

Hi,

I have one question, I need to get reads after filtration. Than normally I get health reads (health = reads after filtration, satisfy the conditions), but I need to get reads, which I remove.

For example I set filtration to get reads with mapQ bigger than 10, I used samtools view -bg 10 ... and I get reads with mapQ bigger than 10, but I want reads with mapq lower than ten.

Could you please help if it is possible. And other option, compare the raw bam file and filtrate bam file and get the difference between them.

Could you help me pls or do you know some tool to do this?

bam filtration Samtools • 2.2k views
ADD COMMENT
3
Entering edit mode
9.9 years ago
alistairnward ▴ 210

You can use bamtools to perform all sorts of BAM filtering tasks. Try the software at gkno.me (click on the 'Try it' button for simple instructions on getting the software) and you can get a while resource for next-generation sequencing analysis. To get the reads with mapQ less than 10, the following command line will work:

gkno bamtools-filter --in <BAM file> -mq "<10"

You can change the conditional command to easily get anything that you need. There are lots of additional filtering tasks (e.g. pulling out paired end reads, duplicate reads etc) that can easily be performed with gkno and bamtools.

As to your second question, what comparisons do you want to do? One easy thing to do, is apply the inverse of the above command:

gkno bamtools-filter --in <BAM file> -mq ">=10"

and you'll have all the reads not present in your filtered bam. You can get summary bam statistics using:

gkno bamtools-stats --in <BAM>

and compare the results of the raw, and the two subsetted bam files.

ADD COMMENT
2
Entering edit mode
9.9 years ago

Just use awk or another scripting language, for example samtools view foo.bam | awk '{if($5<10) print $0}'. You can make that a bit more complicated if you want to keep the header (useful for piping back into samtools for conversion to a BAM file), but that's enough to give you the idea.

BTW, pysam is useful in general for more complicated tasks.

ADD COMMENT
0
Entering edit mode

Oh, I finished it by awk, same like you wrote, sorry for this stupid question.

ADD REPLY

Login before adding your answer.

Traffic: 1568 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6