Bioawk: multiple criteria
9.4 years ago
bongbang ▴ 80

From the example:

bioawk -c sam 'and($flag,4)' aln.sam.gz

How can I add and($flag,64) as another criterion? None of the various syntaxes I tried works.

Thanks

bioawk • 2.0k views
9.4 years ago

Simply add the flag values and use the sum; in your case, 64 + 4 = 68. The command you mentioned extracts unmapped reads with and($flag,4); with and($flag,68) it should extract unmapped reads that are first in pair. I have never used bioawk for this purpose, so this is only a guess.


That really should work, but it doesn't.


It would be really surprising if this did not work. Then why would the original and($flag, 4) work?

I ran a test and the results look as expected:

$ samtools view -c -F 68 bam/SRR1553595.bam 
1027
$ samtools view bam/SRR1553595.bam | bioawk -c sam ' !and($flag, 68)' | wc -l
1027

Edit: actually I get a different result in this case:

$ samtools view -c -f 68 bam/SRR1553595.bam 
19083
$ samtools view bam/SRR1553595.bam | bioawk -c sam ' and($flag, 68)' | wc -l
39050
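
The two counts most likely differ because the mask is interpreted differently: samtools view -f 68 keeps a read only if every bit in 68 is set, whereas and($flag, 68) is a plain bitwise AND, which is nonzero as soon as either bit is set. A minimal bash sketch of that distinction, using shell arithmetic in place of bioawk (the function names any_bit and all_bits and the sample flag values are made up for illustration):

```shell
#!/usr/bin/env bash
# Compare "any bit set" with "all bits set" for a few SAM flag values.
mask=68   # 4 (read unmapped) + 64 (first in pair)

# True when at least one bit of the mask is set in the flag.
any_bit()  { (( ($1 & mask) != 0 )); }
# True only when every bit of the mask is set in the flag.
all_bits() { (( ($1 & mask) == mask )); }

for flag in 77 73 133 147; do
    any=no; all=no
    if any_bit "$flag"; then any=yes; fi
    if all_bits "$flag"; then all=yes; fi
    echo "flag=$flag any=$any all=$all"
done
# flag=77 any=yes all=yes
# flag=73 any=yes all=no
# flag=133 any=yes all=no
# flag=147 any=no all=no
```

Flags 73 and 133 each carry only one of the two bits, so they pass the "any" test but not the "all" test. In bioawk, the "all" test would presumably be written as and($flag, 68) == 68.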
9.4 years ago

This should work as well:

bioawk -c sam 'and($flag, 4) && and($flag, 64)'
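
As a sanity check, the two separate tests should select the same flags as a single comparison of the masked value against 68. The following bash sketch checks this over every 12-bit flag value using shell arithmetic rather than bioawk itself; the function name count_mismatches is hypothetical:

```shell
#!/usr/bin/env bash
# Count flag values on which the two formulations disagree.
count_mismatches() {
    local mismatches=0 flag two_tests one_test
    for (( flag = 0; flag < 4096; flag++ )); do
        # Two separate tests, as in and($flag, 4) && and($flag, 64).
        two_tests=$(( (flag & 4) && (flag & 64) ))
        # One test requiring both bits of the combined mask.
        one_test=$(( (flag & 68) == 68 ))
        if (( two_tests != one_test )); then
            mismatches=$(( mismatches + 1 ))
        fi
    done
    echo "$mismatches"
}

count_mismatches   # prints 0
```

Both forms require bit 4 and bit 64 to be set at once, so either can be used to select unmapped reads that are first in pair.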

That does work (Thank you!), but see my answer for a more idiomatic syntax.
