Filter sam file to remove reads with large deletions
7.0 years ago

Hello,

I need to filter out reads from my SAM files that contain deletions larger than 3 bp. Does anyone know of a simple way to do that? I need to keep other low-quality reads (such as reads that contain small deletions, mismatches, and insertions).

Thanks!

RNA-Seq • 2.0k views

Python with pysam could probably do that. Do you have any coding experience?
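If installing pysam is a hurdle, the same idea can be sketched in plain Python by reading the CIGAR column (field 6) of each SAM line. This is only a rough sketch, not a tested pipeline: the function names `has_large_deletion` and `filter_sam` and the `ops` parameter are made up for illustration. By default only `D` elements are checked; passing `ops="DN"` also counts `N` (skipped reference region), which in RNA-Seq data usually marks introns and would therefore drop most spliced reads.

```python
import re

# One CIGAR element: a length followed by an operator from the SAM spec.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def has_large_deletion(cigar, max_len=3, ops="D"):
    """Return True if any element whose operator is in `ops` is longer than `max_len`."""
    if cigar == "*":  # CIGAR not available, e.g. unmapped read
        return False
    return any(
        op in ops and int(length) > max_len
        for length, op in CIGAR_ELEMENT.findall(cigar)
    )


def filter_sam(lines, max_len=3, ops="D"):
    """Yield header lines and every alignment line without a large deletion."""
    for line in lines:
        if line.startswith("@"):  # header lines pass through unchanged
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # Keep malformed/short lines rather than silently dropping them.
        if len(fields) < 6 or not has_large_deletion(fields[5], max_len, ops):
            yield line
```

To use it as a filter, something like `sys.stdout.writelines(filter_sam(sys.stdin))` in a small script, run as `python script.py < input.sam > filtered.sam`, should work. For BAM input you would first convert with `samtools view -h`.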

7.0 years ago

Using samjs to remove the reads having a CIGAR D/N operator larger than 3 bp:

java -jar dist/samjs.jar -e 'function accept(rec) {if(rec.getReadUnmappedFlag()) return true; var c=rec.getCigar(); if(c==null) return true; for(var i=0;i< c.numCigarElements();++i) {var ce=c.getCigarElement(i);if(ce.getLength()<=3) continue;var s=ce.getOperator().name(); if( s=="D"  || s=="N") return false; } return true;}accept(record)' input.bam

Neat tool, Pierre. I want to do the opposite: find reads in a SAM file that span large deletions (>1 kb), so I thought I would use your -X option to save the discarded reads. I'm running into a few issues: samoutputformat is not considered valid, and -X doesn't get recognized. https://pastebin.com/k2dVRxB5
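For what it's worth, the "opposite" selection can also be sketched without samjs by splitting the alignment lines into two groups based on the longest deletion in the CIGAR string. This is a hypothetical helper, not part of jvarkit: `longest_deletion`, `partition_sam`, `min_len` and `ops` are names invented here. Depending on the aligner, long gaps may be reported as `N` rather than `D`, which is why `ops` is a parameter.

```python
import re

# One CIGAR element: a length followed by an operator from the SAM spec.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def longest_deletion(cigar, ops="D"):
    """Length of the longest CIGAR element whose operator is in `ops` (0 if none)."""
    return max(
        (int(n) for n, op in CIGAR_ELEMENT.findall(cigar) if op in ops),
        default=0,
    )


def partition_sam(lines, min_len=1000, ops="D"):
    """Split alignment lines into (spanning, other); header lines are skipped."""
    spanning, other = [], []
    for line in lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        cigar = fields[5] if len(fields) > 5 else "*"
        # A read "spans" a large deletion if its longest gap exceeds min_len.
        target = spanning if longest_deletion(cigar, ops) > min_len else other
        target.append(line)
    return spanning, other
```

Both lists are held in memory, so for large files it would be better to write each line to one of two output handles as it is read.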


From your stack trace, your version of the tool is just too old. But thanks anyway, because I've found a bug related to the -X option, which I don't use much. https://github.com/lindenb/jvarkit/commit/50ce395e37ba198aaff7d38d8b34475ed402965e

Otherwise, you should ask this as a new question.
