Question: bamtools filtering syntax - reads containing more than X mismatches and soft-clipped reads
joelepaul wrote, 4 weeks ago:

Hi @ll!

I'm extremely new to the field of bioinformatics, though I have some programming knowledge. A lot of tools have quite nice manuals, which help me learn how to use them properly. With bamtools, however, I am stuck.

Starting from nice *.bam datasets, my aims are twofold:

1) To remove reads containing more than 4 mismatches. So far, I have only found the syntax to keep reads that contain 0 mismatches:

bamtools filter -tag XM:0 -in reads.bam -out reads.noMismatch.bam

I cannot find any documentation on the syntax for BWA-specific tags such as XM, so I do not know how to adapt this command to express a "more than" condition.

2) To remove soft-clipped reads. My supervisor told me to use bamtools for that, but all the guides I found, such as "Remove Soft Clipped Bases", refer to other software or scripts, and the bamtools manuals say nothing about how to do it. Does anyone know how to use bamtools specifically for that purpose?

Thank you for your time!

Cheers Joe

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote, 4 weeks ago:

Not with bamtools, but using samjdk:

java -jar dist/samjdk.jar -e 'Integer xm = record.getIntegerAttribute("XM"); if(xm!=null && xm > 4) return false; return record.getCigar()==null   || !record.getCigar().isClipped();' input.bam
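
For readers without samjdk, the same two conditions can be sketched in plain Python on SAM text. This is only an illustration, not part of the answer above: it assumes XM is stored as an integer tag (`XM:i:N`), as the question implies, and the names `keep_read` and `filter_sam` are made up for this example. Like `isClipped()`, it treats both soft (S) and hard (H) clipping as clipped.

```python
import re

MAX_MISMATCHES = 4  # threshold from the question: drop reads with more than 4 mismatches
CLIP_OPS = re.compile(r"\d+[SH]")  # a soft (S) or hard (H) clip operation in a CIGAR string


def keep_read(sam_line, max_mismatches=MAX_MISMATCHES):
    """Return True if one SAM alignment line passes both filters."""
    fields = sam_line.rstrip("\n").split("\t")
    cigar = fields[5]  # column 6 of a SAM record is the CIGAR string
    # Optional tags start at column 12 and have the form TAG:TYPE:VALUE.
    for tag in fields[11:]:
        if tag.startswith("XM:i:") and int(tag[5:]) > max_mismatches:
            return False
    # "*" means no CIGAR is available; such reads are kept (an assumption
    # made here to resemble the null-CIGAR branch of the samjdk expression).
    return cigar == "*" or CLIP_OPS.search(cigar) is None


def filter_sam(lines):
    """Yield header lines unchanged and only the alignment lines that pass."""
    for line in lines:
        if line.startswith("@") or keep_read(line):
            yield line
```

One possible way to use it: save the functions in a script (say, a hypothetical `filter_sam.py`) that ends with `sys.stdout.writelines(filter_sam(sys.stdin))`, then pipe `samtools view -h input.bam` into it and convert the result back to BAM with `samtools view -b`.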