Hello, I have a BAM file and I want to create another BAM file containing only the reads that map with 100% identity. For example, if my read length is 100, I want to select reads with CIGAR 100M and write them to a new BAM file. Could you please suggest how this can be done? Thanks, Adrian
To check for a total match of a given length, e.g. 100, you can use perl and anchor the regular expression on the CIGAR field (column 6 of the SAM record): /^100M$/. This will exclude any other total match length (e.g. 125M) and any soft-clipped read. You will have to re-header the file after filtering, because samtools view does not print the header by default.
samtools view potexvirus2.bam | perl -lane 'print if $F[5] =~ /^100M$/;'
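Using samjdk ( http://lindenb.github.io/jvarkit/SamJdk.html ): keep mapped reads whose CIGAR is 100M and whose edit distance (NM), when the tag is present, is equal to zero. A CIGAR of 100M alone does not guarantee identity, because M covers both matches and mismatches.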
java -jar dist/samjdk.jar -e 'return !record.getReadUnmappedFlag() && record.getCigarString().equals("100M") && (record.getIntegerAttribute("NM")==null || record.getIntegerAttribute("NM").intValue()==0);' in.bam