BAM file: replace low-quality bases with "N"
2.5 years ago
Jautis ▴ 530

Hello, is there any way to remove low-quality bases from BAM files, either leaving the position blank or replacing it with a filler character (e.g. N)?

I have mapped my sequencing reads and adjusted the base quality scores (using mapDamage, which is conceptually similar to GATK's BQSR). Before visualizing the misincorporation rate, I would now like to remove bases with a quality score below 20, regardless of the read's mapping quality (since there can be poorly called bases on confidently mapped reads). Is there a tool to do this? Thanks in advance!

quality base bam • 1.2k views
2.5 years ago

Using samjdk (http://lindenb.github.io/jvarkit/SamJdk.html), you can replace every base whose quality is below 20 with N:

 java -jar dist/samjdk.jar -e "
   final byte[] quals = record.getBaseQualities();
   final byte[] bases = record.getReadBases();
   /* nothing to do if the read has no qualities or no sequence */
   if (quals == SAMRecord.NULL_QUALS || bases == SAMRecord.NULL_SEQUENCE) return record;
   for (int i = 0; i < quals.length; i++) {
     /* keep bases that are already no-calls or have quality >= 20 */
     if (htsjdk.samtools.util.SequenceUtil.isNoCall(bases[i]) || (int) quals[i] >= 20) continue;
     bases[i] = (byte) 'N';
   }
   record.setReadBases(bases);
   return record;
 " in.bam
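
For readers who want to see the per-read logic in isolation, here is a minimal Python sketch of the same masking rule. It is not part of samjdk or of any BAM library: the function name `mask_low_quality` and its parameters are made up for illustration, and it works on a plain sequence string plus a list of Phred scores rather than on real BAM records. Applying it to a BAM file would additionally need a BAM reading library, which is not shown here.

```python
from typing import Sequence


def mask_low_quality(bases: str, quals: Sequence[int], min_qual: int = 20) -> str:
    """Return `bases` with every base whose Phred score is below `min_qual`
    replaced by 'N'. Hypothetical helper, for illustration only."""
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")
    masked = []
    for base, qual in zip(bases, quals):
        # Keep bases that are already no-calls, or that meet the threshold.
        if base in "Nn." or qual >= min_qual:
            masked.append(base)
        else:
            masked.append("N")
    return "".join(masked)


# Bases with quality 10 and 19 are masked; quality 20 is kept.
print(mask_low_quality("ACGTN", [30, 10, 20, 19, 2]))  # ANGNN
```

As in the samjdk expression, the threshold is inclusive (a base with quality exactly 20 is kept) and the quality values themselves are left unchanged.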

Awesome, works like a charm. Thank you!
