Homopolymer sequencing error in IonProton reads
9.2 years ago

I am analyzing reads from a mouse genome sequenced on an Ion Proton machine. Around half of the reads aligned against the reference genome contain either an insertion or a deletion (gap). The effect is clearly visible in the VCF file (I used SAMtools mpileup and BCFtools), which contains many more small indels than expected, usually 1 bp indels, which I am sure are due to sequencing errors in homopolymer regions. Maybe the approach the sequencer uses to quantify the addition of homonucleotides (based on the peak of H+ ion release) does not have high enough resolution. I looked online and found others complaining about the same issue.

I am wondering whether somebody has thoroughly analyzed genomic data from the Ion Proton for the purpose of identifying sequence variants and can elaborate on the best approach to reduce these false-positive indels. I have aligned BAM files from the machine; they were aligned using TMAP. I am planning to use the GATK Recalibrator, which may model these errors and reduce their base qualities, but I am not sure how much it would help. Please let me know if somebody already has experience with this.
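To make the question more concrete, here is a rough sketch of how one might check whether a called 1 bp indel sits in a homopolymer run of the reference. It is plain Python with no VCF library; the function names, the toy reference string, and the `min_run=4` threshold are all assumptions made for illustration, not part of any tool's API or a recommended cutoff.

```python
def homopolymer_run_length(ref_seq: str, idx: int) -> int:
    """Length of the run of identical bases in ref_seq that contains 0-based index idx."""
    base = ref_seq[idx].upper()
    left = idx
    while left > 0 and ref_seq[left - 1].upper() == base:
        left -= 1
    right = idx
    while right + 1 < len(ref_seq) and ref_seq[right + 1].upper() == base:
        right += 1
    return right - left + 1


def is_homopolymer_indel(ref_seq: str, pos1: int, ref: str, alt: str,
                         min_run: int = 4) -> bool:
    """Return True if a VCF-style record (1-based POS, REF, ALT) is a 1 bp indel
    that adds or removes a base of a reference homopolymer run >= min_run.

    min_run is an arbitrary illustrative threshold (assumption).
    """
    if abs(len(ref) - len(alt)) != 1:
        return False  # not a 1 bp indel (SNP, MNP, or longer indel)
    longer, shorter = (ref, alt) if len(ref) > len(alt) else (alt, ref)
    longer, shorter = longer.upper(), shorter.upper()

    # First position where the two alleles diverge: that is the extra base.
    i = 0
    while i < len(shorter) and longer[i] == shorter[i]:
        i += 1
    if longer[:i] + longer[i + 1:] != shorter:
        return False  # alleles differ by more than one inserted/deleted base
    indel_base = longer[i]

    # 0-based reference index of the indel site, plus its left neighbour
    # (an insertion may extend the run that ends just before the site).
    site = pos1 - 1 + i
    for idx in (site - 1, site):
        if 0 <= idx < len(ref_seq) and ref_seq[idx].upper() == indel_base:
            if homopolymer_run_length(ref_seq, idx) >= min_run:
                return True
    return False


# Toy reference with a run of five A's (hypothetical sequence).
toy_ref = "GATCAAAAAGT"
print(is_homopolymer_indel(toy_ref, 4, "CA", "C"))   # deletion of one A from the run
print(is_homopolymer_indel(toy_ref, 2, "A", "AG"))   # insertion outside any run
```

Counting how many indel records pass such a check versus fail it would give a first estimate of how much of the excess is homopolymer-associated, though it says nothing about which of those calls are real.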

Homopolymer Ionproton Indels
