I'm calling variants using the IonTorrent TorrentSuite on DNA sequenced from formalin-fixed paraffin-embedded (FFPE) tissue. A major issue is that, without the addition of uracil-N-glycosylase (UNG), some of the Cs in the original DNA are deaminated to uracil, which is read as T. After sequencing and variant calling, these can show up as mutations, either as C>T transitions or as G>A (from the opposite strand, due to PCR in the library prep). I do not know how long these samples were stored without UNG before sequencing.
TVC (Torrent Variant Caller) gives a deamination metric (essentially, the number of C>T and G>A variants divided by all variants called), and for our samples the highest value seen is ~0.92. Naive postprocessing of the variants shows that, for these samples, C>T/T>C transitions (plot: https://ibb.co/KjPkkT3) overwhelm the remaining variants.
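To make the metric concrete, here is a rough Python sketch of how a comparable fraction could be recomputed from the records of a VCF. It is only an approximation of the idea, not TVC's actual formula: the function name `deamination_fraction` is hypothetical, and I am assuming that only single-nucleotide variants are counted and that C>T and G>A are the deamination-type changes.

```python
def deamination_fraction(vcf_lines):
    """Fraction of SNV calls that are C>T or G>A.

    Assumption: indels and symbolic alleles are ignored, and each ALT
    allele of a multi-allelic record is counted as a separate call.
    """
    deamination_changes = {("C", "T"), ("G", "A")}
    total = 0
    hits = 0
    for line in vcf_lines:
        # Skip blank lines and VCF header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        ref = fields[3].upper()
        for alt in fields[4].upper().split(","):
            # Keep only simple single-base substitutions.
            if len(ref) != 1 or len(alt) != 1:
                continue
            if ref not in "ACGT" or alt not in "ACGT":
                continue
            total += 1
            if (ref, alt) in deamination_changes:
                hits += 1
    return hits / total if total else 0.0


# Toy records (made up for illustration, not from my data).
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tC\tT\t50\tPASS\t.",
    "chr1\t200\t.\tG\tA\t50\tPASS\t.",
    "chr1\t300\t.\tA\tG\t50\tPASS\t.",
    "chr1\t400\t.\tT\tC\t50\tPASS\t.",
    "chr1\t500\t.\tAT\tA\t50\tPASS\t.",
]
print(deamination_fraction(example))
```

In the toy example two of the four SNVs are C>T or G>A, and the indel is skipped. For a real file, the lines could be read with `open("sample.vcf")` (uncompressed VCF assumed).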
My question is this: given the IonTorrent variant calling pipeline (sequencing > BAM file > TVC > VCF file with deamination statistic), is there:
a) a way of correcting the output VCF, or b) a set of filters to use in bcftools,
to reduce this effect on the samples?
My use case is this: these are medical samples that have been inspected by a pathologist (hence the FFPE treatment), and I want to determine which variants are predictive* of outcome. I therefore have two potentially contradictory goals: reducing false positives and capturing the rarer variants that may hold predictive power.