Comparing VCFs in different formats
6 days ago
blkr ▴ 10

Does anyone have experience comparing two VCFs in different formats (CLC- vs. Sentieon-generated) for differences in indels/SNVs? I'm unsure whether I just need to reorder the VCFs so they are identical before running them through a tool like vcfCompare, or whether there are other considerations. Thanks!

6 days ago
LChart ▴ 430

Very good question. The short answer is: pass both through vt normalize (https://genome.sph.umich.edu/wiki/Vt). The long answer is that the same indel can often be represented in many ways, and the particular representation you get depends on (1) which events the aligner placed in the reads, and where, and/or (2) the state of the local-assembly De Bruijn graph when the caller encountered the event. As a result, although many tools attempt to "normalize" variants into a canonical representation, they have varying degrees of success.
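
To make the "many representations" point concrete, here is a rough Python sketch of the left-align-and-trim idea that normalizers apply. It is not vt's actual code. The function name, the toy reference string, and the restriction to biallelic records on a single in-memory sequence are all assumptions made for illustration.

```python
def normalize_variant(ref_seq, pos, ref, alt):
    """Left-align and trim one biallelic variant (illustrative sketch only).

    ref_seq: reference sequence of the contig as a plain string (assumed
             to fit in memory; real tools read an indexed FASTA).
    pos:     1-based position of the first base of `ref`, as in a VCF.
    """
    if ref == alt:
        raise ValueError("REF and ALT are identical; nothing to normalize")

    changed = True
    while changed:
        changed = False
        # Drop a shared trailing base from both alleles.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele is now empty, extend both one base to the left.
        if not ref or not alt:
            if pos == 1:
                # Contig-start edge case deliberately left out of this sketch.
                raise ValueError("cannot left-extend at the contig start")
            pos -= 1
            base = ref_seq[pos - 1]  # convert 1-based pos to 0-based index
            ref, alt = base + ref, base + alt
            changed = True

    # Drop shared leading bases while both alleles keep at least one base.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1

    return pos, ref, alt


# Toy reference with a CA repeat: the same 2-bp deletion written three ways.
REFERENCE = "GGGCACACACAGGG"
same_deletion = [(8, "CAC", "C"), (9, "ACA", "A"), (2, "GGCA", "GG")]
normalized = {normalize_variant(REFERENCE, p, r, a) for p, r, a in same_deletion}
print(normalized)  # all three collapse to one record
```

All three input records describe the same deletion inside the CA repeat, so after normalization they compare equal, which is what you want before handing the two VCFs to a comparison tool. SNVs pass through unchanged. Multiallelic records and complex events need more care than this sketch gives them.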