I have a probably trivial question but, as I couldn't find any relevant or useful solution online, I wonder if someone could help. Basically, I need to plot a Venn diagram for the intersection of three VCF files done using truvari.
truvari is a structural variant evaluation tool for benchmarking a call set, generated using different approaches, against a truth set. In my case, I'm working with HG002 and three Papuan individuals, for which I produced call sets using different approaches:
manta (short reads, linear reference), and two pangenome approaches that use
PanGenie for the downstream genome inference.
That said, it would be interesting to know which SVs are shared between callers and, most importantly, where the graph approaches do better than the linear reference for all samples, which I have already assessed in terms of metrics.
With truvari collapse I produced a merged.vcf for each sample from the three call sets; however, the question is how to get the VCF into a format that can be used in R (possibly) to generate a Venn diagram.
In other words, what do I have to do, or what information do I need to extract from the VCF, to produce the plot in R? If someone has already done something similar, any help is appreciated. Thanks in advance!
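
To make the question concrete, here is a minimal sketch of the kind of extraction that might be needed, written in Python with only the standard library. It rests on assumptions that may not hold for a given merged.vcf: that the file has one sample column per call set (for example because the three VCFs were combined into a multi-sample VCF before collapsing), and that a call set "has" an SV when its GT contains a non-reference allele. The column names `manta`, `graph_a` and `graph_b`, the toy records, and all function names are hypothetical.

```python
import csv
import io
from collections import Counter


def caller_membership(vcf_lines):
    """Yield (variant_id, {caller: bool}) for each record of a multi-sample VCF."""
    callers = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            callers = fields[9:]  # assumption: one sample column per call set
            continue
        chrom, pos, vid = fields[0], fields[1], fields[2]
        fmt_keys = fields[8].split(":")
        gt_index = fmt_keys.index("GT") if "GT" in fmt_keys else None
        present = {}
        for caller, sample in zip(callers, fields[9:]):
            gt = sample.split(":")[gt_index] if gt_index is not None else "."
            alleles = gt.replace("|", "/").split("/")
            # Assumption: the SV counts as called if any allele is non-reference.
            present[caller] = any(a not in (".", "0") for a in alleles)
        yield (vid if vid != "." else f"{chrom}:{pos}"), present


def venn_counts(membership):
    """Count SVs per Venn region, keyed by the sorted tuple of callers that have them."""
    counts = Counter()
    for _, present in membership:
        region = tuple(sorted(c for c, has in present.items() if has))
        if region:
            counts[region] += 1
    return counts


def write_membership_tsv(membership, out):
    """Write a 0/1 presence table (one row per SV, one column per caller)."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    header_written = False
    for vid, present in membership:
        if not header_written:
            writer.writerow(["id", *present])
            header_written = True
        writer.writerow([vid, *(int(v) for v in present.values())])


# Toy records standing in for a real merged.vcf (hypothetical data).
HEADER = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
          "FORMAT", "manta", "graph_a", "graph_b"]
RECORDS = [
    ["chr1", "100", "sv1", "N", "<DEL>", ".", "PASS", "SVTYPE=DEL", "GT", "0/1", "1/1", "0/1"],
    ["chr1", "500", "sv2", "N", "<INS>", ".", "PASS", "SVTYPE=INS", "GT", "./.", "0/1", "1/1"],
    ["chr2", "900", "sv3", "N", "<DEL>", ".", "PASS", "SVTYPE=DEL", "GT", "0/1", "./.", "./."],
]
EXAMPLE_VCF = ["##fileformat=VCFv4.2"] + ["\t".join(r) for r in [HEADER] + RECORDS]

membership = list(caller_membership(EXAMPLE_VCF))
for region, n in sorted(venn_counts(membership).items()):
    print("&".join(region), n)

table = io.StringIO()
write_membership_tsv(membership, table)
print(table.getvalue(), end="")
```

For a real file, the lines could come from `open("merged.vcf")` (or `gzip.open(..., "rt")` for a compressed one). The resulting 0/1 table is plain tab-separated text, so it could then be read into R with `read.delim` and passed to whichever Venn plotting package is preferred.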