Question: how to analyze plot-vcfstats
2.0 years ago
viswanathrana29 wrote:

Hi all, I have identified the transitions and transversions in a plant genome and then plotted the data with plot-vcfstats from bcftools. How can I analyze these graphs? I don't understand what the number of sites on this graph means, so please help me interpret it. Thank you!

2.0 years ago
William wrote:

For a complete genome or exome you expect a stable Ti/Tv (transition/transversion) ratio per species or subspecies. This is because there are biological factors at play that influence this number.

For instance, for human the Ti/Tv ratio is about 2.1 for the whole genome and about 2.8 for the exome.

False positive variants have a Ti/Tv ratio of about 0.5. This is because there are no biological factors at play, just sequencing noise, and there are twice as many possible Tv changes (sequencing errors) as Ti changes (sequencing errors): each base has one possible transition and two possible transversions.
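To see where the 0.5 comes from, here is a small sketch that simply counts all possible single-base substitutions. The helper name `is_transition` is my own for illustration and is not part of bcftools.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def is_transition(ref, alt):
    """A transition stays within purines (A<->G) or within pyrimidines (C<->T)."""
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES

# All 12 ordered ref -> alt substitutions between different bases.
substitutions = list(permutations("ACGT", 2))

n_ti = sum(is_transition(ref, alt) for ref, alt in substitutions)
n_tv = len(substitutions) - n_ti
ratio = n_ti / n_tv

print(n_ti, n_tv, ratio)  # 4 8 0.5
```

So if errors hit every substitution type equally often, you get 4 transitions for every 8 transversions, which is the 0.5 baseline.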

So when you see a sharp drop of the Ti/Tv ratio towards 0.5, you are starting to include more false positive SNPs than true positive SNPs.

The plot you show indicates that your species probably has a Ti/Tv of 2.1.

It also shows that, of the 55k SNPs sorted by variant quality in descending order, the last approximately 5k SNPs cause a large drop in the Ti/Tv. So these probably contain a lot of false positive variants caused not by biology but by sequencing noise.
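If it helps, the idea behind reading such a curve can be sketched in a few lines of Python. This is only an illustration of a cumulative Ti/Tv over SNPs ranked by quality, not the actual plot-vcfstats code; the function `cumulative_titv` and the toy SNP list are made up for the example.

```python
def is_transition(ref, alt):
    """True for A<->G or C<->T changes, False for every other SNP."""
    return {ref, alt} in ({"A", "G"}, {"C", "T"})

def cumulative_titv(snps):
    """snps: iterable of (qual, ref, alt) tuples.

    Returns a list of (number_of_sites, ti_tv) pairs, adding SNPs from the
    highest quality to the lowest. ti_tv is None while no transversion
    has been seen yet (division by zero).
    """
    ti = tv = 0
    curve = []
    ranked = sorted(snps, key=lambda s: s[0], reverse=True)
    for n_sites, (_qual, ref, alt) in enumerate(ranked, start=1):
        if is_transition(ref, alt):
            ti += 1
        else:
            tv += 1
        curve.append((n_sites, ti / tv if tv else None))
    return curve

# Hypothetical toy data: high-quality calls are mostly transitions,
# low-quality calls are mostly transversions (noise-like).
toy = [(90, "A", "G"), (80, "C", "T"), (70, "A", "C"),
       (10, "G", "T"), (5, "C", "G")]
curve = cumulative_titv(toy)
for n_sites, ratio in curve:
    print(n_sites, ratio)
```

The x axis of the plot is then the number of sites included so far, and a ratio that falls as the low-quality sites are added is the pattern described above.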

