Genetic measures calculation in vcf files
7.1 years ago
ricardo ▴ 40


I have a set of files in VCF format, obtained by mapping reads against a reference genome. After mapping the reads and identifying the SNPs, I kept only the variants that fall within a list of 40 genes. This was done for data from 18 different genomes, so I have 18 VCF files containing the variants found in these genes of interest.

I would like to know which tools I could use to calculate measures of genetic variability such as pi (nucleotide diversity), Tajima's D and others. I am interested in comparing these values, since the data come from various locations around the world.

I've tried using vcftools, but the results were inconsistent, or the values came out as NaN.

vcf pi tajimaD • 2.2k views
7.0 years ago
willgilks ▴ 360

Hi Ricardo,

I think you're getting NaN values because you are analysing only one individual per file. Population genetics statistics need more than one individual, of course.

Combine the separate VCF files from each individual into one multi-sample file, then analyse that.

GATK is good for this merging step.
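To show why a single individual gives NaN, here is a minimal Python sketch of pi and Tajima's D computed directly from haplotypes. It is not how vcftools or GATK do it internally; the function name `tajimas_d` and the 0/1 haplotype encoding are my own assumptions for illustration, and it assumes biallelic sites with no missing data. With one diploid individual (two chromosomes) the variance term of Tajima's D is exactly zero, so the statistic is undefined.

```python
import math
from itertools import combinations


def tajimas_d(haplotypes):
    """Return (pi, S, D) for a list of equal-length 0/1 allele lists.

    Each inner list is one chromosome (hypothetical encoding:
    0 = reference allele, 1 = alternate allele). D is NaN when it
    is undefined (fewer than 2 chromosomes or zero variance).
    """
    n = len(haplotypes)
    if n < 2:
        return float("nan"), 0, float("nan")

    n_sites = len(haplotypes[0])

    # S: number of segregating sites within this sample
    S = sum(1 for j in range(n_sites)
            if len({h[j] for h in haplotypes}) > 1)

    # pi: mean number of pairwise differences between chromosomes
    pairs = list(combinations(haplotypes, 2))
    total_diffs = sum(sum(a != b for a, b in zip(h1, h2))
                      for h1, h2 in pairs)
    pi = total_diffs / len(pairs)

    # Constants from Tajima (1989)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * S + e2 * S * (S - 1)
    if variance <= 0:
        # Happens with n = 2 (one diploid individual) or S = 0
        return pi, S, float("nan")

    return pi, S, (pi - S / a1) / math.sqrt(variance)


# One diploid individual: two chromosomes, three heterozygous sites
one_individual = [[0, 0, 0, 1], [1, 1, 0, 0]]

# Two diploid individuals: four chromosomes, all variants singletons
two_individuals = [[0, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1]]

print(tajimas_d(one_individual))
print(tajimas_d(two_individuals))
```

The first call returns NaN for D regardless of how many heterozygous sites the individual has, which matches what you see when running the analysis on each single-sample file separately. The second call gives a finite value because the merged sample has four chromosomes.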

