I am trying to calculate dN/dS, Tajima's D, and other statistical tests on VCF files for some genes obtained from HapMap. I have obtained significant results in my initial analyses, but I am a bit confused because the VCF files from the HapMap populations may come from healthy individuals too.
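For context, the Tajima's D part of the calculation can be sketched roughly as below. This is only a minimal illustration, assuming the alternate-allele count at each biallelic site has already been extracted from the VCF; the function name `tajimas_d` and the input layout are hypothetical and not taken from any particular package. Note that the statistic uses only allele counts within the sample, with no phenotype labels.

```python
import math

def tajimas_d(alt_counts, n):
    """Tajima's D from per-site alternate-allele counts.

    alt_counts: hypothetical input, one count per biallelic site
    n: number of sampled chromosomes (2 x individuals for diploids)
    Returns None when D is undefined (n < 4 or no segregating sites).
    """
    if n < 4:
        return None  # the variance term is zero for n = 2 and n = 3
    # Keep only sites that are polymorphic in this sample
    seg = [k for k in alt_counts if 0 < k < n]
    s = len(seg)
    if s == 0:
        return None
    # pi: mean number of pairwise differences, summed over sites
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in seg)
    # Constants from Tajima (1989)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return None
    # Difference between pi and Watterson's estimate S/a1, standardised
    return (pi - s / a1) / math.sqrt(variance)

# Example: 10 chromosomes, five sites where the alternate allele is a singleton
print(tajimas_d([1, 1, 1, 1, 1], 10))
```

An excess of rare variants (as in the singleton example) gives a negative D, while an excess of intermediate-frequency variants gives a positive D.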
My question is: am I doing the correct analysis? My only objective is to run statistical analyses on these genes and see whether they show positive or negative selection during evolution.
For an evolutionary study, do we need data from diseased individuals, or will 1000 Genomes data alone be sufficient?
Thank you in advance.