Question: Is HapMap or 1000 Genomes VCF data from diseased or healthy individuals?
3.9 years ago by
Being Bioinformatician170 wrote:

Respected Member,

I am trying to calculate dN/dS, Tajima's D, and other test statistics on VCF files for some genes obtained from HapMap. Although I obtained significant results in my initial analyses, I am a bit confused, because the VCF files from the HapMap populations may come from healthy individuals too.

My question is: am I doing the correct analysis? My only objective is to run statistical tests on these genes and see whether they show positive or negative selection during evolution.

For an evolutionary study, do we need data from diseased individuals, or is the 1000 Genomes data alone sufficient?
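
To make the Tajima's D part concrete, here is a minimal sketch of the statistic in plain Python. It assumes the VCF genotypes have already been converted to phased 0/1 haplotypes (one list per chromosome, biallelic sites only, no missing calls). The function name `tajimas_d` and that input layout are my own illustration, not the output of any HapMap or 1000 Genomes tool, and the toy data are made up.

```python
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for a list of equal-length 0/1 haplotypes (one per chromosome).

    Returns None when there are no segregating sites, because D is undefined.
    """
    n = len(haplotypes)
    if n < 4:
        # The variance term is zero for n < 4, so D cannot be computed.
        raise ValueError("need at least 4 haplotypes")

    # Count the alternate allele at every site.
    alt_counts = [sum(site) for site in zip(*haplotypes)]
    # Keep only segregating (polymorphic) sites.
    seg = [k for k in alt_counts if 0 < k < n]
    S = len(seg)
    if S == 0:
        return None

    # pi: mean number of pairwise differences, summed over sites.
    pairs = n * (n - 1) / 2
    pi = sum(k * (n - k) for k in seg) / pairs

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # D = (pi - Watterson's estimate) / its estimated standard deviation.
    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))


# Toy data: 4 chromosomes x 3 sites (made-up values, not real genotypes).
rare = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]]    # every site is a singleton
common = [[1, 1, 1], [1, 1, 1], [0, 0, 0], [0, 0, 0]]  # every site is at 50%
print(tajimas_d(rare), tajimas_d(common))
```

In this sketch the singleton-only sample gives a negative D (excess of rare variants) and the 50%-frequency sample gives a positive D (excess of intermediate-frequency variants), relative to the neutral expectation.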


Thank you in advance.



Please define "healthy" :-)

written 3.9 years ago by Pierre Lindenbaum122k

Sorry, by healthy I meant control ... ;)

written 3.9 years ago by Being Bioinformatician170

You should be aware that there are many "evolutionary studies". In fact, Giovanni M Dall'Olio has done a really nice study on selection in the 1000 Genomes data:

A Database Of Signatures Of Selection In The 1000 Genomes Dataset

modified 3.9 years ago • written 3.9 years ago by Zev.Kronenberg11k

Thank you, sir, for this valuable information. I too have come across two papers.

According to "1000 GENOMES: A World of Variation":

“I think the real key . . . is being able to translate the gene activity into the operation of biological networks,” Hood  says. “What can be useful is to look at the genes that are present in the 1000 Genomes Project, the nature of the variation, and map them into key biological networks in cardiovascular disease, neurodegenerative disease, whatever you are interested in and see if there are candidates that stand out. Are there variants that might lead to interesting behaviors of those biological networks?”

According to "A map of human genome variation from population-scale sequencing":

Although data from the 1000 Genomes Project pilots are neither fully comprehensive nor fully free of ascertainment bias (issues include low power for rare variants, noise in allele frequency estimates, some false positives, non-random data collection across samples, platforms and populations, and the use of imputed genotypes), they can be used to address key questions about the extent of differentiation among populations, the presence of highly differentiated variants and the ability to fine-map signals of local adaptation.

written 3.9 years ago by Being Bioinformatician170