Question: SNPs, genotypes, HWE in GWAS
5.3 years ago by
tonja.r470 wrote:

I have trouble understanding how SNPs and alleles are involved in HWE and GWAS.
For a GWAS we have a pool of individuals from a normal population (NP) and people with a trait, i.e. the disease of interest (PT). We sequence them. Because humans are diploid, a person can carry two different alleles at one locus: one allele comes from the mother and the other from the father. So, for instance, for a person homozygous for G we will observe only reads with G at that position. Let's say the reference has an A at this position. That means the person has an SNV (A->G), and the genotype is GG.
1. If only a minority of people have the GG genotype at this position and the genotypes AG, GG, AA are in HWE, then it is considered to be a SNP (A->G) and it will be added to dbSNP, right?
2. Does an AG genotype come from a heterozygous person? So, when I view a specific position in the genome and the sequenced reads aligned to it, I might see all A's, all G's, or both at this position. I am not quite sure I understand where the 'A' and 'G' in AG, AA or GG come from.
3. What happens if a minority of the people are heterozygous (AG) and all others are homozygous for the reference (AA)? It will not be considered a SNP, right?


So, by sequencing NP and PT we identify novel SNPs and known dbSNP entries in the same way and test them for HWE. In the end we have the SNPs that passed the tests, and then we compare which SNPs are over-represented in the PT group compared with the NP group, right?

I hope I was able to explain where my problems with alleles, genotypes and HWE lie.


5.3 years ago by
United States
Jautis290 wrote:

To start, dbSNP refers to a reference VCF file of SNPs, not the actual SNPs themselves. It pretty much tells a genotyper where to look for variants.


1) Whether an allele is more/less frequent doesn't influence the direction of a SNP in the dbSNP file. The allele found in the reference genome is the reference allele and the other allele is the alternate regardless of their frequencies in the population. 

2) Yes, a correctly called AG would be a heterozygous person. 

3) It depends what the cutoff minor allele frequency is. Usually, if the less frequent allele has a frequency of at least 1%, it is counted as a SNP. 
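To make the 1% cutoff in point 3 concrete, here is a minimal sketch of how the minor allele frequency could be computed from genotype counts at a biallelic A/G site. The function name and the counts are made up for illustration; they are not from any particular tool.

```python
def minor_allele_frequency(n_aa, n_ag, n_gg):
    """Frequency of the less common allele at a biallelic A/G site."""
    total_alleles = 2 * (n_aa + n_ag + n_gg)  # each diploid person carries 2 alleles
    if total_alleles == 0:
        raise ValueError("no genotyped individuals")
    # Each GG person contributes two G alleles, each AG person one.
    freq_g = (2 * n_gg + n_ag) / total_alleles
    return min(freq_g, 1.0 - freq_g)


# Hypothetical sample: 960 AA, 40 AG and no GG individuals.
maf = minor_allele_frequency(960, 40, 0)
is_common = maf >= 0.01  # the usual 1% cutoff mentioned above
print(maf, is_common)
```

So a site where only heterozygotes carry the G allele can still count as a SNP, as long as enough people carry it.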


The GWAS you're describing would associate a genotype(s) with the phenotypic condition. I don't quite understand what you're doing with HWE, except if it's a filter you are using. 
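As a rough illustration of what "associating a genotype with the condition" can look like for a single SNP, here is a sketch of a basic allelic chi-square test on a 2x2 table of allele counts (cases vs. controls). This is only one simple option; real GWAS software usually fits regression models with covariates. All names and counts below are hypothetical.

```python
import math


def allele_counts(n_aa, n_ag, n_gg):
    """Return (number of A alleles, number of G alleles) for a group."""
    return 2 * n_aa + n_ag, 2 * n_gg + n_ag


def allelic_chi_square(case_genotypes, control_genotypes):
    """Chi-square test (1 df) on the 2x2 allele-count table.

    Each argument is a tuple (n_AA, n_AG, n_GG).
    Returns (chi-square statistic, p-value).
    """
    a, b = allele_counts(*case_genotypes)      # cases: A, G
    c, d = allele_counts(*control_genotypes)   # controls: A, G
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        raise ValueError("an empty row or column makes the test undefined")
    chi2 = n * (a * d - b * c) ** 2 / math.prod(margins)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: PT group (cases) vs. NP group (controls).
chi2, p_value = allelic_chi_square((50, 40, 10), (70, 25, 5))
print(chi2, p_value)
```

In a real GWAS this is repeated for every SNP, so the p-value threshold has to be corrected for multiple testing.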


I have read that by testing a SNP for HWE, one can detect whether a genotype resulted from a sequencing error or whether there is genetic drift/selection in the population. So, I guess it is a filter, but it is used only in GWAS, not when identifying a novel SNP, right?

written 5.3 years ago by tonja.r470

Yes, I think that's correct. Your SNPs should be identified separately using a genotype caller (GATK has a good one). Then you'll filter for common SNPs (usually those with a minor allele frequency of at least 1%, but it will depend on your sample size) and for sites in HWE.
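For the HWE part, a minimal sketch of the textbook chi-square goodness-of-fit test on genotype counts might look like this. It is an illustration only: the function name, the counts and the p-value threshold are assumptions, and for small samples or rare alleles an exact test is generally preferred over this approximation.

```python
import math


def hwe_chi_square(n_aa, n_ag, n_gg):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium at a biallelic site.

    Returns (chi-square statistic, p-value).
    """
    n = n_aa + n_ag + n_gg
    if n == 0:
        raise ValueError("no genotyped individuals")
    p = (2 * n_aa + n_ag) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele G
    if p == 0.0 or p == 1.0:
        return 0.0, 1.0              # monomorphic site: nothing to test
    # Expected genotype counts under HWE: p^2, 2pq, q^2.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ag, n_gg)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical site with a clear deficit of heterozygotes.
chi2, p_value = hwe_chi_square(500, 0, 500)
passes_filter = p_value >= 1e-6  # the threshold here is an arbitrary example
print(chi2, p_value, passes_filter)
```

A site that fails badly like this one is more likely a genotyping artifact than real biology, which is why it gets filtered before the association step.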

written 5.3 years ago by Jautis290