Question: Understanding How SNPs Affect The Association Score In A GWAS
8.4 years ago, User 6659 wrote:


Please excuse the basic question, but I am new to GWAS. It is my understanding that haplotypes are blocks of DNA sequence whose bases are always coinherited because (as yet) the blocks have not undergone recombination. Haplotypes can be characterised by key tagSNPs.

In a GWAS you find haplotypes or tagSNPs associated statistically with a particular trait. As a general point, how does the distance between a causal variant of the trait and the tagSNP affect the resulting association? I thought that, by definition of a haplotype, all of the bases in the region are ALWAYS coinherited with the tagSNP (otherwise it wouldn't be a haplotype) so the distance of the causal SNP from the tagSNP does not affect the strength of the association. In other words, excluding other confounding factors like environment etc, would all SNPs inside a haplotype have the same signal strength if they had the same association with the disease?

Are the signals from SNPs additive? Let's say a haplotype had 2 SNPs associated with a disease: would their signals 'add up' to give a stronger signal?

Is it possible that SNPs outside the haplotype may sometimes segregate with the haplotype and cause weak signals for the haplotype?


8.4 years ago, Jarretinha (São Paulo, Brazil) wrote:

Your last point is the most interesting. But first things first. The definition of a haplotype depends on the context. You cited the classical sense. In the HapMap sense, only statistical association counts in defining it. In both cases recombination isn't excluded; it's just low enough to keep the signal detectable. Haplotypes come and go in evolutionary time.
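
A minimal sketch of why "low enough" recombination matters, assuming the textbook two-locus model in which the LD coefficient D shrinks by a factor (1 - c) each generation, where c is the recombination fraction between the two SNPs. The function names and any numbers you feed in are illustrative assumptions, not taken from HapMap or from any GWAS package. A tag SNP further from the causal variant typically has a larger c, so its r² with that variant erodes faster, and a commonly used approximation is that the association statistic seen at the tag SNP scales with r².

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, r2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2


def decay_d(d0, c, generations):
    """D after some generations of random mating: D_t = D_0 * (1 - c)**t."""
    return d0 * (1 - c) ** generations


def r2_after(r2_0, c, generations):
    """r2 after some generations, assuming allele frequencies stay constant.

    r2 is proportional to D squared, so it decays as (1 - c)**(2 * t).
    """
    return r2_0 * (1 - c) ** (2 * generations)
```

For example, comparing `r2_after(1.0, 0.001, 100)` with `r2_after(1.0, 0.01, 100)` shows the more distant (higher c) pair losing its correlation much faster, which is the sense in which distance weakens the signal at a tag SNP.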

SNP signals aren't additive in a strict sense. They are affected by penetrance, dominance and epistasis in non-trivial ways, which makes it very complicated to build a metric.
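
To make "additive in a strict sense" concrete, here is a toy sketch, assuming a two-SNP model on the log-odds scale. The coefficients beta0, beta1, beta2 and the interaction term beta12 are hypothetical parameters, not estimates from any study. With beta12 = 0 the two effects simply add; a non-zero beta12 stands in for epistasis and breaks that.

```python
import math


def log_odds(g1, g2, beta0, beta1, beta2, beta12=0.0):
    """Log-odds of disease for carrier status g1, g2 (0 or 1) at two SNPs."""
    return beta0 + beta1 * g1 + beta2 * g2 + beta12 * g1 * g2


def risk(g1, g2, beta0, beta1, beta2, beta12=0.0):
    """Disease probability implied by the log-odds (logistic transform)."""
    return 1.0 / (1.0 + math.exp(-log_odds(g1, g2, beta0, beta1, beta2, beta12)))


def interaction_contrast(beta0, beta1, beta2, beta12=0.0):
    """Departure from additivity on the log-odds scale.

    Zero means the joint effect of both SNPs equals the sum of the
    single-SNP effects; anything else means the effects do not stack.
    """
    args = (beta0, beta1, beta2, beta12)
    return (log_odds(1, 1, *args) - log_odds(1, 0, *args)
            - log_odds(0, 1, *args) + log_odds(0, 0, *args))
```

Note that additivity here is scale-dependent: effects that add on the log-odds scale generally do not add on the risk scale returned by `risk`.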

Now you can go to your last point. We don't know for sure what keeps a haplotype structured in the long run, but we know that haplotypes aren't isolated from the influence of other parts of the genome. It might be possible that a haplotype has emerged in response to forces like genetic draft, the Hill-Robertson effect and linkage disequilibrium in general. A haplotype isn't a uniform block of SNPs under exactly the same intensity of evolutionary forces.

So it's entirely possible for higher orders of organization to exist, that is, sets of haplotypes found together more commonly than others, and so on. That's why it is so hard to detect weak or rare associations. You'll need sufficient statistical power to sort out the different population effects.

I'll add some refs soon to help in practice.


Thanks for the answer. It's interesting that my 'classical' definition isn't the HapMap definition, as I got my 'classical' definition from the HapMap website! I'm not contradicting you - just explaining why it's easy to be confused.

Reply written 8.4 years ago by User 6659

So, to clarify: I appreciate that SNPs aren't strictly additive, but are they 'loosely' additive? If a haplotype has 2 SNPs associated with the disease, will there be a stronger signal for that haplotype than for the 'same' haplotype where one of those SNPs wasn't present?

Reply written 8.4 years ago by User 6659

It's possible to imagine a situation where SNPs are totally additive. What I said was that such a situation is very unusual. It's not easy to find SNPs with the same effect on a phenotype. An extreme example: SNPs in the hemoglobin loci (no recombination involved) can "cause" sickle-cell anaemia, thalassemia or HPFH. Are they additive? I don't think so. Their effects simply don't stack up. You can find real examples for QTLs.

Reply written 8.4 years ago by Jarretinha
8.1 years ago, S B wrote:

I agree it is easy to get confused...

  • LD blocks are stretches of genome in high LD. They are created by population-genetic events, and their patterns differ between populations.
  • A haplotype may or may not refer to SNPs in an LD block.
  • Tag SNPs are SNPs that are good surrogates for others in an LD block because they capture the common haplotypes within that block. They are generally not perfect surrogates. Setting an LD threshold of >0.8 may give 5 tag SNPs, whereas >0.2 might give 1 tag SNP.
  • Back to haplotypes. Technically, any set of adjacent SNPs can constitute a haplotype, where adjacent generally means adjacent among the typed SNPs, so these SNPs could be quite far apart if SNP density is low. It is also possible to define non-adjacent SNPs as a haplotype, but generally NOT on separate chromosomes. For a single meiosis, one can conceptually think of an entire stretch of chromosome being transmitted together (sandwiched between two recombination events), which leads to the confusion with the LD block. For different people (meioses), recombination will happen at different places, so there is no single haplotype that is "always" coinherited with a single SNP at a population level.
  • Finally, regarding additivity of association, take a 3-SNP haplotype. Each haplotype, e.g. (1, 1, 0), can be thought of as a bin in the 2 × 2 × 2 table of these 3 SNPs (A/a × B/b × C/c). So modelling the effect of each haplotype separately is equivalent to a saturated model (no linearity or additivity assumption; interactions of all orders are allowed). But power is quickly lost as bins become too sparse for longer haplotypes, so each haplotype is usually compared with the rest (pooled together). The underlying assumption is generally that a particular (rare) haplotype, say (0, 0, 1), will tag an ungenotyped rare SNP in that region with high LD.
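
The "haplotype as a bin" idea and the one-versus-rest comparison from the last bullet can be sketched as follows. This is a rough illustration under stated assumptions: phased haplotypes are given as 0/1 tuples, the helper names are hypothetical, and a plain Pearson chi-square on the 2 × 2 table (no continuity correction) stands in for whatever test a real analysis would use.

```python
from collections import Counter
from itertools import product


def all_bins(n_snps):
    """Every possible haplotype over n_snps biallelic SNPs (2**n_snps bins)."""
    return list(product((0, 1), repeat=n_snps))


def haplotype_counts(haplotypes):
    """Count how many observed chromosomes fall into each haplotype bin."""
    return Counter(tuple(h) for h in haplotypes)


def one_vs_rest_table(case_haps, control_haps, target):
    """2 x 2 counts: target haplotype vs all others pooled, cases vs controls."""
    target = tuple(target)
    a = sum(1 for h in case_haps if tuple(h) == target)
    b = len(case_haps) - a
    c = sum(1 for h in control_haps if tuple(h) == target)
    d = len(control_haps) - c
    return a, b, c, d


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 d.f.) for a 2 x 2 table."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: an empty row or column
    return n * (a * d - b * c) ** 2 / denom
```

Because `all_bins` doubles in size with every extra SNP while the sample stays fixed, the per-bin counts from `haplotype_counts` thin out quickly, which is the loss of power described above and the reason for pooling the remaining haplotypes.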