Question: linkage disequilibrium: difference between D' and r-squared
3.6 years ago by
wkreinen50
Germany
wkreinen50 wrote:

Hello,

I have a question about the difference between the linkage disequilibrium measures D' and r-squared. I know the formal definitions, but I have trouble understanding the different concepts behind the two measures. And what does it mean if D' is low and r-squared is high (and vice versa)?

 

Thanks wim

r-squared ld • 22k views
modified 3.6 years ago by Felix Francis450 • written 3.6 years ago by wkreinen50

Yes, sorry. It is not a classical bioinformatics question. Thanks for the Wikipedia link. Yes, I know that material. Unfortunately I did not understand it, and obviously I could not make clear what I did not understand ...

In my understanding of bioinformatics, it is not a fault to try to explain basic conceptual differences that matter at the end of the day. My impression is that D' and r-squared are (quite often) used arbitrarily.

Let me try to ask my question in more detail ...

D' and r-squared are two different (and, among others, popular) ways to normalise D. D' normalises by the theoretical maximum of D given the allele frequencies; r-squared is the squared correlation coefficient between the alleles at the two loci. So far so good ... But I have trouble understanding which conditions should influence the choice of D' or r-squared as a measure of LD. What are the important criteria for choosing one over the other? Maybe this is the wrong forum for a question like this. If so, I apologise. Anyway, I would appreciate it if somebody could give me a productive hint.

All the best

Wim

written 3.6 years ago by wkreinen50

"In my understanding of bioinformatics it is not a fault if one tries to explain some basic conceptual differences that make a difference in the end of the day."

The problem is that the question isn't about bioinformatics at all. It's about population genetics. This group is for bioinformatics questions - issues specifically related to data handling, usage, and interpretation of biological datasets, not necessarily introductory population genetic theory outside the scope of a program. Also, you should consider posting a follow-up as a comment and not an answer to your question.

written 3.6 years ago by Brice Sarver2.5k
3.6 years ago by
Felix Francis450
United States/University of Delaware
Felix Francis450 wrote:

D’ : A scaled version of D (D divided by its maximum possible magnitude given the allele frequencies)

  • Ranges between –1 and +1
  • ±1 implies at least one of the four possible haplotypes was not observed
  • If allele frequencies are similar, high D’ means the markers are good surrogates for each other
  • D’ estimates are inflated in small samples (con)
  • D’ estimates are inflated when one allele is rare (con)

r2 : The squared correlation between the alleles at the two loci. Ranges between 0 and 1. It is the measure preferred by population geneticists

  • 1 when the two markers provide identical information.
  • 0 when they are in perfect equilibrium.
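
For reference, the standard definitions behind the points above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `ld_stats` and its argument names are made up for this example (they are not from any package), and it assumes two biallelic loci with alleles A/a and B/b and known, phased haplotype frequencies.

```python
def ld_stats(x_AB, x_Ab, x_aB, x_ab):
    """Return (D, D', r2) for two biallelic loci.

    The arguments are the four haplotype frequencies (hypothetical
    names chosen for this sketch); they must sum to 1.
    """
    if abs(x_AB + x_Ab + x_aB + x_ab - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")

    p_A = x_AB + x_Ab              # frequency of allele A at locus 1
    p_B = x_AB + x_aB              # frequency of allele B at locus 2
    q_A, q_B = 1.0 - p_A, 1.0 - p_B

    denom = p_A * q_A * p_B * q_B
    if denom == 0:
        raise ValueError("both loci must be polymorphic")

    # D: observed AB frequency minus its expectation under independence
    d = x_AB - p_A * p_B

    # Largest |D| that is possible with these allele frequencies
    if d >= 0:
        d_max = min(p_A * q_B, q_A * p_B)
    else:
        d_max = min(p_A * p_B, q_A * q_B)

    d_prime = d / d_max            # signed, between -1 and +1
    r2 = d * d / denom             # squared allelic correlation, 0 to 1
    return d, d_prime, r2
```

With only the AB and ab haplotypes present at 50% each, `ld_stats(0.5, 0.0, 0.0, 0.5)` gives D' = 1 and r2 = 1; with all four haplotypes at 25%, both measures are 0.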

 

 

modified 3.6 years ago • written 3.6 years ago by Felix Francis450
3.6 years ago by
Brice Sarver2.5k
United States
Brice Sarver2.5k wrote:

This isn't a bioinformatics question (and might be a homework question), but everything you want to know, including derivations and follow-through links for more in-depth explanation, is here.

modified 3.6 years ago • written 3.6 years ago by Brice Sarver2.5k
3.6 years ago by
United Kingdom
GabrielMontenegro420 wrote:

Yes, I don't think this is the right place to ask those questions. However here is my answer.

The big difference between D' and r2 is that a high value of D' does not mean that one locus predicts the other with high accuracy, which can be a major issue when, say, imputing SNPs. An r2 of 1, on the other hand, implies perfect predictability: if we know the allele at one locus, we can predict the allele at the second locus perfectly, and vice versa.
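
A small numerical sketch may make this concrete. The haplotype counts below are invented purely for illustration, and `dprime_and_r2` is a hypothetical helper name, not a function from any library. It also bears on the original question: since r2 can never exceed D' squared, "low D', high r2" cannot occur, whereas "high D', low r2" is common when allele frequencies differ.

```python
def dprime_and_r2(n_AB, n_Ab, n_aB, n_ab):
    """Return (D', r2) from counts of the four haplotypes at two biallelic loci."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n        # frequency of allele A
    p_B = (n_AB + n_aB) / n        # frequency of allele B
    d = n_AB / n - p_A * p_B       # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    r2 = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d / d_max, r2

# Made-up counts in 200 haplotypes.
# Case 1: only AB and ab exist, so each locus fully predicts the other.
perfect = dprime_and_r2(100, 0, 0, 100)

# Case 2: the rare allele A (5%) always sits on B, but B is common (50%).
# The Ab haplotype is missing, so D' is still 1, yet knowing B says
# little about whether A is present, so r2 is small (about 0.05).
rare = dprime_and_r2(10, 0, 90, 100)

print("perfect: D' = %.2f, r2 = %.2f" % perfect)
print("rare:    D' = %.2f, r2 = %.2f" % rare)
```

In the second case a missing haplotype alone drives D' to its maximum, which is why D' by itself is a poor guide to how well one SNP can stand in for another.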

A new and, in my opinion, very good book is The Fundamentals of Modern Statistical Genetics by Laird and Lange. You can find more information there.

 

written 3.6 years ago by GabrielMontenegro420