Why switch negative values in kinship matrix to zeros?
5.2 years ago
rimgubaev ▴ 330

I'm quite new to population genetics. I recently looked at the papers below and found that negative values in the kinship matrix are set to zero.

https://www.nature.com/articles/srep41561

The relative kinship matrix comparing all pairs of accessions was calculated using the software package SPAGeDi. Negative values between two individuals were set to 0

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5405135/pdf/fpls-08-00593.pdf

The relative kinship matrix (K matrix) was obtained using SPAGeDi , and negative values between two rapeseed accessions were set to 0.

Both articles then cite "A unified mixed-model method for association mapping that accounts for multiple levels of relatedness" by Yu et al., which in turn cites "SPAGeDi: A versatile computer program to analyze spatial genetic structure at the individual or population levels" by Hardy and Vekemans.

This is a bit confusing, and I still can't find an explanation.

Is this just a common convention? And if I use TASSEL to build the kinship matrix with the Centered IBS method, which produces negative values, should I set them to zero too?

kinship population genetics TASSEL SPAGEDI • 3.1k views
5.2 years ago
Asaf 10k

See this excellent blog post: https://brainder.org/2015/07/29/understanding-the-kinship-matrix/

The values in the kinship matrix represent how related two individuals are, or how much genetic influence we can expect one individual to have on the other. A negative value would mean that one individual has an inverse genetic influence on the other, which is not biological: two individuals can be related or unrelated, but not inversely (?) related.


Thanks! But it is still not clear why the kinship matrix produced by TASSEL contains negative values, or why the values are not in the range 0 to 1, i.e. some numbers are actually larger than 1. It would be great if someone could explain this. Unfortunately, I didn't find a relevant explanation in the TASSEL manual.


I did some more reading and felt I should update my answer. I found this paragraph in the SPAGeDi manual:

A kinship coefficient (F) is often defined as the probability of identity by descent of the gene copies compared (e.g. Ritland 1996) but estimators based on genetic markers actually estimate a “relative kinship”, that can be defined as ratios of differences of probabilities of identity in state (Rousset 2002; Vekemans & Hardy 2004). Thus, equating these kinship coefficients with probability of identity by descent is not true in general (Rousset 2002). In the case of two individuals i and j, the kinship coefficient between them can be defined as Fij = (Qij - Qm)/(1 - Qm), where Qij is the probability of identity in state for random gene copies from i and j, and Qm is the average probability of identity by state for gene copies coming from random individuals from the sample (i.e. “reference population” = sample). As defined here, kinship is not really a population genetics parameter as it depends on an arbitrary sample. Note also that with this definition, negative relative kinship coefficients naturally occur between some individuals, it simply means that these are less related than random individuals (a definition equating kinship and probability of identity by descent would not allow negative values).
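To make the definition in that paragraph concrete, here is a minimal Python sketch of Fij = (Qij - Qm)/(1 - Qm). The identity-in-state values in `Q` are made up for illustration, and taking Qm as the plain mean over all distinct pairs is my assumption about what "random individuals from the sample" means; this is not SPAGeDi's actual estimator.

```python
import numpy as np

# Hypothetical identity-in-state probabilities Q[i, j] for four individuals.
# These numbers are invented for illustration, not taken from real data.
Q = np.array([
    [1.0, 0.8, 0.5, 0.4],
    [0.8, 1.0, 0.5, 0.4],
    [0.5, 0.5, 1.0, 0.7],
    [0.4, 0.4, 0.7, 1.0],
])

# Assumption: Qm is the mean of Qij over all distinct pairs (i < j).
pairs = np.triu_indices_from(Q, k=1)
Qm = Q[pairs].mean()

# Relative kinship as defined in the quoted paragraph.
# Only the off-diagonal entries are meaningful in this toy example.
F = (Q - Qm) / (1.0 - Qm)

for i, j in zip(*pairs):
    print(f"F[{i},{j}] = {F[i, j]:+.3f}")
```

Because Qm is the sample average, the pairwise values are centered on zero, so any pair that shares less than the average pair necessarily gets a negative value.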

So there are two terms that are sometimes used without distinction: kinship as a probability of identity by descent, and marker-based relative kinship. The procedure of setting negative values to zero is stated explicitly in the Yu et al. mixed-model paper:

Negative values between individuals were set to 0, as this indicates that they are less related than random individuals

The developers of EMMA say it's a bad idea because the kinship matrix "may not be positive semidefinite and thus might not be a valid form of variance component"
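The positive semidefiniteness concern can be checked numerically. Below is a small Python sketch on a toy matrix, not on real genotype data: `K` is built as a Gram matrix (so it is positive semidefinite by construction) from four unit vectors that I chose purely for illustration, and the helper names `min_eigenvalue` and `is_psd` are my own.

```python
import numpy as np

def min_eigenvalue(K):
    """Smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(K).min())

def is_psd(K, tol=1e-8):
    """True if all eigenvalues are >= -tol (tolerance absorbs rounding error)."""
    return min_eigenvalue(K) >= -tol

# Toy example: four unit vectors in a plane at 0, 45, 90 and 135 degrees.
# K = V V^T is a Gram matrix, hence positive semidefinite.
angles = np.deg2rad([0.0, 45.0, 90.0, 135.0])
V = np.column_stack([np.cos(angles), np.sin(angles)])
K = V @ V.T

# Set negative entries to zero, as described in the Yu et al. quote.
K_clamped = np.clip(K, 0.0, None)

print("K[0, 3] before clamping:", K[0, 3])
print("K is PSD:", is_psd(K))
print("clamped K is PSD:", is_psd(K_clamped))
print("smallest eigenvalue after clamping:", min_eigenvalue(K_clamped))
```

In this toy case a single negative entry is zeroed and the matrix stops being positive semidefinite. Whether that happens for a given real kinship matrix depends on the data, so it is worth running the same eigenvalue check on your own matrix after clamping.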