Question: Why is the main diagonal of the GEMMA relatedness matrix not equal to 1?
11 months ago, Johan Zicola wrote:

I am running GWAS analyses with GEMMA that seem to work, but when I open the centered relatedness matrix (.cXX), I get a symmetric matrix whose main diagonal is not made of 1s. Why?

The manual (p. 11, section 3.3.1) shows an example matrix whose main diagonal is not made of 1s. I also tried to dig into the GEMMA code on GitHub, but I could not find the code that calculates the relatedness matrix.

Commands and output

Plink and gemma commands to generate the matrix:

# Generate plink file from my VCF file (67 samples)
vcftools --gzvcf $vcf_file --plink --out $prefix_vcf
# Make binary plink files
plink --file $prefix_vcf --make-bed --out $prefix_vcf
# Generate centered relatedness matrix with gemma
gemma -bfile $prefix_vcf -gk 1 -o $prefix_vcf

Here is the output log of gemma:

## GEMMA Version = 0.94
## Command Line Input = -bfile subset_67_accessions_wo_singletons_only_alt_allele -gk 1 -o subset_67_accessions_wo_singletons_only_alt_allele
## Summary Statistics:
## number of total individuals = 67
## number of analyzed individuals = 67
## number of covariates = 1
## number of phenotypes = 1
## number of total SNPs = 927314
## number of analyzed SNPs = 39678
## Computation Time:
## total computation time = 0.123833 min
## computation time break down:
##      time on calculating relatedness matrix = 0.00733333 min

As expected, I get a square matrix of order 67 (the number of samples in my VCF file). Here are the first 5 rows and columns of the matrix:

0.5265889259    0.0918289748    0.01315201643    -0.02417258371    -0.00300349980
0.0918289748    0.4426259465    0.01751500729    -0.0196104412    -0.04093555612
0.01315201643    0.01751500729    0.494663962    0.01773204554    0.03579284737
-0.02417258371    -0.0196104412    0.01773204554    0.5622735808    0.05660374681
-0.003003499807    -0.04093555612    0.03579284737    0.05660374681    0.6064086438

The matrix is symmetric, but its main diagonal is not made of 1s, even though each individual should be identical to itself, shouldn't it?
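Both observations (symmetry, and a diagonal well below 1) can be checked with a few lines of script. The following is only a minimal sketch in plain Python: it assumes the matrix is whitespace-delimited text, as in the excerpt above, and works on that pasted 5 x 5 excerpt rather than on the full file. The names `parse_matrix` and `is_symmetric` are made up for illustration.

```python
# The 5 x 5 excerpt from the post, pasted as-is.
EXCERPT = """\
0.5265889259    0.0918289748    0.01315201643    -0.02417258371    -0.00300349980
0.0918289748    0.4426259465    0.01751500729    -0.0196104412    -0.04093555612
0.01315201643    0.01751500729    0.494663962    0.01773204554    0.03579284737
-0.02417258371    -0.0196104412    0.01773204554    0.5622735808    0.05660374681
-0.003003499807    -0.04093555612    0.03579284737    0.05660374681    0.6064086438
"""


def parse_matrix(text):
    """Parse whitespace-delimited rows of numbers into a list of lists."""
    return [[float(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]


def is_symmetric(matrix, tol=1e-8):
    """True if the matrix is square and K[i][j] == K[j][i] within tol.

    A tolerance is used because the two printed copies of an entry
    can differ in their last digits.
    """
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        return False
    return all(abs(matrix[i][j] - matrix[j][i]) <= tol
               for i in range(n) for j in range(i))


K = parse_matrix(EXCERPT)
diagonal = [K[i][i] for i in range(len(K))]

print("symmetric:", is_symmetric(K))
print("diagonal:", diagonal)
print("mean of diagonal:", sum(diagonal) / len(diagonal))
```

To run the same check on a full matrix, the text of the file would be read and passed to `parse_matrix` instead of `EXCERPT`.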

NB: I posted a similar question on the gemma-discussion Google group but have not received an answer after a month.


I have not used GEMMA; however, I don't recall the other relatedness metrics I have used equaling 1 when a person is compared to themselves. I believe it is technically an unbounded scale that can be both positive and negative, and which may be based on positions that are genotyped in one individual in your dataset as well as on positions that are only genotyped in others. It is probably better to hear from the developers of the program, as they may be the only ones who know how it was coded.

Reply written 11 months ago by Kevin Blighe

Thanks for your comment. Looking around, I found that the main diagonal is usually made of 1s, or of values above 1 in case of consanguinity. Figure 1 of Bae et al. 2014 illustrates well what I expect from a kinship matrix. Since an individual compared to itself is based on the same set of SNP calls, it seems unlikely to me that the deviations from 1 are due to missing data.

Reply written 11 months ago by Johan Zicola (modified 11 months ago)
10 weeks ago, Johan Zicola wrote:

I got an answer from Xiang Zhou, the creator of GEMMA:

The centered relatedness matrix does not scale each column of the genotype matrix, so the main diagonal of the resulting relatedness matrix will not be close to 1s (usually around 0.3 depending on maf etc.). If you use the scaled relatedness matrix (.sXX), then the main diagonal should be made of values that are very close to 1s.
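To make the difference concrete, here is a small sketch in plain Python. It is not GEMMA's code: it assumes the centered matrix is the average over SNPs of the outer product of mean-centered genotypes, and that the scaled (standardized) matrix additionally divides each SNP's contribution by that SNP's sample variance, which is how I read the manual. The genotypes are simulated, monomorphic SNPs are skipped, and the function name `relatedness` is made up for illustration. In GEMMA itself the scaled matrix is requested with `-gk 2` instead of `-gk 1`.

```python
import random


def relatedness(genotypes, scale=False):
    """Relatedness matrix from a list of SNPs.

    genotypes: list of SNP rows, each a list of allele counts (0/1/2),
               one entry per individual.
    scale:     False -> centered matrix (only the SNP mean is subtracted)
               True  -> scaled matrix (each SNP is also divided by its variance)
    """
    n = len(genotypes[0])
    K = [[0.0] * n for _ in range(n)]
    used = 0
    for snp in genotypes:
        mean = sum(snp) / n
        centered = [g - mean for g in snp]
        var = sum(c * c for c in centered) / n
        if var == 0.0:
            continue  # assumption: monomorphic SNPs are dropped
        weight = 1.0 / var if scale else 1.0
        for a in range(n):
            for b in range(n):
                K[a][b] += weight * centered[a] * centered[b]
        used += 1
    return [[value / used for value in row] for row in K]


# Simulated data: 20 diploid individuals, 500 SNPs, allele frequency 0.05-0.5.
random.seed(1)
n_ind, n_snp = 20, 500
genotypes = []
for _ in range(n_snp):
    freq = random.uniform(0.05, 0.5)
    genotypes.append([(random.random() < freq) + (random.random() < freq)
                      for _ in range(n_ind)])

K_centered = relatedness(genotypes, scale=False)
K_scaled = relatedness(genotypes, scale=True)

mean_diag_centered = sum(K_centered[i][i] for i in range(n_ind)) / n_ind
mean_diag_scaled = sum(K_scaled[i][i] for i in range(n_ind)) / n_ind

print("mean diagonal, centered:", mean_diag_centered)
print("mean diagonal, scaled:  ", mean_diag_scaled)
```

Under these assumptions the mean diagonal of the centered matrix equals the average per-SNP genotype variance, which depends on allele frequencies and stays below 1, while the mean diagonal of the scaled matrix is 1 by construction. Individual diagonal entries of the scaled matrix still vary around 1.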


Cool - thanks for coming back to post this.

Reply written 10 weeks ago by Kevin Blighe
9 months ago, sgalla32 wrote:

The diagonals here remind me more of a kinship matrix (self-kinship = 0.5) than a relatedness matrix (self-relatedness = 1). I haven't used Gemma before, but other relatedness programmes (like the R-programme KGD) based on high-throughput sequencing data have given me self-relatedness values that are greater or less than 1, for the reasons that Kevin and Johan point out. You could always scale your entire matrix so that self-relatedness is 1 (which will scale all your other relatedness values).
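Such a rescaling could look roughly like the sketch below, written in plain Python on the 5 x 5 excerpt from the question. Two variants are shown, both hypothetical helpers rather than anything GEMMA provides: dividing the whole matrix by its mean diagonal, so that self-relatedness is 1 on average, and dividing each entry by the square roots of the two corresponding diagonal values, so that every diagonal entry is exactly 1. Note that neither is the same as a matrix computed from per-SNP standardized genotypes.

```python
import math

# First 5 rows and columns of the .cXX matrix, as posted in the question.
K = [
    [0.5265889259, 0.0918289748, 0.01315201643, -0.02417258371, -0.00300349980],
    [0.0918289748, 0.4426259465, 0.01751500729, -0.0196104412, -0.04093555612],
    [0.01315201643, 0.01751500729, 0.494663962, 0.01773204554, 0.03579284737],
    [-0.02417258371, -0.0196104412, 0.01773204554, 0.5622735808, 0.05660374681],
    [-0.003003499807, -0.04093555612, 0.03579284737, 0.05660374681, 0.6064086438],
]


def rescale_by_mean_diagonal(matrix):
    """Divide every entry by the mean of the diagonal (one global factor)."""
    n = len(matrix)
    mean_diag = sum(matrix[i][i] for i in range(n)) / n
    return [[value / mean_diag for value in row] for row in matrix]


def rescale_to_unit_diagonal(matrix):
    """Divide entry (i, j) by sqrt(K[i][i] * K[j][j]), like covariance -> correlation."""
    n = len(matrix)
    root = [math.sqrt(matrix[i][i]) for i in range(n)]
    return [[matrix[i][j] / (root[i] * root[j]) for j in range(n)]
            for i in range(n)]


K_global = rescale_by_mean_diagonal(K)
K_unit = rescale_to_unit_diagonal(K)

print("diagonal after global rescaling:", [K_global[i][i] for i in range(5)])
print("diagonal after unit rescaling:  ", [K_unit[i][i] for i in range(5)])
```

The global variant keeps the relative differences between individuals on the diagonal; the unit-diagonal variant removes them.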


I thought the terms kinship matrix and relatedness matrix could be used interchangeably. Gemma documentation should be updated to use the accurate term. I understood that a value >0.5 (self-kinship) or >1 (self-relatedness) could be explained by consanguinity but I still don't understand why I have values <0.5 in the main diagonal of the matrix.

Reply written 9 months ago by Johan Zicola (modified 9 months ago)