Question: How to input covariates in GEMMA?
9 months ago, maya123z30 wrote:

I'm new to GWAS and I've been trying to perform my analysis based on what's described in this paper, since the nature of my data is similar to theirs. So far I have cleaned my genotype data and used GCTA to derive the top five principal components. Now I'm trying to use GEMMA to fit a linear mixed model with the five principal components included as covariates.

The covariate file is where I'm stuck. The GEMMA manual provides an example on page 14 for five individuals with three covariates. It looks like this:

```
1  1  -1.5
1  2  0.3
1  2  0.6
1  1  -0.8
1  1  2.0
```

However, I'm confused about what the numbers in this example actually mean and how I can derive them. The manual says that the first column of 1s indicates that the intercept should be included, but what do the other two columns mean? The output from GCTA gave me the top five principal components as an "eigenvector" file and an "eigenvalue" file. Which of these should I use to generate the covariate file for GEMMA, and how would I go about doing this?

Edit: I noticed in the manual that you can supply eigenvalue/eigenvector files instead of a relatedness matrix. Is this what they mean by including the top PCs as covariates?

9 months ago, maya123z30 wrote:

I ended up contacting the GEMMA email list directly, so I figured I'd answer my own question in case anyone else runs into this problem down the road. The answer is to start from the eigenvector file that GCTA outputs: first remove columns 1-2 (which contain the family/individual IDs), then add a new first column containing only 1s. This makes the file compatible with GEMMA. Save the result as a .txt file and pass it to GEMMA as your covariates file using the -c option. Hope this is helpful to others!
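
For anyone who would rather script this than edit the file by hand, here is a minimal Python sketch of the conversion described above. It assumes the eigenvector file is whitespace-delimited, has no header line, and has the two ID columns first; check your own file before relying on that. The function name and the optional `fam_ids` argument are my own additions for illustration, not part of GCTA or GEMMA. Since the resulting covariate file contains no IDs, its rows can only be lined up with your samples by position, so `fam_ids` lets you reorder the rows to match the sample order of your genotype/phenotype data if it differs.

```python
from typing import Iterable, List, Optional, Sequence, Tuple


def eigenvec_to_gemma_covariates(
    eigenvec_lines: Iterable[str],
    fam_ids: Optional[Sequence[Tuple[str, str]]] = None,
) -> List[str]:
    """Turn eigenvector rows (ID1 ID2 PC1 PC2 ...) into covariate rows (1 PC1 PC2 ...).

    Assumption: no header line, whitespace-delimited, two ID columns first.
    fam_ids (hypothetical helper argument): optional list of (ID1, ID2) pairs
    giving the sample order to emit; by default the input order is kept.
    """
    pcs_by_id = {}
    order = []
    for line in eigenvec_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        sample_id = (fields[0], fields[1])
        pcs_by_id[sample_id] = fields[2:]  # drop the two ID columns
        order.append(sample_id)

    if fam_ids is not None:
        order = list(fam_ids)  # a KeyError below means a sample has no PCs

    # Prepend the intercept column of 1s to every row.
    return [" ".join(["1"] + pcs_by_id[sample_id]) for sample_id in order]


def convert_file(eigenvec_path: str, covariate_path: str) -> None:
    """Read an eigenvector file and write the covariate file as plain text."""
    with open(eigenvec_path) as src:
        rows = eigenvec_to_gemma_covariates(src)
    with open(covariate_path, "w") as dst:
        dst.write("\n".join(rows) + "\n")
```

Calling `convert_file("mydata.eigenvec", "covariates.txt")` (both file names are placeholders) would write a file you can then pass to GEMMA with the -c option.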

If so, I was wondering how the sample IDs in your eigenvector file are matched up in the downstream GEMMA analysis. In other words, how does GEMMA recognize that the row order corresponds to the sample-wise relatedness? I ask partly because I lack deep insight into GEMMA's internal implementation. Thanks!