Question: Looking for software to run a multivariate linear mixed model GWAS on a Principal Component Analysis with a bimodal distribution
r.harrington wrote (14 months ago):

Hey guys,

I'm having issues trying to run a multivariate linear mixed model (LMM) genome-wide association study (GWAS).

My phenotype input files & genotyping files are fine, but when I tried to run the GWAS in GEMMA, the program stalled.

I have no issue running the GWAS on individual phenotypes, but I've been told that a multivariate analysis in GEMMA won't work because my input phenotype values are bimodal, and apparently GEMMA can only run a multivariate GWAS on normally distributed data.

Here is the command I ran in GEMMA:

gemma -bfile multivariate_genotypes -k output/kinship.cXX.txt -n 2 3 -lmm -o multivate_PC1+PC2_lmm_20JUN2017

Please let me know if I'm using GEMMA incorrectly, or if you can suggest more appropriate software.

sa.kornilov (United States / University of Houston / Houston, TX) wrote (3 months ago):

The problem may not be GEMMA so much as the linear model itself. GEMMA's manual (4.2.3?) refers both to the linear model's robustness to model misspecification and to its ability to model binary phenotypes, and the authors explain further in:

Xiang Zhou, Peter Carbonetto, and Matthew Stephens. Polygenic modelling with Bayesian sparse linear mixed models. PLoS Genetics, 9:e1003264, 2013.

I am not sure how robust the multivariate version would be to model misspecification, even if you recoded the bimodal distribution as a categorical/dichotomous DV. You could derive that recoding with a data-driven approach, maybe even latent cluster/profile modeling, which would also answer questions about the multivariate dimensionality of your data. That said, this is partly an empirical question, so I would run both analyses using the linear model. Why is the mixed aspect of particular interest, as opposed to, say, categorical regression with PCs as covariates? Is it because of relatedness?
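
As a rough illustration of the data-driven dichotomization idea, here is a minimal Python sketch. It uses a plain 1-D two-means split as a simple stand-in for proper latent cluster/profile modeling, so treat it as a starting point only. The function name `two_means_split` and the example PC1 scores are made up for illustration and are not taken from the original data.

```python
from statistics import mean


def two_means_split(values, max_iter=100):
    """Split 1-D values into two groups with a basic two-means loop.

    Returns (threshold, labels), where labels[i] is 1 if values[i] lies
    above the threshold and 0 otherwise.
    """
    if len(set(values)) < 2:
        raise ValueError("need at least two distinct values")

    # Start the two group centres at the extremes of the data.
    lo, hi = min(values), max(values)
    for _ in range(max_iter):
        threshold = (lo + hi) / 2
        low = [v for v in values if v <= threshold]
        high = [v for v in values if v > threshold]
        new_lo, new_hi = mean(low), mean(high)
        if new_lo == lo and new_hi == hi:
            break  # centres stopped moving, so the split is stable
        lo, hi = new_lo, new_hi

    threshold = (lo + hi) / 2
    labels = [1 if v > threshold else 0 for v in values]
    return threshold, labels


# Hypothetical PC1 scores with two clearly separated modes.
pc1 = [-2.1, -1.9, -2.0, 1.8, 2.2, 2.0]
threshold, labels = two_means_split(pc1)
print(threshold, labels)
```

The resulting 0/1 labels could then be written out as a phenotype column and analyzed alongside the original continuous scores, so the two sets of results can be compared.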

Anyway, do share your results! (I realize this question is 11 months old).
