Hi all.

After performing a GWAS with the MLMM package (multi-locus mixed linear model, paper1) and obtaining a set of SNPs significantly associated with the phenotype of interest, I am having problems interpreting the effect sizes of the SNPs. As input, the package requires a matrix of SNPs coded as 0, 1, and 2 (plink .raw format). The model follows a stepwise regression process in which it iteratively includes and removes SNPs, finally returning the best model (based on a modified BIC criterion).

When exploring the output of the best model, I find the SNP name, the SNP effect estimate, the p-value, and the t-value among others. However, I fail to understand how to interpret the output. For example, if the estimated effect of SNP "X" is -40, which of the two alleles should I select to improve a quantitative phenotype? Could I take the significant SNPs from the model and run boxplots for each genotype to test this?

P.S. I have been following the code at paper2, where they use the same method, but they do not report the estimated effect of each allele on the phenotype.

Making boxplots for the significant SNPs of interest will give you an idea of the mode of the SNP effects on the quantitative phenotype (additive, dominant, recessive, or overdominant), so your idea is a good one. Additionally, you can fit a linear model (lm) in R for each SNP of interest independently to get more information on the effect (beta).
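
To make the sign of the effect concrete, here is a minimal sketch in plain Python (not R, and not the MLMM code itself). It assumes the reported estimate is the regression coefficient on the 0/1/2 count exactly as coded in your input matrix, and that the matrix came from a plink .raw file, where each SNP column is named `<SNP>_<allele>` and the suffix is the allele being counted. The column name `snpX_A` and all the numbers are made up for illustration. A plain regression like this ignores the kinship correction and the cofactors that MLMM uses, so treat it as a rough check on direction, not a replacement for the model estimate.

```python
from statistics import mean


def counted_allele(column_name):
    """Split a .raw column header such as 'snpX_A' into (SNP id, counted allele)."""
    snp_id, _, allele = column_name.rpartition("_")
    return snp_id, allele


def genotype_means(dosages, phenotypes):
    """Mean phenotype per genotype class (0, 1 or 2 copies of the counted allele)."""
    groups = {}
    for d, y in zip(dosages, phenotypes):
        groups.setdefault(d, []).append(y)
    return {d: mean(ys) for d, ys in sorted(groups.items())}


def additive_effect(dosages, phenotypes):
    """Ordinary least-squares slope of phenotype on allele count (the 'beta')."""
    d_bar = mean(dosages)
    y_bar = mean(phenotypes)
    sxy = sum((d - d_bar) * (y - y_bar) for d, y in zip(dosages, phenotypes))
    sxx = sum((d - d_bar) ** 2 for d in dosages)
    return sxy / sxx


# Hypothetical data: six individuals, one SNP column named 'snpX_A'.
column = "snpX_A"
dosages = [0, 0, 1, 1, 2, 2]
phenotypes = [200.0, 210.0, 165.0, 165.0, 120.0, 130.0]

snp_id, allele = counted_allele(column)
means = genotype_means(dosages, phenotypes)
beta = additive_effect(dosages, phenotypes)

print(f"{snp_id}: counted allele is {allele}")
print("mean phenotype per genotype class:", means)
print("change in phenotype per copy of the counted allele:", beta)
```

Under these assumptions, a negative beta means that each extra copy of the counted allele lowers the phenotype, so to increase the trait you would select the other allele, and to decrease it you would select the counted one. The per-genotype means are the same information a boxplot shows: if the heterozygote mean sits midway between the two homozygotes, the effect looks additive.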