Question: Chromosome X GWAS in related individuals
10 months ago by
London, UK
alesssia wrote:

Dear all,

I would like to include the X Chromosome in a GWAS of a continuous trait which has been measured in a dataset of related individuals (same-sex twins, ~22% males, 78% females).

I had a look at the literature, but most of the approaches presented seem to focus on case-control studies rather than on continuous traits. While some studies use PLINK, I am not sure it models X-inactivation correctly. Moreover, PLINK does not handle the presence of related individuals. Usually, we take inter-individual kinship into account by using LMMs, as implemented in GEMMA. However, I do not understand whether X-inactivation is or can be modelled in GEMMA.

Does anyone have any suggestions, or know of any paper dealing with this?

Thank you very much in advance!

modified 10 months ago by Kevin Blighe • written 10 months ago by alesssia

PLINK's X-inactivation model is controlled by the --xchr-model flag. "--xchr-model 2" codes male genotypes on the same 0/2 scale as homozygous females, and is the default in PLINK 2.0.

With that said, PLINK doesn't currently have an LMM implementation.
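To make the male coding above concrete, here is a minimal Python sketch of how a non-pseudoautosomal chrX genotype could be turned into an additive dosage. It is not PLINK's code: the function name `xchr_dosage` and the parameter `male_scale` are hypothetical, chosen only for illustration. Coding males as 0/2 matches the assumption of full X-inactivation in females, while 0/1 treats one male allele like one female allele.

```python
def xchr_dosage(alt_allele_count, is_male, male_scale=2):
    """Return an additive dosage for one non-PAR chrX genotype.

    alt_allele_count: 0, 1 or 2 for females; 0 or 1 for hemizygous males.
    male_scale: 2 codes males as 0/2 (like homozygous females),
                1 codes males as 0/1.
    Illustrative helper only; names are assumptions, not a PLINK API.
    """
    if male_scale not in (1, 2):
        raise ValueError("male_scale must be 1 or 2")
    if is_male:
        if alt_allele_count not in (0, 1):
            raise ValueError("males carry 0 or 1 alt alleles on non-PAR chrX")
        # One hemizygous alt allele is scaled to match the chosen model.
        return float(alt_allele_count * male_scale)
    if alt_allele_count not in (0, 1, 2):
        raise ValueError("females carry 0, 1 or 2 alt alleles")
    return float(alt_allele_count)


# Same male genotype under the two codings.
male_as_homozygous_female = xchr_dosage(1, is_male=True, male_scale=2)
male_on_unit_scale = xchr_dosage(1, is_male=True, male_scale=1)
```

Whichever coding is chosen, sex would normally also enter the model as a covariate, since the two sexes are on different dosage distributions.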

written 10 months ago by chrchang523
10 months ago by
Kevin Blighe wrote:

Hey alessia, I am not aware of any program that does this either. I have heard that Clayton's method is used for modeling X-inactivation, but I am not sure which programs implement it. I am also not sure that PLINK can do what you need.

We had a somewhat similar issue in the past where we wanted to run GWAS over a large trios dataset and use interaction terms - there was nothing out there that could really do this. We eventually decided to do it ourselves in R, and I developed some parallelised code that could do the entire GWAS in a few days using ~64 CPU cores. This eventually became a Bioconductor package:

We exported the data from PLINK, with alleles encoded as 1, 2, 3, 4. I believe we used MAF tallies on a continuous scale, and had other covariates such as smoking status, cockroach exposure, and PC1/PC2 to control for population stratification. To control for family, we used conditional logistic regression with family as the matching stratum. So, the model was something like:

library(survival)  # provides clogit() and strata()
clogit(Outcome ~ SNP:SmokingStatus + PC1 + PC2 + strata(family))
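To show what stratifying on family does, here is a minimal pure-Python sketch of the exact conditional log-likelihood that a conditional logistic regression maximises. It is an illustration under assumptions, not the code of the survival package or of our pipeline: the function name `conditional_loglik` and the data layout (one list per family, each member a pair of binary outcome and covariate list, e.g. SNP x smoking, PC1, PC2) are hypothetical.

```python
import math
from itertools import combinations


def conditional_loglik(beta, strata):
    """Exact conditional log-likelihood summed over strata (families).

    beta:   list of coefficients.
    strata: list of families; each family is a list of
            (outcome, covariates) with outcome in {0, 1}.
    Layout and names are assumptions made for this sketch.
    """
    total = 0.0
    for members in strata:
        # Linear predictor for every family member (no intercept).
        scores = [sum(b * x for b, x in zip(beta, xs)) for _, xs in members]
        n_cases = sum(y for y, _ in members)
        observed = sum(s for s, (y, _) in zip(scores, members) if y == 1)
        # Denominator: every way of assigning n_cases cases within the family.
        terms = [sum(subset) for subset in combinations(scores, n_cases)]
        top = max(terms)
        log_denom = top + math.log(sum(math.exp(t - top) for t in terms))
        total += observed - log_denom
    return total


# Two discordant twin pairs and one concordant pair (made-up numbers).
example_strata = [
    [(1, [1.0, 0.1, 0.0]), (0, [0.0, 0.1, 0.0])],
    [(0, [2.0, -0.3, 0.2]), (1, [1.0, -0.3, 0.2])],
    [(1, [0.0, 0.4, 0.1]), (1, [1.0, 0.4, 0.1])],
]
null_loglik = conditional_loglik([0.0, 0.0, 0.0], example_strata)
```

Anything that is constant within a family adds the same amount to the numerator and the denominator and cancels, which is why the family effect never has to be estimated. Families whose members all share the same outcome contribute nothing to the likelihood.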

Not a direct answer, but my answer, nevertheless.


modified 10 months ago • written 10 months ago by Kevin Blighe