Question: GWAS: Matched Pairs Logistic Regression
bdeonovic (180), United States, wrote 7.4 years ago:

Logistic regression is a common analysis tool for GWAS when your response variable of interest is qualitative (e.g. case/control status). It comes as one of the standard tools in most GWAS packages (e.g. PLINK).

Most logistic regression models for GWAS would be set up as:

log(odds of disease) = \beta_0 + \beta_1*X

where X is the number of copies of the minor allele for a particular SNP of interest. However, suppose that my case-control data are matched (in my case by age, BMI, reported ethnicity, and distance to procurement site). I don't think standard logistic regression (as I have outlined above) is valid. What does everybody do? I don't see options for this in packages like PLINK.

modified 7.4 years ago by Devon Ryan (98k) • written 7.4 years ago by bdeonovic (180)
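
A minimal sketch of the single-SNP model from the question, assuming additive coding (X = 0, 1, or 2 copies of the minor allele). The genotype counts are made up for illustration, and `fit_logistic` is a hypothetical helper written out with Newton-Raphson, not a call into PLINK or any other GWAS package:

```python
import math

def fit_logistic(x, y, iterations=25):
    """Fit log(odds) = b0 + b1*x by Newton-Raphson; returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Score vector (g0, g1) and information matrix entries (h00, h01, h11)
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta += information^{-1} * score (2x2 inverse written out)
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# (genotype, controls, cases): invented counts, not real data
counts = [(0, 40, 20), (1, 30, 30), (2, 10, 20)]
x, y = [], []
for genotype, n_controls, n_cases in counts:
    x += [genotype] * (n_controls + n_cases)
    y += [0] * n_controls + [1] * n_cases

b0, b1 = fit_logistic(x, y)
odds_ratio = math.exp(b1)  # odds ratio per extra copy of the minor allele
print(f"beta_0 = {b0:.3f}, beta_1 = {b1:.3f}, OR = {odds_ratio:.3f}")
```

Here exp(beta_1) is the estimated odds ratio per additional copy of the minor allele. The fit treats every subject as an independent observation, which is exactly the assumption the question raises for matched data.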

By matching cases and controls by age, BMI, etc., we are just trying to reduce confounding. I think logistic regression is still valid. Also, PLINK can handle other variables as covariates.

written 7.4 years ago by zx8754 (9.9k)

I know the logistic regression model can have other covariates. However, the logistic regression model requires the responses to be independent. If they are matched pairs, the responses are not independent: if I know that this 30-year-old African-American with high BMI has the disease, it changes the probability that another 30-year-old African-American with high BMI has the disease.

written 7.4 years ago by bdeonovic (180)
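
For reference, the within-pair dependence described above is what conditional logistic regression is designed for. It is a standard method for matched case-control data, although nobody in this thread names it. A minimal sketch, assuming 1:1 matching and a single SNP predictor; the pair data are invented and `fit_conditional_logistic` is a hypothetical helper, not part of PLINK:

```python
import math

def fit_conditional_logistic(pairs, iterations=25):
    """Estimate beta for 1:1 matched pairs given (case_x, control_x) tuples.

    Each pair contributes 1 / (1 + exp(-beta * d)) to the conditional
    likelihood, where d = case_x - control_x. The intercept and anything
    shared within a pair cancel out. Assumes at least one pair has d != 0.
    """
    diffs = [case_x - control_x for case_x, control_x in pairs]
    beta = 0.0
    for _ in range(iterations):
        gradient = information = 0.0
        for d in diffs:
            p = 1.0 / (1.0 + math.exp(-beta * d))
            gradient += d * (1.0 - p)
            information += d * d * p * (1.0 - p)
        beta += gradient / information  # Newton-Raphson update
    return beta

# (case genotype, control genotype) per matched pair: invented data
pairs = [(1, 0)] * 20 + [(2, 1)] * 10 + [(0, 1)] * 15 + [(1, 1)] * 20
beta = fit_conditional_logistic(pairs)
print(f"beta_1 = {beta:.3f}, OR = {math.exp(beta):.3f}")
```

Pairs with identical genotypes (d = 0) add nothing to the estimate. In this toy data the odds ratio reduces to the ratio of the two kinds of discordant pairs, 30/15. For real data with extra covariates, an existing implementation such as `clogit` in R's survival package would be the more usual choice.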

How did you end up with matched people? Are they twins? Otherwise I don't understand the point, since pairing them up seems either restrictive (if you can't pair everyone) or inaccurate (if you make some bad pairs). I think the more natural approach is to include the covariates (age, BMI, ethnicity, etc.) in the regression. Then you can use all your samples, without a possibly unnatural pairing. I guess this is basically what zx8754 said...

written 7.4 years ago by matted (7.3k)

They are paired by age, sex, BMI, and ethnicity. You are right, the pairing is not perfect. I did not design the study... I'm just the poor sob who gets to analyze the data.

written 7.4 years ago by bdeonovic (180)
Devon Ryan (98k), Freiburg, Germany, wrote 7.4 years ago:

Well, the logistic regression might be more generally thought of as:

log(odds of disease) = \beta_0 + \beta_1 * X + \beta_2 * BMI + \beta_3 * Ethnicity ...

There can also be interactions, of course. I've never needed to use PLINK, but its documentation suggests that it can handle this sort of model.

modified 7.4 years ago • written 7.4 years ago by Devon Ryan (98k)
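
One way to read the extended model is as a row of predictors per subject. The sketch below assumes ethnicity is a categorical label that gets dummy-coded against a reference level, and that the interaction of interest is X times BMI. The function name, the level labels, and the choice of interaction are illustrative assumptions, not something specified in the answer:

```python
def design_row(allele_count, bmi, ethnicity, levels):
    """Return [1, X, BMI, ethnicity dummies..., X*BMI] for one subject."""
    # The first level is the reference category, so it gets no dummy column
    dummies = [1.0 if ethnicity == level else 0.0 for level in levels[1:]]
    interaction = float(allele_count) * float(bmi)
    return [1.0, float(allele_count), float(bmi)] + dummies + [interaction]

levels = ["A", "B", "C"]  # hypothetical ethnicity labels; "A" is the reference
row = design_row(allele_count=2, bmi=27.5, ethnicity="C", levels=levels)
print(row)
```

Each entry of the row is multiplied by its own coefficient, so a categorical covariate with k levels takes k - 1 coefficients instead of the single \beta_3 written in the shorthand above.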

I know the logistic regression model can have other covariates. However, the logistic regression model requires the responses to be independent. If they are matched pairs, the responses are not independent: if I know that this 30-year-old African-American with high BMI has the disease, it changes the probability that another 30-year-old African-American with high BMI has the disease.

written 7.4 years ago by bdeonovic (180)