Question: logistic regression using HLA allelic data
6 months ago by
monalc4030 wrote:

I have a case-control dataset and I want to perform logistic regression and conditional logistic regression on HLA multi-allelic data, using R. I want to find the effect of specific alleles on a trait. How do I do this, and what format should the data be in? Most examples are based on biallelic SNP data. For instance, at HLA-A I may have up to 30 unique alleles, and at HLA-B it could be 50. Should I recode all the alleles and perform logistic regression on genotype pairs?


If you are merely asking this as a technical question, then you can do this in R via glm(). Your SNP predictors can be encoded categorically as AA, AB, BB, or numerically as minor allele counts (0, 1, 2).
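As a rough illustration of the two encodings, the sketch below fits both to simulated data. The variable names (`snp_count`, `geno`, `dx`), the allele frequency, and the effect size are all made up for the example.

```r
# Simulated biallelic SNP: all values below are assumptions for illustration.
set.seed(1)
n <- 200
snp_count <- rbinom(n, size = 2, prob = 0.3)   # copies of the minor allele B
dx <- rbinom(n, size = 1, prob = plogis(-0.5 + 0.4 * snp_count))

# Categorical encoding: one factor level per genotype, AA as reference
geno <- factor(c("AA", "AB", "BB")[snp_count + 1],
               levels = c("AA", "AB", "BB"))

# Additive model: one slope per extra copy of the minor allele
fit_additive <- glm(dx ~ snp_count, family = "binomial")

# Genotypic model: separate effects for AB and BB relative to AA
fit_genotypic <- glm(dx ~ geno, family = "binomial")

coef(fit_additive)
coef(fit_genotypic)
```

The additive model spends one degree of freedom on the SNP, the genotypic model two, so the latter is more flexible but has less power when the effect really is additive.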


written 6 months ago by Kevin Blighe (60k)
6 months ago by
Lemire570 wrote:

Find a way to produce a data frame containing the count of each allele that you see, along with the case-control status. E.g. (fake data):

> df

  DX DRB1.0401 DRB1.0404 DRB1.0405 DRB1.0408
1  0         0         0         1         1
2  0         0         0         0         2
3  0         0         0         0         2
4  1         1         0         0         1
5  1         1         0         0         1
6  1         0         1         0         1
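One possible way to get from genotype calls to a table like this is sketched below. It assumes the raw data hold one row per individual with the two allele calls in columns `a1` and `a2`; those column names and the `calls` data frame are hypothetical, and the values are chosen only to reproduce the fake data above.

```r
# Hypothetical input: one row per individual, two DRB1 allele calls each.
calls <- data.frame(
  DX = c(0, 0, 0, 1, 1, 1),
  a1 = c("04:05", "04:08", "04:08", "04:01", "04:01", "04:04"),
  a2 = c("04:08", "04:08", "04:08", "04:08", "04:08", "04:08"),
  stringsAsFactors = FALSE
)

# Every allele observed at the locus, in sorted order
alleles <- sort(unique(c(calls$a1, calls$a2)))

# For each allele, count how many copies (0, 1 or 2) each individual carries
counts <- sapply(alleles, function(a) (calls$a1 == a) + (calls$a2 == a))

# Build syntactically valid column names such as DRB1.0401
colnames(counts) <- paste0("DRB1.", sub(":", "", alleles, fixed = TRUE))

df <- data.frame(DX = calls$DX, counts)
df
```

Each row of allele counts sums to 2, which is a quick sanity check that no call was lost in the recoding.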

If you are interested in the effect of a specific allele, then you can do, e.g.

summary(glm(DX ~ DRB1.0401, family = "binomial", data = df))
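The coefficient reported by summary() is on the log-odds scale. A minimal sketch of turning it into an odds ratio with a Wald confidence interval follows. It uses simulated data (the `sim` data frame, sample size, allele frequency, and effect size are all assumptions), because a six-row toy table is too small to give stable estimates.

```r
# Simulated allele counts and case status; all numbers are illustrative.
set.seed(42)
n <- 300
sim <- data.frame(DRB1.0401 = rbinom(n, size = 2, prob = 0.2))
sim$DX <- rbinom(n, size = 1, prob = plogis(-0.3 + 0.5 * sim$DRB1.0401))

fit <- glm(DX ~ DRB1.0401, family = "binomial", data = sim)

# Exponentiate the log-odds coefficients to get odds ratios
or <- exp(coef(fit))

# Wald 95% intervals, also moved to the odds-ratio scale
ci <- exp(confint.default(fit))

res <- cbind(OR = or, ci)
res
```

The `DRB1.0401` row is the odds ratio per additional copy of the allele; the intercept row is the baseline odds and is usually not of interest.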

If you are interested in the effect of your HLA locus as a whole, then you can do, e.g.

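# Full model: all allele counts at the locus
full <- glm(DX ~ DRB1.0401 + DRB1.0404 + DRB1.0405 + DRB1.0408,
            family = "binomial", data = df)
# Null model: intercept only
null <- glm(DX ~ 1, family = "binomial", data = df)

# Likelihood-ratio test of the locus as a whole
anova(null, full, test = "Chisq")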

Add covariates to both models if deemed necessary.
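A sketch of what covariate adjustment might look like is given below, together with a conditional logistic fit, since the question also asks about that. It assumes 1:1 matched case-control pairs identified by a `set` column and uses clogit() from the survival package; the `sim` data frame, the covariates `age` and `sex`, and all frequencies are invented for the example.

```r
library(survival)  # provides clogit() and strata()

# Simulated 1:1 matched pairs; every value here is an assumption.
set.seed(7)
n_sets <- 150
sim <- data.frame(
  set = rep(seq_len(n_sets), each = 2),   # hypothetical matched-set ID
  DX  = rep(c(1, 0), times = n_sets),     # one case and one control per set
  age = round(rnorm(2 * n_sets, mean = 50, sd = 10)),
  sex = rbinom(2 * n_sets, size = 1, prob = 0.5)
)

# Draw two alleles per person; the third category is "any other allele"
draw_counts <- function(dx) {
  p <- if (dx == 1) c(0.30, 0.10, 0.60) else c(0.15, 0.10, 0.75)
  as.vector(rmultinom(1, size = 2, prob = p))[1:2]
}
cnt <- t(sapply(sim$DX, draw_counts))
sim$DRB1.0401 <- cnt[, 1]
sim$DRB1.0404 <- cnt[, 2]

# Unconditional models: the covariates appear in BOTH null and full
null_adj <- glm(DX ~ age + sex, family = "binomial", data = sim)
full_adj <- glm(DX ~ age + sex + DRB1.0401 + DRB1.0404,
                family = "binomial", data = sim)
lrt <- anova(null_adj, full_adj, test = "Chisq")
lrt

# Conditional logistic regression: stratify on the matched set
cfit <- clogit(DX ~ DRB1.0401 + DRB1.0404 + strata(set), data = sim)
summary(cfit)
```

Keeping the same covariates in both models means the likelihood-ratio test isolates the contribution of the allele terms. If the allele columns cover every allele at the locus, the counts sum to 2 for each person, so one column is redundant and R reports an NA coefficient for it; leaving out a reference allele avoids this.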


The problem has been solved, thanks

written 4 months ago by monalc4030
