Question

Cox Regression and Linkage Disequilibrium

5.7 years ago

krc3004 ▴ 20

I'm trying to figure out the best way to account for linkage disequilibrium in a Cox regression, and would really appreciate your advice. I'm testing the effect of a particular allele on overall survival in a small group of patients (N = 100); the allele is encoded as a binary variable for each patient (present or absent, 1 or 0).

I've found that this allele is in linkage disequilibrium (LD) with two other alleles at the locus, so I've added one more binary variable per patient for each of these other alleles. Each of the three alleles is significant in univariate survival analysis, so now I'd like to work out which allele is driving the signal, given that all three are in LD.
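Before modelling, it can help to quantify how strongly the three carrier indicators track each other. The sketch below uses simulated data (the `flip` helper and all values are hypothetical, not taken from the post) and reports the squared Pearson correlation between the 0/1 variables. Note that this is correlation of carrier status, which is only a rough stand-in for haplotype-based LD r².

```r
# Simulated carrier indicators (hypothetical data for illustration only)
set.seed(1)
n <- 100
allele1 <- rbinom(n, 1, 0.3)

# Hypothetical helper: flip each 0/1 value with probability p,
# so allele2 and allele3 mostly copy allele1 (mimicking LD)
flip <- function(x, p) ifelse(runif(length(x)) < p, 1 - x, x)
allele2 <- flip(allele1, 0.10)
allele3 <- flip(allele1, 0.20)
alleles <- cbind(allele1, allele2, allele3)

# Squared Pearson correlation between each pair of carrier indicators
r2 <- cor(alleles)^2
print(round(r2, 2))

# Cross-tabulation: how many patients carry one allele without the other
print(table(allele1 = allele1, allele2 = allele2))
```

If very few patients are discordant for a pair of alleles, the data contain little information for separating their effects, whatever model is used.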

What would be the best approach to disentangle the effects of the 3 alleles? I've seen examples of stepwise Cox regression (e.g. https://cran.r-project.org/web/packages/My.stepwise/My.stepwise.pdf), but there also seems to be some literature suggesting that stepwise selection isn't appropriate (e.g. https://www.lexjansen.com/pnwsug/2008/DavidCassell-StoppingStepwise.pdf). Maybe a lasso Cox model would be one alternative?
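For reference, a lasso-penalised Cox model can be fitted with the glmnet package. This is a minimal sketch on simulated data in which only allele1 has a true effect. The data, the effect size, and the assumption that `OScensor` is coded 1 = death observed are all hypothetical.

```r
library(glmnet)

set.seed(42)
n <- 100
allele1 <- rbinom(n, 1, 0.3)
flip <- function(x, p) ifelse(runif(length(x)) < p, 1 - x, x)  # hypothetical helper
allele2 <- flip(allele1, 0.10)
allele3 <- flip(allele1, 0.20)

# Simulated survival: only allele1 changes the hazard (assumed log hazard ratio 0.8)
death <- rexp(n, rate = 0.05 * exp(0.8 * allele1))
cens  <- rexp(n, rate = 0.02)
OSmonths <- pmin(death, cens)
OScensor <- as.integer(death <= cens)  # assumed coding: 1 = event, 0 = censored

x <- cbind(allele1, allele2, allele3)
# glmnet's Cox family accepts a two-column matrix named "time" and "status"
y <- cbind(time = OSmonths, status = OScensor)

# alpha = 1 is the lasso penalty; lambda is chosen by 5-fold cross-validation
cvfit <- cv.glmnet(x, y, family = "cox", alpha = 1, nfolds = 5)
lasso_coef <- as.matrix(coef(cvfit, s = "lambda.min"))
print(lasso_coef)
```

One caveat: with strongly correlated predictors the lasso tends to keep one of them somewhat arbitrarily, so a non-zero coefficient here is not proof that the allele is causal.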

As a simple workaround, I tried fitting a multivariable Cox regression with all three alleles and all of their interaction terms, like so:

coxph(formula = Surv(OSmonths, OScensor) ~ allele1 + allele2 + allele3 + allele1:allele2 + allele2:allele3 + allele1:allele3 + allele1:allele2:allele3, data = testdat)

But when I do this, none of the covariates or their interactions are significant!
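A plausible contributor is that seven correlated terms are being estimated from 100 patients, which inflates the standard errors of the individual Wald tests. One simpler check is to fit an additive model and ask, by likelihood-ratio test, whether each allele still adds information once the other two are in the model. The sketch below uses simulated data. The data frame contents and the 1 = event coding of `OScensor` are assumptions, not details from the post.

```r
library(survival)

set.seed(7)
n <- 100
allele1 <- rbinom(n, 1, 0.3)
flip <- function(x, p) ifelse(runif(length(x)) < p, 1 - x, x)  # hypothetical helper
allele2 <- flip(allele1, 0.10)
allele3 <- flip(allele1, 0.20)
death <- rexp(n, rate = 0.05 * exp(0.8 * allele1))  # only allele1 has an effect
cens  <- rexp(n, rate = 0.02)
testdat <- data.frame(
  OSmonths = pmin(death, cens),
  OScensor = as.integer(death <= cens),  # assumed coding: 1 = event
  allele1, allele2, allele3
)

# Additive model: three parameters instead of seven
full <- coxph(Surv(OSmonths, OScensor) ~ allele1 + allele2 + allele3,
              data = testdat)

# Drop one allele at a time and compare partial log-likelihoods (1 df each)
lrt_p <- sapply(c("allele1", "allele2", "allele3"), function(a) {
  reduced <- update(full, as.formula(paste(". ~ . -", a)))
  stat <- 2 * (full$loglik[2] - reduced$loglik[2])
  pchisq(stat, df = 1, lower.tail = FALSE)
})
print(round(lrt_p, 4))
```

An allele with a small p-value here improves the fit even after conditioning on the other two. If none does, the sample may simply be too small to separate alleles in strong LD.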

I'd really appreciate any tips. Thanks!

survival cox linkage disequilibrium R SNP • 1.3k views

updated 5.7 years ago by Kevin Blighe 87k • written 5.7 years ago by krc3004 ▴ 20

score 2 · Answer 1 · 2018-09-03

Hey,

What you need (I believe) is GCTA's COJO (conditional and joint) analysis, which, in a nutshell, identifies independent SNP signals while accounting for linkage disequilibrium (LD): http://cnsgenomics.com/software/gcta/

Keep in mind that the literature will always contradict itself, a growing problem now that there are bogus journals out there with no interest in good science. That said, even the top journals sometimes publish erroneous findings.

Kevin