I'm trying to figure out the best way to account for linkage disequilibrium in a Cox regression, and would really appreciate your advice. I'm testing the effect of a particular allele on overall survival in a small group of patients (N = 100); the allele is encoded as a binary variable for each patient (present or absent, 1 or 0).

I've found that this allele is also in linkage disequilibrium (LD) with two other alleles at the locus, so I've added two more binary variables per patient, one for each of these other alleles. Each of the three alleles is significant in univariate survival analysis, so now I'd like to figure out which allele is driving the signal, given that all three are in LD.

What would be the best approach to disentangle the effects of the 3 alleles? I've seen examples of stepwise Cox regression being used (e.g. https://cran.r-project.org/web/packages/My.stepwise/My.stepwise.pdf), but there also seems to be some literature suggesting that stepwise selection isn't appropriate (e.g. https://www.lexjansen.com/pnwsug/2008/DavidCassell-StoppingStepwise.pdf). Maybe a lasso Cox model would be one alternative?
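For concreteness, here is a minimal sketch of what the lasso Cox alternative might look like. It assumes the `glmnet` package (not something used so far in this post), and it runs on simulated stand-in data, because `testdat` itself isn't shown here: the allele frequencies, LD strength, and effect size below are all made up for illustration.

```r
library(survival)
library(glmnet)

set.seed(1)
n <- 100

# Hypothetical simulated stand-in for testdat: three binary alleles,
# with allele2 and allele3 agreeing with allele1 ~85% of the time (LD).
allele1 <- rbinom(n, 1, 0.4)
allele2 <- ifelse(runif(n) < 0.85, allele1, 1 - allele1)
allele3 <- ifelse(runif(n) < 0.85, allele1, 1 - allele1)

# In this simulation only allele1 affects the hazard (assumed effect size).
OSmonths <- rexp(n, rate = 0.05 * exp(0.8 * allele1))
OScensor <- rbinom(n, 1, 0.7)  # 1 = event observed, 0 = censored
testdat <- data.frame(OSmonths, OScensor, allele1, allele2, allele3)

# glmnet wants a numeric predictor matrix and a two-column (time, status) response.
x <- as.matrix(testdat[, c("allele1", "allele2", "allele3")])
y <- cbind(time = testdat$OSmonths, status = testdat$OScensor)

# alpha = 1 is the lasso penalty; lambda is chosen by 10-fold cross-validation.
cvfit <- cv.glmnet(x, y, family = "cox", alpha = 1, nfolds = 10)

# Coefficients at the CV-selected lambda; alleles shrunk to exactly 0 are dropped.
lasso_coefs <- coef(cvfit, s = "lambda.min")
print(lasso_coefs)
```

One caveat with this kind of sketch: when predictors are strongly correlated, the lasso tends to keep one of them more or less arbitrarily, so the allele that survives the penalty isn't necessarily the causal one.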

As a simple workaround, I tried fitting a multivariable Cox regression with all three alleles and all of their interaction terms, like so:

```
coxph(formula = Surv(OSmonths, OScensor) ~ allele1 + allele2 + allele3 + allele1:allele2 + allele2:allele3 + allele1:allele3 + allele1:allele2:allele3, data = testdat)
```

But when I do this, none of the covariates or their interactions are significant!
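A possible reason is that seven terms built from three correlated binary variables is a lot to estimate with N = 100, so the standard errors blow up. A rough sketch of how this could be checked is below: look at the pairwise correlation between the alleles, then fit an additive model (no interactions) and use likelihood-ratio tests to ask whether each allele adds anything once the other two are in the model. Again this uses simulated stand-in data with made-up parameters rather than the real `testdat`, so it only illustrates the mechanics.

```r
library(survival)

set.seed(1)
n <- 100

# Hypothetical simulated stand-in for testdat (same assumptions as any toy example:
# allele2 and allele3 agree with allele1 ~85% of the time, only allele1 has an effect).
allele1 <- rbinom(n, 1, 0.4)
allele2 <- ifelse(runif(n) < 0.85, allele1, 1 - allele1)
allele3 <- ifelse(runif(n) < 0.85, allele1, 1 - allele1)
OSmonths <- rexp(n, rate = 0.05 * exp(0.8 * allele1))
OScensor <- rbinom(n, 1, 0.7)  # 1 = event observed, 0 = censored
testdat <- data.frame(OSmonths, OScensor, allele1, allele2, allele3)

# Pairwise correlation between the binary allele indicators (a crude LD summary).
ld_cor <- cor(testdat[, c("allele1", "allele2", "allele3")])
print(round(ld_cor, 2))

# Additive model: main effects only, no interaction terms.
fit_add <- coxph(Surv(OSmonths, OScensor) ~ allele1 + allele2 + allele3,
                 data = testdat)

# Likelihood-ratio test for each allele: compare the additive model
# against the same model with that one allele removed.
lrt_list <- list()
for (a in c("allele1", "allele2", "allele3")) {
  reduced <- update(fit_add, as.formula(paste(". ~ . -", a)))
  lrt_list[[a]] <- anova(reduced, fit_add)
  print(lrt_list[[a]])
}
```

If an allele still improves the fit after conditioning on the other two, that would point towards it carrying independent signal, although with LD this strong and this few patients the tests may simply lack power.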

I'd really appreciate any tips. Thanks!

**44k**• written 9 months ago by krc3004 •