Question: edgeR DE; choosing right coefficient from design matrix with many covariates
0
7 months ago by
Cece0
United States/Houston
Cece0 wrote:

Hi,

I'm trying to run a DE analysis in edgeR and I'm getting confused trying to choose the right coefficient from the model matrix. Group is a factor made up of cases and controls (disease analysis), Gender is a factor (Male and Female), and Age and Cov1 are continuous. My design looks like:

design <- model.matrix(~ Age + Gender + Cov1 + group + group*Cov1, data=covmerge)

covmerge is a file listing my covariates in columns by each sampleID (rows):

``````SampleID Cov1       Cov2        Cov3        Cov4       Age  Gender
sam1    0.201249    0.065659    0.101665    0.188839    22  Female
sam2    0.271327    0.000000    0.230564    0.125517    16  Male
sam3    0.000000    0.053912    0.037979    0.119295    20  Male
sam4    0.246016    0.000000    0.216258    0.201482    14  Male
sam5    0.421702    0.000000    0.279747    0.154855    15  Male
sam6    0.256205    0.000000    0.201658    0.268092    13  Male
``````

and group is part of y\$samples\$group so that levels(y\$samples\$group) "control" "case"

I set my reference as controls: y\$samples\$group <- relevel(y\$samples\$group, ref = "control") but this doesn't appear to be under my control in the sense that no matter whether I set a reference or not, my design coefficients only reflects either groupcase or groupcontrol but not both. This is irrespective of whether I include the intercept.

colnames(design)

[1] "(Intercept)" "Age" "GenderMale"
[4] "Cov1 " "groupcase" "Cov1 :groupcase"

I normalized y and estimated dispersions then carried out my lrt. I wanted to run 2 analyses separately from this design:

1. Simply comparing cases and controls, which I thought would be lrt <- glmLRT(fit, coef = 5), but this must be wrong because my output says I only get 15 significantly up and down-regulated from 12 000 genes. I ran this same scenario using exactTest with only group in the design to familiarize myself with edgeR before moving to a more complex model. The exactTest output gives me 100s of significant genes so what am I doing wrong here? I thought coef=5 would test groupcase against the controls which have been thrown into the intercept as baseline. Am I wrong?

2. I wanted to test the interaction of group as a whole (both cases and controls) with Cov1 and contrast this with no interaction. Is this possible to do using the model I've defined above? I thought to start by testing interaction of groupcase with Cov1 vs. groupcase alone, since I can't see controls. lrt <- glmLRT(fit, contrast=c(0,0,0,0,-1,1)), but my output is 1 up-reg gene. I'm clearly doing something wrong here since my exactTest setup worked, and I can only think that perhaps I don't understand how to choose the right contrast/ coefficient for the analysis I'm trying to perform.

Any help would be greatly appreciated. I hope I've laid out the problem clearly enough for the community to understand what my aim is. Thanks in advance.

edger rna-seq • 330 views