Question

limma: extracting coefficients for complex models

0

Entering edit mode

3.2 years ago

fr ▴ 210

Hi,

I have an experiment with 3 factors: cell_type (cell1, cell2, ... cell10), time (t1, t2, ... t7) and treatment (control, treated). I'm modeling it as ~0 + time*treatment + cell_type, and my resulting design matrix has columns: control.t1, treated.t1, control.t2, ..., cell1, cell2. Note how I do not have independent columns for control or treated only, This is to make it easier to handle the contrasts, in which case I am combining multiple columns if I just want to compare control vs treated. But I digress.

My question is the following: I want to analyze the coefficients at different times or treatments. For this I can extract my fit$coefficients object resulting from lmFit, but then I get coefficients with the same columns as my design matrix, i.e. coefficients for columns control.t1, treated.t1, control.t2, ..... But I want to extract a representative coefficient of t1, t2, etc irrespective of treatment. For instance, I want to take the coefficients at time 1 so both control.t1 and treated.t1, but I'm not sure how I can combine the coefficients. I.e. I want to analyze the t1 coefficient irrespective of treatment.

I'm guessing that it should be possible to alter the design matrix to represent the individual effects but it would fail since the column for t1 would basically be given by control.t1 + treatment.t1 (and the same for all time points), in which case the model would not be fittable. I'm also thinking that alternatively I could compute the mean of the coefficients control.t1 and treatment.t1, but not sure how accurate this would be. Any suggestions?

limma r • 1.8k views

ADD COMMENT • link 3.2 years ago by fr ▴ 210

0

Entering edit mode

I've posted a related question, though it is a different question here.

ADD REPLY • link 3.2 years ago by fr ▴ 210

score 1 · Answer 1 · 2021-05-14

1

Entering edit mode

3.2 years ago

James W. MacDonald ▴ 20

It's not entirely clear what you are after, but do note that you can always use contrasts to do most of what you want. Also please note that the longstanding recommendation for this sort of analysis is to make groups based on a combination of the underlying phenotypes and then fit a cell means model, after which you can easily make any required comparison without having to interpret the coefficients. So you would do

combo <- paste(cell_type, time, treatment, sep = ".")
design <- model.matrix(~ 0 + combo)

and proceed from there. Each coefficient then computes the mean of the group (e.g., the cell1.control.t1 group), and if you want to make any comparison it's simple to specify with a contrast matrix.

ADD COMMENT • link 3.2 years ago by James W. MacDonald ▴ 20

0

Entering edit mode

Thanks a lot James, I understand. I'm not after the LFC or p-values, but rather those coefficients that you just mentioned with respect to each element of combo. I.e. I'd like to find the coefficients for time t1, time t2, etc for each gene as estimated by lmFit. This is what is confusing me here: if I didnt have the interaction, I could get those coefficients easily by specifying

design = model.matrix(~0 + time + treatment, some_df)
fit=lmFit(input, design)
my_coefficients=fit$coefficients

In this case, my_coefficients would give me each gene's fitting coefficient for each time point and treatment individually (all good). But in my real case I have the interaction of time and treatment, so I cannot get coefficients for t1 alone even if I now from the model matrix that a column t1 = control.t1 + treated.t1. My main question is that I don't know how to retrieve these coefficients based on combinations of individual interactions, Is the coefficient for a gene at t1 given by the mean of coefficients at control.t1 + treated.t1?

ADD REPLY • link 3.2 years ago by fr ▴ 210