Question: EdgeR generic design matrix
14 months ago by
rfenouil0
rfenouil0 wrote:

Hello all,

I have some very generic/naive questions regarding edgeR. My apologies if they have already been asked; I could not find them answered clearly on the forum.

1. Build a generic design matrix

In the edgeR documentation, chapter "3.3 Experiments with all combinations of multiple factors", I read: "A simple, multi-purpose approach is to combine all the experimental factors into one combined factor"

If I understand correctly, this approach allows one to specify a generic design matrix (only replicates are identified as such). Comparisons between conditions of interest are then carried out using contrasts during the statistical testing procedure. Does that sound correct?

Does that mean that the design matrix information is not required during estimation of dispersion? In other words, could you confirm that this approach is theoretically equivalent to the classic approach where conditions are separated in the design matrix?

Using this strategy systematically would help me automate an analysis, and I was wondering whether it makes sense.

2. QC plots

While 'playing' with edgeR, I generated the attached plots. I believe they give important information about data quality for downstream steps, and I think it might be worth providing them with every analysis.

Unfortunately, the details of edgeR's method are too elaborate for my understanding. I would like to know if there is a simple way to explain what is important to look for in these plots.

From reading the documentation, I built my own picture of what they mean, but it is certainly incomplete and likely incorrect. I would appreciate advice from experts. Apologies if the figure titles/axes/legends don't make sense; I made some of them from what I thought I understood.

Thank you very much for your help.

Figures

14 months ago by
Devon Ryan91k
Freiburg, Germany
Devon Ryan91k wrote:

> If I understand correctly, this approach allows one to specify a generic design matrix (only replicates are identified as such). Comparisons between conditions of interest are then carried out using contrasts during the statistical testing procedure. Does that sound correct?

Depends on how generic "generic" is. Basically, use a design of ~0 + group, where the levels of group can be something like 'WT untreated', 'WT treated', 'Mut untreated', 'Mut treated', to give a simple example.
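For illustration, here is a minimal sketch in base R of what such a design looks like. The sample sheet (two genotypes, two treatments, three replicates each) and all object names are assumptions made up for this example, not something from the original question.

```r
# Hypothetical sample sheet: 2 genotypes x 2 treatments, 3 replicates each.
samples <- data.frame(
  genotype  = rep(c("WT", "Mut"), each = 6),
  treatment = rep(rep(c("untreated", "treated"), each = 3), times = 2)
)

# Combine the experimental factors into one. The level order is set
# explicitly because R would otherwise sort the levels alphabetically,
# which changes the column order that contrasts refer to.
group <- factor(
  paste(samples$genotype, samples$treatment, sep = "_"),
  levels = c("WT_untreated", "WT_treated", "Mut_untreated", "Mut_treated")
)

# One column per group, no intercept: each sample (row) has a single 1
# in the column of the group it belongs to.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
print(design)
```

Each coefficient of this design is simply the mean (log-)expression of one group, which is what makes contrasts between groups easy to write down.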

> Does that mean that design matrix information is not required during estimation of dispersion ?

No, you need to know your groups either way. You can only automate so much, since inevitably more complicated designs mean that only some comparisons will be of interest.

Regarding the plots, the absolute simplest explanation is that you want to ensure (A) that the trend lines actually fit the points and (B) that the squeezed variances move in a reasonable direction (toward the fit or the NB mean-variance relationship).
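As a rough sketch of where the design matrix enters and where such plots come from, the following uses edgeR on simulated counts. The count matrix is invented purely so the example runs on its own; with real data you would supply your own counts, and the plots in the original question may have been produced differently.

```r
library(edgeR)  # Bioconductor package, assumed to be installed

set.seed(1)

# Purely illustrative data: 1000 genes x 12 samples, 4 groups of 3 replicates.
n_genes <- 1000
group_levels <- c("WT_untreated", "WT_treated", "Mut_untreated", "Mut_treated")
group <- factor(rep(group_levels, each = 3), levels = group_levels)

# Negative binomial counts with gene-specific means (no true differences).
gene_mean <- 2^runif(n_genes, min = 4, max = 10)
counts <- matrix(
  rnbinom(n_genes * 12, mu = rep(gene_mean, times = 12), size = 10),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), paste0("sample", 1:12))
)

design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)

# The design is used here, during dispersion estimation, and not only
# later when contrasts are tested.
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)

# (A) Genewise BCV against abundance, with common and trended dispersion:
#     check that the trend follows the cloud of points.
plotBCV(y)

# (B) Raw and squeezed quasi-likelihood dispersions with the fitted trend:
#     check that the squeezed values move toward the trend.
plotQLDisp(fit)
```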

written 14 months ago by Devon Ryan

Hello and thank you !

OK, ~0 + group is what I had in mind (generic design matrix), with groups defining combinations of experimental conditions as in your example. That should then allow me to compare WT vs Mut using the contrast (1, 1, -1, -1), and Untreated vs Treated using a different contrast (1, -1, 1, -1), when applying the statistical test. Is that approach correct and equivalent to defining separate factors for WT/Mut and Treated/Untreated in the design matrix?

If so, this would be enough for my 'automation' needs.

written 14 months ago by rfenouil

Yup, that'd be the equivalent.
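To make the equivalence concrete, here is a small base R sketch using an ordinary linear model on made-up values for a single gene. It assumes that "separate factors" means a factorial design including the interaction term (~ genotype * treatment); an additive design without the interaction would be a different, smaller model. All names and numbers are hypothetical.

```r
set.seed(42)

genotype <- factor(rep(c("WT", "Mut"), each = 6), levels = c("WT", "Mut"))
treatment <- factor(rep(rep(c("untreated", "treated"), each = 3), times = 2),
                    levels = c("untreated", "treated"))
group <- factor(
  paste(genotype, treatment, sep = "_"),
  levels = c("WT_untreated", "WT_treated", "Mut_untreated", "Mut_treated")
)

# Made-up log-expression values for one gene.
expr <- rnorm(12, mean = 5)

fit_group     <- lm(expr ~ 0 + group)             # combined-factor design
fit_factorial <- lm(expr ~ genotype * treatment)  # separate factors + interaction

# Both designs span the same space, so the fitted values are identical.
same_fit <- isTRUE(all.equal(unname(fitted(fit_group)),
                             unname(fitted(fit_factorial))))

# WT vs Mut as a contrast on the group means, in the level order above.
wt_vs_mut <- c(1, 1, -1, -1)
est_group <- sum(coef(fit_group) * wt_vs_mut)

# The same quantity in the factorial parameterization. Coefficient order:
# (Intercept), genotypeMut, treatmenttreated, genotypeMut:treatmenttreated.
est_factorial <- sum(coef(fit_factorial) * c(0, -2, 0, -1))

print(c(group = est_group, factorial = est_factorial))
```

The two estimates agree, but note that the genotypeMut coefficient on its own is the genotype effect in untreated samples only, so the matching contrast has to be built explicitly in the factorial design. In edgeR, a contrast vector like wt_vs_mut would be passed via the contrast argument of glmQLFTest.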

written 14 months ago by Devon Ryan

Awesome, thank you very much for your help.

written 14 months ago by rfenouil