Question: edgeR make contrast
1
4 months ago by
Bnf83130
Bnf83130 wrote:

Hi guys,

I'm dealing with an unusual analytical requirement: I'm performing differential gene expression analysis on some human samples.

I have neither controls nor replicates. The samples are grouped into two groups, let's say A and B. Now I have to compare A vs B+A. Normally I compare A vs B; I have never had to compare A vs B+A. How can I set up this contrast in edgeR?

I usually use:

con <- makeContrasts(A - B, levels=design)

Thank you in advance

edger rna-seq contrasts • 390 views
ADD COMMENTlink modified 4 months ago by i.sudbery4.8k • written 4 months ago by Bnf83130
4

Isn't A versus B+A just a contrast on B-versus-zero? All you'll get is a summary of average expression.

ADD REPLYlink modified 4 months ago • written 4 months ago by russhh4.4k
3

Yes. If you wish to compare the expression in A to the expression in A plus the expression in B you are really testing if the expression in group B is zero:

H0: A = A + B

=> A - A = A + B - A => 0 = B
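
The same cancellation can be seen in the contrast weights themselves. This is a minimal sketch in base R, assuming a design with one coefficient per group (as in ~ 0 + group); the names lhs, rhs and contrast are only illustrative:

```r
# Weights over the two group coefficients (A, B) of a ~ 0 + group design.
lhs <- c(A = 1, B = 0)   # the "A" side of the comparison
rhs <- c(A = 1, B = 1)   # the "A + B" side of the comparison

# "A versus A + B" as a contrast: the A terms cancel and only B is left.
contrast <- lhs - rhs
print(contrast)
```

The weight on A is zero, so a test of this contrast only asks whether the B coefficient differs from zero.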

Do you mean that some samples have received treatment A and some samples have received treatment A and B?

ADD REPLYlink written 4 months ago by i.sudbery4.8k

Unfortunately, the samples did not receive a treatment or a specific condition, which is why a requirement of this type seems strange to me. Anyway, I think they would like the relative expression value of A, i.e. a sort of delta of A over A+B. I cannot figure out the rationale.

ADD REPLYlink written 4 months ago by Bnf83130

I don't mean to pry, but you couldn't give us a bit more detail about the actual study, could you? It may simply be that your collaborators have made a mistake in explaining what they want you to compare. They may be asking you to compare expression in the set A against expression in the set A u B, for example. IMO biologists / medics don't talk in terms of fitted coefficients.

ADD REPLYlink written 4 months ago by russhh4.4k

No problem!!! I have around 100 breast cancer samples (primary). They performed RNA-seq and then clustered the samples, identifying clusters of patients (which I called groups here). Then they asked me for the comparison I already explained.

ADD REPLYlink written 4 months ago by Bnf83130

Were they clustered using one subset of the genes, and now you're running diffex on a separate subset of the genes?

ADD REPLYlink written 4 months ago by russhh4.4k

I think you need to talk to them about what their biological question is. There are screen methods where things like A/(A+B) are used as a measure of effect size, but you would test if this was equal to zero, not if A was equal to A + B. And I can't see this being a meaningful comparison in something like RNAseq.

ADD REPLYlink written 4 months ago by i.sudbery4.8k

I totally agree with you!

ADD REPLYlink written 4 months ago by Bnf83130

Your post is not a Tutorial, it is a Question, please use the appropriate category.

Let me see if I understood this correctly: you have one sample in group A, one sample in group B, and you want to compare A vs B+A? Does this even make sense?

ADD REPLYlink written 4 months ago by h.mon25k

I agree with you... Anyway, suppose you have a gene "x" and you want to perform the DGE analysis on 10 samples of group A and 13 of group B. They asked me for DGE on A versus B plus A itself, i.e. if the expression of x in A is 40 and in B is 20, then A+B is 60, and the DGE will be 40 vs 60. Although it is mathematically clear, it is difficult for me to write the contrast properly.

ADD REPLYlink written 4 months ago by Bnf83130
2
4 months ago by
i.sudbery4.8k
Sheffield, UK
i.sudbery4.8k wrote:

From the discussions above, I believe you are being asked to test the difference between those samples in group A compared to an average of all the samples. This is, in fact, the traditional (before R) way to test contrasts.

This, it turns out, is actually fairly easy to code up, and simply relies on using a different contrast encoding system for your model matrix.

Create a conditions frame/factor for your groups (A or B) and set its contrasts model to contr.sum:

cluster <- factor(c("A", "A", "B", "B"))
contrasts(cluster) <- contr.sum(2)

You can now create your model matrix as usual:

design = model.matrix(~ 1 + cluster)
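
To see what the change of encoding does, it may help to print the design next to the one R builds by default. This is only a sketch on the four-sample toy factor from above; design_default and design_sum are illustrative names, not part of any edgeR API:

```r
cluster <- factor(c("A", "A", "B", "B"))

# Default treatment coding: the second column is an indicator for cluster B.
design_default <- model.matrix(~ 1 + cluster)

# Sum-to-zero coding: the second column is +1 for cluster A and -1 for cluster B.
contrasts(cluster) <- contr.sum(2)
design_sum <- model.matrix(~ 1 + cluster)

print(design_default)
print(design_sum)
```

With the default coding the second coefficient is B relative to A; with contr.sum it is A relative to the average of the two clusters.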

When you fit your linear model, you will fit two coefficients: one is the intercept (effectively the average of A and B, on the log scale) and the other is the difference between A and that average (i.e. the effect of membership of cluster A). There is no need to specify a contrast in your edgeR workflow; the coefficient of interest will be coef=2 in your glmLRT.
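
The meaning of the two coefficients can be checked numerically. The sketch below uses lm() from base R on made-up log-expression values for a single gene, so it is an ordinary Gaussian fit and not edgeR's negative binomial GLM; the assumption is only that the same design coding gives the coefficients the same interpretation:

```r
cluster <- factor(c("A", "A", "B", "B"))
contrasts(cluster) <- contr.sum(2)

# Made-up log-expression values for one gene (illustration only).
log_expr <- c(5.0, 5.4, 3.9, 4.1)

fit <- lm(log_expr ~ 1 + cluster)

mean_A <- mean(log_expr[cluster == "A"])
mean_B <- mean(log_expr[cluster == "B"])
average_of_groups <- (mean_A + mean_B) / 2

# Coefficient 1 (intercept) should match the average of the two group means,
# and coefficient 2 should match mean_A minus that average.
print(coef(fit))
print(c(average_of_groups, mean_A - average_of_groups))
```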

ADD COMMENTlink written 4 months ago by i.sudbery4.8k

You may also want to see this paper: https://www.biorxiv.org/content/10.1101/463265v1, which explains why differential expression testing post-clustering is a bad idea and has some suggestions for alternatives.

Personally, I'd try to bi-cluster, find the gene cluster that was driving the sample cluster, and have a look at which genes were in it.

ADD REPLYlink modified 4 months ago • written 4 months ago by i.sudbery4.8k