Question: Help defining contrasts in DESeq.
12 months ago, ariel.balter (140) wrote:

I'm working on a case/control study that includes some other variables, such as genotypes. One thing I want to do is a subgroup analysis of the genotypes within just the case group. The brute-force method is to subset my data to the case group and run DESeq on that. I'm very new to DESeq and to the very idea of contrasts, but from my reading it seems that contrasts offer a more elegant way to do subgroup analysis without having to rerun DESeq multiple times on differently subsetted data.

Suppose I have the following variables:

  • Case: [Case, Control]
  • GeneA: [YY, YN, NN]
  • GeneB: [YY, YN, NN]

I run DESeq with the design "~ Case + GeneA + GeneB + GeneA:GeneB".

How would I write contrasts that would be the equivalent of

  1. Subgroup just those with the disease (Case="Case")
  2. Subgroup just homozygous genotypes: (GeneA = [YY, NN], GeneB = [YY, NN])
  3. Ask the question: Is the strongest effect due to GeneA, GeneB, or the interaction?

So, for instance, given my design, to subgroup Case=="Case", my contrast would be contrast=c(?,?,?), etc.


Here is a putative study design to work with:

> fake_study_design = data.frame(
+     Case=sample(c('Case', 'Control'), 10, replace=TRUE),
+     GeneA=sample(c('YY','YN','NN'), 10, replace=TRUE),
+     GeneB=sample(c('YY', 'YN', 'NN'), 10, replace=TRUE)
+ )
> fake_study_design
      Case GeneA GeneB
1     Case    YY    YY
2     Case    YN    NN
3  Control    YY    NN
4  Control    YY    YN
5  Control    YN    YY
6  Control    NN    NN
7  Control    NN    YY
8     Case    NN    YN
9  Control    YY    NN
10 Control    YY    YY
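
As a side note on what the design above can express, here is a minimal base-R sketch (no DESeq2 required). It assumes a hypothetical sample table, `coldata`, containing every combination of the three variables once, and uses `model.matrix()` only to list the coefficients that the formula implies. The choice of "Control" and "NN" as reference levels is an assumption made for illustration.

```r
# Hypothetical sample table: one row per combination of the three variables.
coldata <- expand.grid(
  Case  = c("Case", "Control"),
  GeneA = c("YY", "YN", "NN"),
  GeneB = c("YY", "YN", "NN"),
  stringsAsFactors = FALSE
)

# By default R orders factor levels alphabetically, so "Case" (not "Control")
# would become the reference level.
default_levels <- levels(factor(coldata$Case))
print(default_levels)

# Set the reference levels explicitly (assumed choice: Control and NN).
coldata$Case  <- relevel(factor(coldata$Case),  ref = "Control")
coldata$GeneA <- relevel(factor(coldata$GeneA), ref = "NN")
coldata$GeneB <- relevel(factor(coldata$GeneB), ref = "NN")

# Expand the design formula from the question into its coefficient columns.
mm <- model.matrix(~ Case + GeneA + GeneB + GeneA:GeneB, data = coldata)
print(colnames(mm))
```

The column names show one coefficient for Case and none for Case:GeneA or Case:GeneB. Under this design the genotype effects are therefore modelled as being the same in cases and controls, so a contrast on these coefficients cannot isolate a genotype effect within the case group alone. That would need an interaction between Case and the genotype terms, or a single merged grouping factor.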
written 12 months ago by ariel.balter (140)

Have you searched for the answer? DESeq2 contrasts are one of the most common topics both here and on the Bioconductor support forum. The vignette also has all the information that you need: go to a search engine and search for "DESeq2 vignette". A useful skill in bioinformatics is knowing how to seek out information.

written 12 months ago by Kevin Blighe (55k)

@Kevin: Sorry if I seemed too noob for you. I'm pretty well versed in reading manuals. I read the vignette section on contrasts many times. It shows how to pull out the log fold change between two levels of a given covariate, not how to subset to an entire covariate level. Google has not helped me find a general discussion of DESeq contrast design either. On the other hand, from the sound of your indignation, you must be an expert in this, and could probably help me through the answer. So, please proceed.

written 12 months ago by ariel.balter (140)

I am not an 'expert' in DESeq2; I am an experienced end-user of it. If anyone other than the developer (Michael Love) calls themselves an expert in DESeq2, then you know that they are lying.

You will have to clarify what you mean when you type 'subgroup' ... ? Usually people want to compare, e.g., Case versus Control. Why have you even got Controls if you are not intending to use them? It looks like you will require a merged variable for GeneAGeneB and then have an interaction between it and Case. You could just clarify what you mean by 'subgroup', though.

modified 12 months ago • written 12 months ago by Kevin Blighe (55k)

The point is that we already have the data from the patients. The 0th order question is Case vs. Control. However, we also want to look at the effect of the genes in just the subgroup with the disease (Case). That is just one of the many questions we want to ask. We have about a dozen genotypes to look at, as well as other variables. Rather than subsetting my data N times and running DESeq on each subset, it would be more efficient to use contrasts to pull out the comparisons I want from a single DESeq object. I agree that I would need a merged variable, based on the "interactions" section in the vignette, but I can't figure out how to set one up. There is not enough information or examples in the vignette. Have you ever done it?
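
One way such a merged variable could be set up is sketched below in base R, using the fake study design from the question typed in by hand. The column names `GeneAGeneB` and `group`, and the `_` separator, are illustrative choices rather than anything DESeq2 requires.

```r
# The fake study design from the question, typed in by hand.
coldata <- data.frame(
  Case  = c("Case", "Case", "Control", "Control", "Control",
            "Control", "Control", "Case", "Control", "Control"),
  GeneA = c("YY", "YN", "YY", "YY", "YN", "NN", "NN", "NN", "YY", "YY"),
  GeneB = c("YY", "NN", "NN", "YN", "YY", "NN", "YY", "YN", "NN", "YY")
)

# Merge the two genotypes into one factor, e.g. "YYNN" = GeneA YY, GeneB NN.
coldata$GeneAGeneB <- factor(paste0(coldata$GeneA, coldata$GeneB))

# Optionally merge Case in as well, giving one level per observed combination.
coldata$group <- factor(paste(coldata$Case, coldata$GeneAGeneB, sep = "_"))

# Count samples per group; a level needs replicates to be compared usefully.
print(table(coldata$group))

# With DESeq2, the merged factor would then be used roughly like this
# (not run here; assumes an existing DESeqDataSet called dds):
#   dds$group   <- coldata$group
#   design(dds) <- ~ group
#   dds <- DESeq(dds)
#   results(dds, contrast = c("group", "Case_YYYY", "Case_NNNN"))
```

In this toy table most groups contain a single sample and `Case_NNNN` does not occur at all, so the commented `results()` call only illustrates the form of the contrast. Both levels named in a contrast have to be present in the real data.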

written 12 months ago by ariel.balter (140)

Hey, I think that I know what you mean now - thanks! I have limited time right now but will look again in a couple of hours. I think that you can create the interaction for Case:GeneAGeneB, and then select out different contrasts involving just the cases.

In such a case, your contrast could be something like the following (the exact coefficient names have to be taken from resultsNames(dds)):

results(dds, contrast=list("CaseCase.GeneAGeneBYYYY", "CaseCase.GeneAGeneBNNNN"))

The interactions part of the vignette probably could indeed be expanded. However, there are further examples given in the manual page. Take a look at the very bottom of the manual entry for ?results
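
To see which coefficient names such an interaction design would produce, here is a minimal base-R sketch. It assumes a hypothetical balanced sample table with two replicates per combination, with "Control" and "NNNN" as reference levels. `make.names()` is used to approximate how the interaction terms appear in `resultsNames(dds)`. The DESeq2 calls in the comments are not run, and the main-effect name used there is an assumption based on the naming pattern in the vignette's interaction example.

```r
# Hypothetical balanced design: 2 replicates of every combination.
coldata <- expand.grid(
  Case  = c("Control", "Case"),
  GeneA = c("NN", "YN", "YY"),
  GeneB = c("NN", "YN", "YY"),
  rep   = 1:2,
  stringsAsFactors = FALSE
)
coldata$Case <- factor(coldata$Case, levels = c("Control", "Case"))
coldata$GeneAGeneB <- relevel(factor(paste0(coldata$GeneA, coldata$GeneB)),
                              ref = "NNNN")

# Design with the merged genotype factor and its interaction with Case.
mm <- model.matrix(~ Case + GeneAGeneB + Case:GeneAGeneB, data = coldata)
coef_names <- make.names(colnames(mm))
print(coef_names)

# Model-matrix row of the first sample in a given Case/genotype cell.
row_for <- function(case, geno) {
  mm[which(coldata$Case == case & coldata$GeneAGeneB == geno)[1], ]
}

# YYYY vs NNNN within cases, expressed as a numeric contrast over coefficients.
contrast_vec <- row_for("Case", "YYYY") - row_for("Case", "NNNN")
print(names(contrast_vec)[contrast_vec != 0])

# Possible DESeq2 equivalents (not run; assume dds was fitted with this design):
#   results(dds, contrast = list(c("GeneAGeneB_YYYY_vs_NNNN",
#                                  "CaseCase.GeneAGeneBYYYY")))
#   results(dds, contrast = contrast_vec)  # only if it lines up with resultsNames(dds)
```

Because NNNN is the reference level here, no coefficient is created for it, so a name like "CaseCase.GeneAGeneBNNNN" would not exist under these assumptions. The YYYY vs NNNN comparison within cases is instead the sum of the YYYY main effect and the Case:YYYY interaction term.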

written 12 months ago by Kevin Blighe (55k)