Question: DESeq2 design matrix
1
gravatar for mickey_95
11 months ago by
mickey_9510
mickey_9510 wrote:

Hi,

I am trying to use DESeq2 for differential analysis of ATAC-seq data. I have performed peak calling, obtained a consensus peak data set and, using featureCounts, generated a count matrix with my samples as columns and the peak coordinates as rows. I have two conditions, 4 biological and 2 technical replicates each; so 16 samples in total. I am interested in the differences between my two conditions, so I was using DESeq2::DESeqDataSetFromMatrix with

design = ~condition

Now I am wondering about two things: 1. Should I collapse my technical replicates before running the DE pipeline? 2. Should I also include my biological replicates in the design formula? And also an interaction term (condition:BioRep)?

I am very new to this, so apologies if this is a very basic question. Appreciate input and help!

atac-seq de analysis deseq2 • 619 views
ADD COMMENTlink modified 11 months ago by ATpoint39k • written 11 months ago by mickey_9510
1

For collapsing technical replicates in DESeq2 workflow, use collapseReplicates (https://www.rdocumentation.org/packages/DESeq2/versions/1.12.3/topics/collapseReplicates) function.

ADD REPLYlink written 11 months ago by cpad011214k
1
gravatar for dsull
11 months ago by
dsull1.6k
UCLA
dsull1.6k wrote:

Yes, I would pool technical replicates together since you really are just interested in the biological reproducibility.

Your biological replicates don't need to be in the design formula. You have two conditions and each biological replicate will be assigned to one of the two conditions (e.g. 4 treated and 4 control).

ADD COMMENTlink written 11 months ago by dsull1.6k
1
gravatar for ATpoint
11 months ago by
ATpoint39k
Germany
ATpoint39k wrote:
  1. Should I collapse my technical replicates before running the DE pipeline

If technical means sequencing the exact same DNA on different lanes or flow cells then yes, collapse them. You can run plotPCA after normalizing the count matrix with vst to see if they indeed cluster close together (or are more or less identical) to confirm there is nothing odd going on.

  1. Should I also include my biological replicates in the design formula? And also an interaction term (condition:BioRep)? No, the replication itself is listed in the colData that describes your experiment like:
Sample  Condition
foo_rep1    condition1
foo_rep2    condition1
foo_rep3    condition1
bar_rep1    condition2
bar_rep2    condition2
bar_rep3    condition2

The design would be ~Condition and this already contains the replication information. I suggest you go through the DESeq2 (and maybe the edgeR) manual as these have extensive examples on how to make designs plus some useful background infmrmation on differential analysis.

ADD COMMENTlink modified 11 months ago • written 11 months ago by ATpoint39k

Thank you for the quick reply! My technical replicates cluster closely together in a PCA plot, so I will collapse them. I ran de DESeq2 pipeline with the design you suggested and am very surprised to see that no differentially accessible regions pop up. This seems very strange to me and I do not understand what I might be doing wrong. I have been through the manuals of both DESeq2 and edgeR and understand the basics of the theoretical concepts behind them.

ADD REPLYlink written 10 months ago by mickey_9510

Well, the reason might be that either there are indeed no DEGs or 2) there is not enough power (=not enough replicates). Can you show the outputs of plotPCA and plotMA? What kind of samples are this, primary or cell line, species, setup?

ADD REPLYlink modified 10 months ago • written 10 months ago by ATpoint39k

These are primary cells from mice, control and drug-treated. The samples are all over the place in the PCA plot, there is no clear clustering of the two groups (ctrl and treated). I guess this variability could be the explanation?

ADD REPLYlink written 10 months ago by mickey_9510

Yes, that sounds reasonable towards why there are no DEGs. A look at the MA-plot could help decide if there might be a chance to get DEGs at higher replicate numbers.

ADD REPLYlink modified 10 months ago • written 10 months ago by ATpoint39k

I don't have access to the data at the moment, I will try to upload a picture later during the week. But in the meantime: How would I assess whether the experiment is underpowered using the MA plot?

ADD REPLYlink written 10 months ago by mickey_9510
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1886 users visited in the last hour