Question: How to use collapseReplicates in DESeq2
3.6 years ago, macmath140 (France) wrote:

After going through the manual, I still couldn't work out how to apply `collapseReplicates` to my data. This is my current code:

```
sampleFiles <- list.files(path = "./", pattern = ".counts")
sampleNames <- gsub(".counts", "", sampleFiles)
sampleCondition <- c(rep("KO1", 4),rep("KO2", 4),rep("KO3", 4), rep("WT1", 4),rep("WT2", 4),rep("WT3", 4))
sampleTable <- data.frame(sampleName = sampleNames, fileName = sampleFiles, condition = sampleCondition)
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable,
                                       directory = directory,
                                       design = ~ condition)
treatments <- c("KO1", "KO2", "KO3", "WT1", "WT2", "WT3")
library("DESeq2")
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable,design = ~ condition)
colData(ddsHTSeq)$condition <- factor(colData(ddsHTSeq)$condition,
                                      levels = treatments)
#Analysis using DESeq
dds <- DESeq(ddsHTSeq)
resultsNames(dds)
#Pre-filtering
dds <- dds[ rowSums(counts(dds)) > 1, ]
res <- results(dds)
#summarise some basic tallies
summary(res)
```

This is the example from the `collapseReplicates` documentation. I am not sure at which point in my code I should apply this step, or how to adapt it to my dataset:

```
dds <- makeExampleDESeqDataSet(m=12)

# make data with two technical replicates for three samples
dds$sample <- factor(sample(paste0("sample",rep(1:9, c(2,1,1,2,1,1,2,1,1)))))
dds$run <- paste0("run",1:12)

ddsColl <- collapseReplicates(dds, dds$sample, dds$run)

# examine the colData and column names of the collapsed data
colData(ddsColl)
colnames(ddsColl)

# check that the sum of the counts for "sample1" is the same
# as the counts in the "sample1" column in ddsColl
matchFirstLevel <- dds$sample == levels(dds$sample)[1]
stopifnot(all(rowSums(counts(dds[,matchFirstLevel])) == counts(ddsColl[,1])))
```

Your code is a bit confusing because you call `DESeqDataSetFromHTSeqCount` twice with different parameters. Can you clean this up and post the output of `colData(ddsHTSeq)`?

Please correct me if I have anything wrong or need to make any changes.

The output of `colData(ddsHTSeq)`:

```
DataFrame with 24 rows and 1 column
    condition
     <factor>
...       ...
```
3.6 years ago, Carlo Yague wrote:

To use `collapseReplicates`, you need a `colData` with `sample` and `run` columns, like this:

```
         condition   sample         run
...
```

Then you collapse your runs (technical replicates) to the level of your samples (biological replicates):

```
ddsColl <- collapseReplicates(ddsHTSeq, ddsHTSeq$sample, ddsHTSeq$run)
```
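To show where this step fits in a workflow, here is a minimal sketch. It uses `makeExampleDESeqDataSet` as a stand-in for `ddsHTSeq`, and the sample layout is purely an assumption for illustration (6 conditions, 2 biological samples per condition, each sequenced in 2 runs); your real `sample` and `run` columns must reflect how your 24 files were actually generated. The collapsing is done on the raw-count object, before calling `DESeq()`.

```r
library("DESeq2")

# Stand-in for ddsHTSeq: simulated counts for 24 runs (hypothetical data)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 24)

# ASSUMED layout: 6 conditions x 2 biological samples x 2 technical runs
treatments <- c("KO1", "KO2", "KO3", "WT1", "WT2", "WT3")
dds$condition <- factor(rep(treatments, each = 4), levels = treatments)

# 'sample' identifies the biological sample, 'run' the individual sequencing run
dds$sample <- factor(paste0(dds$condition, "_s", rep(rep(1:2, each = 2), times = 6)))
dds$run <- paste0("run", seq_len(ncol(dds)))

# Sum the counts of all runs that belong to the same sample
ddsColl <- collapseReplicates(dds, groupby = dds$sample, run = dds$run)

# 24 runs become 12 samples; 'runsCollapsed' records which runs were merged
colData(ddsColl)

# Sanity check: collapsed counts equal the summed counts of the two runs
firstSample <- dds$sample == "KO1_s1"
stopifnot(all(rowSums(counts(dds)[, firstSample]) == counts(ddsColl)[, "KO1_s1"]))

# Run the differential expression analysis on the collapsed object
ddsColl <- DESeq(ddsColl)
```

Only technical replicates should be summed this way. Biological replicates need to stay as separate columns so that DESeq2 can estimate dispersion from them.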

Thank you very much Carlo Yague for your help and suggestion. It worked absolutely fine.

Could you also advise me on time point analysis: how should I proceed, and what should the design be? Should I collapse replicates in that case too? I would appreciate anyone's help!


I have moved Carlo's comment to an answer so that you can accept it and mark this question as solved.

> Could you also advise me on time point analysis: how should I proceed, and what should the design be? Should I collapse replicates in that case too? I would appreciate anyone's help!

This sounds like a separate question and as such should get a separate thread. Also note that Bioconductor Support might be the most appropriate place for these questions (although you are obviously free to ask on Biostars. Just pick one and one only for your question.)


Should you collapse replicates? Probably yes (for technical replicates).

What should the design be? Look here for examples of time course analysis in DESeq2.

Hope this helps. If you need a more detailed answer, create a new question with all the details, as suggested by WouterDeCoster.
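For reference, a common pattern for time course data in DESeq2 is a design with a condition-by-time interaction, tested with a likelihood ratio test against the model without the interaction. The sketch below runs on simulated data; the layout (2 conditions, 3 time points, 2 biological replicates each) and the `time` column are assumptions for illustration and not taken from the dataset in this thread.

```r
library("DESeq2")

# Simulated stand-in data: 12 samples, technical replicates already collapsed
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 12)

# ASSUMED layout: 2 conditions x 3 time points x 2 biological replicates
dds$condition <- factor(rep(c("WT", "KO"), each = 6), levels = c("WT", "KO"))
dds$time <- factor(rep(rep(c("0h", "6h", "12h"), each = 2), times = 2),
                   levels = c("0h", "6h", "12h"))

# Full model: condition effect, time effect, and condition-specific time effect
design(dds) <- ~ condition + time + condition:time

# LRT against the reduced model: finds genes whose change over time
# differs between conditions
dds <- DESeq(dds, test = "LRT", reduced = ~ condition + time)
res <- results(dds)
summary(res)
```

Genes with a small adjusted p-value here are those whose time profile differs between the two conditions at one or more time points after the first.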