Question: How many genes are expressed in each condition: control / treatment group
0
gravatar for ek699
3 months ago by
ek6990
ek6990 wrote:

Hi I am working on RNA-seq differential expression analysis using DESeq2 in R. My code looks like:

countMatrix2 = as.matrix(read.table("countMatrixName.txt", header = TRUE, row.names = 1))
condition2 <- factor(rep(c("Treatment", "Control"), each=4))
colData2 <- data.frame(condition2)
rownames(colData2)<- colnames(countMatrix2)

dds2<-DESeqDataSetFromMatrix(countMatrix2, colData2, ~ condition2)

# Pre-filtering 
dds2 <- dds2[rowSums(counts(dds2))>=10,]
# Setting the factor levels
dds2$condition2 <- relevel(dds2$condition2, ref = "Control")

# Differential Expression Analysis
dds_des2 <- DESeq(dds2)
# Getting result table with alpha = 0.05
res0052 <- results(dds_des2, alpha = 0.05)

From here, I want to know how many genes are expressed (for example simply count > 0) separately in Control group and Treatment group. I am confused because the result table only shows Control vs Treatment information like log2foldchange, p-value, etc. How can I separately observe each condition: Control group and Treatment group? This is my first time to use deseq2, and any thoughts are greatly helpful. Thank you!

ADD COMMENTlink modified 3 months ago by ATpoint36k • written 3 months ago by ek6990
0
gravatar for ATpoint
3 months ago by
ATpoint36k
Germany
ATpoint36k wrote:

For expression levels you have to check the counts itself, so counts(dds2) or counts(dds2, normalized=TRUE). The results only contain the pairwise comparison, so fold changes and significances, average expression.

ADD COMMENTlink modified 3 months ago • written 3 months ago by ATpoint36k

Thanks for the information! But, when I use counts(dds2, normalized=TRUE) then it might still contain the data that will be removed after I apply the function DESeq() and remove NAs. Is it still a valid result? Or, do I need to remove the NAs only for the differentially expressed genes? And the number of expressed genes for each group: Condition and Treatment, should I just do counts(dds2[,1:4], normalized=TRUE) and counts(dds2[,5:8], normalized=TRUE) ??

ADD REPLYlink modified 3 months ago • written 3 months ago by ek6990

What exactly do you want to do? The counts are based on all genes that were in your reference that you quantified against. DESeq2 will remove genes during its independent filtering. What do you mean by valid? Considering every gene that has > 1 count as expressed in inaccurate since there is background transcription that is not biologically meaningful, alignment errors, sequencing noise etc. Which question do you want to answer?

ADD REPLYlink written 3 months ago by ATpoint36k

As I addressed in my question above, I literally want to know how many genes are expressed separately in Control group and Treatment group from my code above. I also need to define the threshold for "expressed genes", for example, any gene with count > 0 or any gene with count > 0 in more than one sample. When I did results(dds_des2, alpha = 0.05), I was still able to see there are a few rows with adjusted p-value = 0, which means, even after DESeq2 applies the independent filtering, it still keeps the rows having a low mean normalized count by setting its adjusted p-value = 0.

ADD REPLYlink modified 3 months ago • written 3 months ago by ek6990

cutoff for expressed genes in DESeq2 plus many other threads via the search function.

ADD REPLYlink written 3 months ago by ATpoint36k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 680 users visited in the last hour