Question

Deseq2 results have too many genes

0

Entering edit mode

6.2 years ago

hkarakurt ▴ 180

Hello everyone, I am trying to validate a RNA-Seq analysis pipeline but whatever I do (like same trim and alignment parameters) I have more (like 2 time more mostly) differentially expressed genes than the results of examples and published articles. I am using FeatureCounts count matrix and default parameters for DESeq() command in R.

What can the main reason of this?

Thank you

Deseq2 RNA-Seq Differential Expression • 2.1k views

ADD COMMENT • link updated 6.2 years ago by Pin.Bioinf ▴ 340 • written 6.2 years ago by hkarakurt ▴ 180

0

Entering edit mode

Deseq2 uses htseq (if i remember correctly), so therefore you may get different count data. I recommend you to compare the count data if you can.

ADD REPLY • link 6.2 years ago by arta ▴ 670

0

Entering edit mode

I also compare the LogFC values and they are really close. Only problem is the p-values but I will check for your advice.

ADD REPLY • link 6.2 years ago by hkarakurt ▴ 180

score 0 · Answer 1 · 2018-02-20

0

Entering edit mode

6.2 years ago

Pin.Bioinf ▴ 340

Are you using the same p adjust value threshold as them in the RNAseq analysis? Could you add the DESeq pipeline here?

ADD COMMENT • link 6.2 years ago by Pin.Bioinf ▴ 340

0

Entering edit mode

Yes I am using same threshold. Mostly adjust p-value 0.01 is threshold. And sometimes I see some genes in my results file which are not in the article results even I use same genome and GTFs. Do same annotation files (hg19 for example) have more genes because of updates or something?

My script is here:

countdata <- read.table("file", header=TRUE, row.names=1)

countdata <- countdata[ ,6:ncol(countdata)]

colnames(countdata) <- gsub("\.[sb]am$", "", colnames(countdata))

countdata <- as.matrix(countdata)

(condition <- factor(c(rep("ctl", 2), rep("exp", 2))))

library(DESeq2)

(coldata <- data.frame(row.names=colnames(countdata), condition)) dds <- DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design=~condition) dds

dds <- DESeq(dds)

res <- results(dds) table(res$padj<0.05) res <- res[order(res$padj), ] resdata <- merge(as.data.frame(res), as.data.frame(counts(dds, normalized=TRUE)), by="row.names", sort=FALSE) names(resdata)[1] <- "Gene" head(resdata) write.csv(resdata, file="results.csv")

Thank you.

ADD REPLY • link 6.2 years ago by hkarakurt ▴ 180