Question

edgeR steps for gene expressions

1

Entering edit mode

6.4 years ago

Tania ▴ 180

Hello Everyone

I am new to edgeR. I am expecting weird gene expressions. I want to double check my code steps before discovering where else the problem could be. Do these steps make sense to you?

txi <- tximport(files, type = "salmon", tx2gene = tx2gene,ignoreTxVersion = TRUE)
write.csv(txi$counts, "counts.csv")
cts <- txi$counts

First 10 samples are control second 10 samples are tumor:

group = factor(c(rep("Control", 10), rep("Tumor",10)) ) 
dge = DGEList(counts=cts, genes= rownames(data), group=group)
countsPerMillion <- cpm(dge)
summary(countsPerMillion)
countCheck <- countsPerMillion > 1
summary(countCheck)
keep <- which(rowSums(countCheck) >= 2)
dge <- dge[keep,]
summary(cpm(dge))
dge <- calcNormFactors(dge, method="TMM")
dge <- estimateCommonDisp(dge)
dge <- estimateTagwiseDisp(dge)
dge <- estimateTrendedDisp(dge)
et <- exactTest(dge, pair=c("Control", "Tumor"))
etp <- topTags(et, n=100000)
write.csv(etp$table, "dge.csv")

Thanks

RNA-Seq • 2.2k views

ADD COMMENT • link updated 6.4 years ago by theobroma22 ★ 1.2k • written 6.4 years ago by Tania ▴ 180

0

Entering edit mode

I am expecting weird gene expressions.

So are you getting normal results instead :)

On a serious note, what is unexpected about the results you are getting?

ADD REPLY • link 6.4 years ago by GenoMax 141k

0

Entering edit mode

=D Sorry I mean found weird results :) what I expected to be enriched in tumor are enriched in control. I want to make sure code is right before investigating more.

ADD REPLY • link 6.4 years ago by Tania ▴ 180

0

Entering edit mode

Why didn't you just use estimateDisp? It estimates Common, tagwise and trended dispersions in one run :D, also I would change the value of topTags, since the default value seems to be 1, you would really be getting a list of the first 100,000 genes sorted by pvalue.

ADD REPLY • link 6.4 years ago by biofalconch ★ 1.1k

0

Entering edit mode

Ok, Thanks biofalconch :) Is this line correct?

"group = factor(c(rep("Control", 10), rep("Tumor",10)) ) "

ADD REPLY • link 6.4 years ago by Tania ▴ 180

1

Entering edit mode

It seems alright to me, it really depends on the counts matrix though. A PCA might be useful in this case

ADD REPLY • link 6.4 years ago by biofalconch ★ 1.1k

score 0 · Answer 1 · 2017-11-22

0

Entering edit mode

6.4 years ago

theobroma22 ★ 1.2k

Something doesn’t seem right to me up top...did you paste all of your code?

I could be wrong but where is your “data” object for the “dge” object in:
dge = DGEList(counts=cts, genes= rownames(data), group=group)...?