Question

Disproportionate Amount of Up-regulated Genes

8.6 years ago

Stoploss25 ▴ 10

Hi, we recently ran an RNA-seq experiment with 3 replicates for each of 2 conditions. One replicate turned out to be contaminated, so I removed it. I want to do a differential expression analysis.

I have an edgeR script and it runs fine, but I am getting far more up-regulated than down-regulated genes: the summary reports 83 up-regulated and 13 down-regulated. This does not really make sense, since nothing about these two conditions should cause a large activation or deactivation of transcription. How do I go about diagnosing whether something is wrong with my replicates, and what can be done to correct for it?

I think the dispersion is a little high (Disp = 0.10443, BCV = 0.3232). I've tried adjusting the counts-per-million cutoff and it helps a bit, but I'm wondering if there is anything else I can do.
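As a quick consistency check on the two numbers quoted above: edgeR reports the BCV (biological coefficient of variation) as the square root of the dispersion, so the pair should agree. A minimal base-R sketch, using only the values from the post:

```r
# Common dispersion as printed by estimateCommonDisp(verbose = TRUE) in the post.
common_disp <- 0.10443

# BCV is the square root of the negative binomial dispersion.
bcv <- sqrt(common_disp)
round(bcv, 4)
```

A BCV of about 0.32 corresponds to roughly 32% replicate-to-replicate variation in true expression. Whether that counts as high depends on the organism and design, which the post does not state.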

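library(edgeR)

# Sample sheet (readTargets() reads Targets.txt by default) and raw count table
targets <- readTargets()
x <- read.delim("EdgeR.counts.m.txt", row.names=1, stringsAsFactors=FALSE)
y <- DGEList(counts=x[,1:5], group=targets$Treatment)
colnames(y) <- targets$Label

# Keep genes with CPM > 5 in at least 3 libraries, then recompute library sizes
keep <- rowSums(cpm(y)>5) >= 3
y <- y[keep,]
dim(y)
y$samples$lib.size <- colSums(y$counts)

# TMM normalization factors and a sample-level MDS plot
y <- calcNormFactors(y)
y$samples
plotMDS(y)

# Common and tagwise dispersion estimates
y <- estimateCommonDisp(y, verbose=TRUE)
y <- estimateTagwiseDisp(y)
plotBCV(y)

# Exact test, mt relative to wt
wm <- exactTest(y, pair=c("wt","mt"))
summary(de <- decideTestsDGE(wm))
top <- topTags(wm)
top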
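One way to start diagnosing an up/down imbalance is to check whether the log fold changes of all genes, not just the significant ones, are centred on zero. The sketch below uses simulated counts in base R rather than the poster's data; the gene count, the 2-versus-3 replicate split and the dispersion of 0.1 are assumptions made for illustration only.

```r
# Simulated stand-in for the count table: 2000 genes, 5 libraries, no true DE.
# Which group lost a replicate is not stated in the post; 2 wt vs 3 mt is assumed.
set.seed(42)
n_genes <- 2000
group <- factor(c("wt", "wt", "mt", "mt", "mt"), levels = c("wt", "mt"))
mu <- rlnorm(n_genes, meanlog = 5, sdlog = 1.5)

# size = 10 corresponds to a negative binomial dispersion of 0.1.
counts <- matrix(rnbinom(n_genes * length(group), mu = mu, size = 10),
                 nrow = n_genes)

# Counts per million, then a per-gene log2 fold change (mt relative to wt).
cpm_mat <- t(t(counts) / colSums(counts)) * 1e6
log_fc <- log2(rowMeans(cpm_mat[, group == "mt"]) + 0.5) -
  log2(rowMeans(cpm_mat[, group == "wt"]) + 0.5)

# With no systematic shift, the median is near 0 and about half the genes are up.
median_log_fc <- median(log_fc)
frac_up <- mean(log_fc > 0)
c(median_log_fc = median_log_fc, frac_up = frac_up)
```

On real data the same two summaries can be computed from `wm$table$logFC`. A median well away from zero would point towards a normalization or composition issue rather than biology, although it does not prove one.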

RNA-Seq differential-expression edgeR • 2.6k views

Updated 19 months ago by Ram 43k • written 8.6 years ago by Stoploss25 ▴ 10

Ram · Answer 1 · 2015-09-29

8.6 years ago

karl.stamm 4.1k

You think 83 genes is a lot? How many genes are in the counts table? Humans have about 20-30 thousand. You'd expect more than 83 from random noise alone.

Updated 19 months ago by Ram 43k • written 8.6 years ago by karl.stamm 4.1k
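For scale on the random-noise point: how many chance calls to expect depends on whether p-values are adjusted. The base-R sketch below simulates a hypothetical set of 20,000 genes with no true differences and compares an unadjusted 0.05 cutoff with Benjamini-Hochberg adjustment, which is the default in `decideTestsDGE`. The gene count and seed are assumptions.

```r
set.seed(7)
n_genes <- 20000              # hypothetical number of tested genes
p_null <- runif(n_genes)      # p-values are uniform when no gene is truly DE

# Calls at an unadjusted 0.05 cutoff: about 5% of all genes.
n_raw <- sum(p_null < 0.05)

# Calls after Benjamini-Hochberg adjustment at the same cutoff.
n_bh <- sum(p.adjust(p_null, method = "BH") < 0.05)

c(raw = n_raw, BH = n_bh)
```

Under these assumptions the unadjusted count lands near 1,000, while the adjusted count can never exceed it and is typically far smaller. So the 83 + 13 genes reported after adjustment are unlikely to be pure noise, though this says nothing about their direction.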

Yes, I do think 83 >> 13. Are you disagreeing with this? Even if it were 'random noise', you would expect the up and down counts to be roughly equal if nothing else were going on.

Updated 19 months ago by Ram 43k • written 8.6 years ago by Stoploss25 ▴ 10
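The expectation of a roughly equal split can be put as a simple binomial test on the counts from the question. This is only a sketch: it treats the 96 genes as independent draws, which co-regulated genes in a shared pathway are not, so the p-value overstates the evidence.

```r
n_up <- 83
n_down <- 13

# Two-sided exact binomial test against a 50/50 up/down split.
# Assumes genes are independent, which is not true for co-regulated genes.
split_test <- binom.test(n_up, n_up + n_down, p = 0.5)

split_test$estimate   # observed fraction of DE genes that are up-regulated
split_test$p.value
```

A very small p-value here confirms that the split is not a coin flip, but it cannot distinguish a biological cause from a technical one such as normalization or unequal detection power.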

Oh, I misunderstood. You're concerned with the imbalance. I don't think it's a problem, but I also don't have a good justification for you. Possibly one pathway is activated and another deactivated, and one happens to carry 83 genes, the other 13. The counts are also sensitive to how genes are defined and annotated as gene IDs.

Updated 19 months ago by Ram 43k • written 8.6 years ago by karl.stamm 4.1k

Ram · Answer 2 · 2015-09-29

8.6 years ago

Devon Ryan 104k

Just because there's no a priori reason to expect an imbalance in up/down regulated genes doesn't mean there won't be one.
Try doing some independent filtering with the genefilter package. That'll let you maximize the number of DE genes. Since you mentioned that increasing cutoffs decreases the imbalance it could simply be that you're better able to measure up-regulated vs. down-regulated genes...so the p-values for the down-regulated genes could just be a bit deflated from what they might otherwise be.

Updated 19 months ago by Ram 43k • written 8.6 years ago by Devon Ryan 104k
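The idea behind independent filtering can be sketched in base R without the genefilter API: drop genes below a quantile of a filter statistic that is independent of the test under the null (here, overall expression), re-adjust the remaining p-values, and see which cutoff gives the most rejections. Everything below is simulated; the gene count, the fraction of DE genes and the power model are assumptions for illustration.

```r
set.seed(1)
m <- 5000
mean_expr <- rlnorm(m, meanlog = 3, sdlog = 1.5)  # filter statistic: overall expression
is_de <- runif(m) < 0.10                          # assume 10% of genes are truly DE

# Assumed power model: weakly expressed DE genes carry little signal.
signal <- ifelse(is_de, 4 * mean_expr / (mean_expr + 20), 0)
z <- rnorm(m, mean = signal)
p <- 2 * pnorm(-abs(z))

# Number of BH rejections after removing the lowest `theta` fraction of genes.
count_rejections <- function(theta, p, filter_stat, alpha = 0.05) {
  keep <- filter_stat >= quantile(filter_stat, theta)
  sum(p.adjust(p[keep], method = "BH") <= alpha)
}

thetas <- seq(0, 0.9, by = 0.1)
rejections <- sapply(thetas, count_rejections, p = p, filter_stat = mean_expr)
best_theta <- thetas[which.max(rejections)]

data.frame(theta = thetas, rejections = rejections)
```

With real data the filter statistic should be computed across all samples without reference to group labels (for example the average log CPM), otherwise the filter is no longer independent of the test and the adjusted p-values are no longer valid.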

Thanks, will look into this.

Written 8.6 years ago by Stoploss25 ▴ 10