Hi, we did an RNA seq experiment recently. It was 3 replicates for 2 conditions. One replicate turned out to be contaminated so I have eliminated it. I want to do differential expression analysis.
I have a script for EdgeR and it works fine, however I am getting way too many upregulated genes. When I do a summary EdgeR tells me there are 83 upregulated genes, 13 downregulated. This does not really make sense and there is nothing about either of these two conditions that should cause a large activation / deactivation of transcription. Im wondering what I can do to correct this. How do I go about diagnosing if theres anything wrong with my replicates and what can be done to correct for it.
I think the dispersion was a little high Disp = 0.10443 , BCV = 0.3232. Ive tried fiddling with the counts per million cutoff and it helps a bit but Im wondering if I can do anything else
targets <- readTargets() x <- read.delim("EdgeR.counts.m.txt", row.names=1, stringsAsFactors=FALSE) y <- DGEList(counts=x[,1:5], group=targets$Treatment) colnames(y) <- targets$Label keep <- rowSums(cpm(y)>5) >= 3 y <- y[keep,] dim(y) y$samples$lib.size <- colSums(y$counts) y <- calcNormFactors(y) y$samples plotMDS(y) y <- estimateCommonDisp(y, verbose=TRUE) y <- estimateTagwiseDisp(y) plotBCV(y) wm <- exactTest(y, pair=c("wt","mt")) summary(de <- decideTestsDGE(wm)) top <- topTags(wm) top