Differential Expression with Ballgown
0
0
Entering edit mode
5.7 years ago
dec986 ▴ 370

Hello,

I've been attempting to run the Ballgown differential expression package, but am getting odd results. I have an experiment where a particular gene is knocked down, but it isn't showing as differentially expressed with Ballgown. I am comparing the results to EdgeR, and they are not coming out the same.

I've come up with the following script based on Alyssa Frazee's GitHub page and https://davetang.org/muse/2017/10/25/getting-started-hisat-stringtie-ballgown/

library(ballgown)
library(RSkittleBrewer)
library(genefilter)
library(dplyr)
library(devtools)
samples <- c("/dir/A","dir/B","dir/C", "dir/D")
bg = ballgown(samples=samples, meas="all")
bg_filtered <- subset(bg, "rowVars(texpr(bg)) > 1", genomesubset = TRUE)
rm(bg)
pData(bg_filtered) = data.frame(id=sampleNames(bg_filtered), group=c(rep(0,2), rep(1,2)))
results_transcripts = stattest(bg_filtered, feature="transcript", meas="FPKM", getFC = TRUE, covariate="group")
results_transcripts <- data.frame(transcriptNames=transcriptNames(bg_filtered), results_transcripts)
results_transcripts <- results_transcripts[order(results_transcripts$qval),]#sort by qval
write.table(file = "ballgown_output/J1_and_J2яEarlyяH1_H2_and_H5_vs_control.tsv", results_transcripts, sep="\t", quote=FALSE)

my question is whether I'm missing anything, I don't know why these results are turning out to be so different from one another. Why is the knockdown transcript not showing as differentially expressed?

RNA-Seq R • 3.5k views
ADD COMMENT
1
Entering edit mode

did you try with other meas (cov) for transcript?. Look at the raw counts for the knocked down gene between samples/groups. dec986

ADD REPLY
0
Entering edit mode

yes, thanks, I looked at the stringtie output, and it appears that the problem is there. All key transcripts show 0 coverage. This is strange because featureCounts on the same bam files worked fine. I ran stringtie like stringtie $bam -G $gtf -A $tab -o $out -B 2> $err There is a wealth of options available for running stringtie, e.g. -c minimum reads per bp coverage (this still doesn't show the key transcript) among others, but which ones can I modify to detect the transcripts without ruining the output?

ADD REPLY
0
Entering edit mode

the issue that started this post was that stringtie has to be run with the "-e" option, without which the important transcripts will not appear.

ADD REPLY
0
Entering edit mode

Maybe it does not come out as significant due to large variation between the replicates. You might also consider to switch tools, as ballgown does not seem to be maintained anymore. It also still uses FPKM (EDIT: uses FPKM by default, maybe one can also feed in other counts, I did never check), which is considered inferior for sample comparisons. As you probably do not need the transcript level abundances, why not using well-maintained gene level solutions like edgeR or DESeq2?

ADD REPLY
0
Entering edit mode

Ballgownn analysis accepts counts via meas

ADD REPLY

Login before adding your answer.

Traffic: 2004 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6