Question: Differential Expression with Ballgown
gravatar for dec986
2.3 years ago by
United States
dec986230 wrote:


I've been attempting to run the Ballgown differential expression package, but am getting odd results. I have an experiment where a particular gene is knocked down, but it isn't showing as differentially expressed with Ballgown. I am comparing the results to EdgeR, and they are not coming out the same.

I've come up with the following script based on Alyssa Frazee's GitHub page and

samples <- c("/dir/A","dir/B","dir/C", "dir/D")
bg = ballgown(samples=samples, meas="all")
bg_filtered <- subset(bg, "rowVars(texpr(bg)) > 1", genomesubset = TRUE)
pData(bg_filtered) = data.frame(id=sampleNames(bg_filtered), group=c(rep(0,2), rep(1,2)))
results_transcripts = stattest(bg_filtered, feature="transcript", meas="FPKM", getFC = TRUE, covariate="group")
results_transcripts <- data.frame(transcriptNames=transcriptNames(bg_filtered), results_transcripts)
results_transcripts <- results_transcripts[order(results_transcripts$qval),]#sort by qval
write.table(file = "ballgown_output/J1_and_J2яEarlyяH1_H2_and_H5_vs_control.tsv", results_transcripts, sep="\t", quote=FALSE)

my question is whether I'm missing anything, I don't know why these results are turning out to be so different from one another. Why is the knockdown transcript not showing as differentially expressed?

rna-seq R • 2.0k views
ADD COMMENTlink modified 2.2 years ago by Biostar ♦♦ 20 • written 2.3 years ago by dec986230

did you try with other meas (cov) for transcript?. Look at the raw counts for the knocked down gene between samples/groups. dec986

ADD REPLYlink modified 2.3 years ago • written 2.3 years ago by cpad011214k

yes, thanks, I looked at the stringtie output, and it appears that the problem is there. All key transcripts show 0 coverage. This is strange because featureCounts on the same bam files worked fine. I ran stringtie like stringtie $bam -G $gtf -A $tab -o $out -B 2> $err There is a wealth of options available for running stringtie, e.g. -c minimum reads per bp coverage (this still doesn't show the key transcript) among others, but which ones can I modify to detect the transcripts without ruining the output?

ADD REPLYlink modified 2.3 years ago • written 2.3 years ago by dec986230

the issue that started this post was that stringtie has to be run with the "-e" option, without which the important transcripts will not appear.

ADD REPLYlink written 2.3 years ago by dec986230

Maybe it does not come out as significant due to large variation between the replicates. You might also consider to switch tools, as ballgown does not seem to be maintained anymore. It also still uses FPKM (EDIT: uses FPKM by default, maybe one can also feed in other counts, I did never check), which is considered inferior for sample comparisons. As you probably do not need the transcript level abundances, why not using well-maintained gene level solutions like edgeR or DESeq2?

ADD REPLYlink modified 2.3 years ago • written 2.3 years ago by ATpoint42k

Ballgownn analysis accepts counts via meas

ADD REPLYlink modified 2.3 years ago • written 2.3 years ago by cpad011214k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 844 users visited in the last hour