Question: Differential Expression with Ballgown
gravatar for dec986
13 months ago by
United States
dec986200 wrote:


I've been attempting to run the Ballgown differential expression package, but am getting odd results. I have an experiment where a particular gene is knocked down, but it isn't showing as differentially expressed with Ballgown. I am comparing the results to EdgeR, and they are not coming out the same.

I've come up with the following script based on Alyssa Frazee's GitHub page and

samples <- c("/dir/A","dir/B","dir/C", "dir/D")
bg = ballgown(samples=samples, meas="all")
bg_filtered <- subset(bg, "rowVars(texpr(bg)) > 1", genomesubset = TRUE)
pData(bg_filtered) = data.frame(id=sampleNames(bg_filtered), group=c(rep(0,2), rep(1,2)))
results_transcripts = stattest(bg_filtered, feature="transcript", meas="FPKM", getFC = TRUE, covariate="group")
results_transcripts <- data.frame(transcriptNames=transcriptNames(bg_filtered), results_transcripts)
results_transcripts <- results_transcripts[order(results_transcripts$qval),]#sort by qval
write.table(file = "ballgown_output/J1_and_J2яEarlyяH1_H2_and_H5_vs_control.tsv", results_transcripts, sep="\t", quote=FALSE)

my question is whether I'm missing anything, I don't know why these results are turning out to be so different from one another. Why is the knockdown transcript not showing as differentially expressed?

rna-seq R • 937 views
ADD COMMENTlink modified 12 months ago by Biostar ♦♦ 20 • written 13 months ago by dec986200

did you try with other meas (cov) for transcript?. Look at the raw counts for the knocked down gene between samples/groups. dec986

ADD REPLYlink modified 13 months ago • written 13 months ago by cpad011211k

yes, thanks, I looked at the stringtie output, and it appears that the problem is there. All key transcripts show 0 coverage. This is strange because featureCounts on the same bam files worked fine. I ran stringtie like stringtie $bam -G $gtf -A $tab -o $out -B 2> $err There is a wealth of options available for running stringtie, e.g. -c minimum reads per bp coverage (this still doesn't show the key transcript) among others, but which ones can I modify to detect the transcripts without ruining the output?

ADD REPLYlink modified 13 months ago • written 13 months ago by dec986200

the issue that started this post was that stringtie has to be run with the "-e" option, without which the important transcripts will not appear.

ADD REPLYlink written 13 months ago by dec986200

Maybe it does not come out as significant due to large variation between the replicates. You might also consider to switch tools, as ballgown does not seem to be maintained anymore. It also still uses FPKM (EDIT: uses FPKM by default, maybe one can also feed in other counts, I did never check), which is considered inferior for sample comparisons. As you probably do not need the transcript level abundances, why not using well-maintained gene level solutions like edgeR or DESeq2?

ADD REPLYlink modified 13 months ago • written 13 months ago by ATpoint23k

Ballgownn analysis accepts counts via meas

ADD REPLYlink modified 13 months ago • written 13 months ago by cpad011211k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1399 users visited in the last hour