Question: Differential Expression with Ballgown
0
gravatar for dec986
19 months ago by
dec986210
United States
dec986210 wrote:

Hello,

I've been attempting to run the Ballgown differential expression package, but am getting odd results. I have an experiment where a particular gene is knocked down, but it isn't showing as differentially expressed with Ballgown. I am comparing the results to EdgeR, and they are not coming out the same.

I've come up with the following script based on Alyssa Frazee's GitHub page and https://davetang.org/muse/2017/10/25/getting-started-hisat-stringtie-ballgown/

library(ballgown)
library(RSkittleBrewer)
library(genefilter)
library(dplyr)
library(devtools)
samples <- c("/dir/A","dir/B","dir/C", "dir/D")
bg = ballgown(samples=samples, meas="all")
bg_filtered <- subset(bg, "rowVars(texpr(bg)) > 1", genomesubset = TRUE)
rm(bg)
pData(bg_filtered) = data.frame(id=sampleNames(bg_filtered), group=c(rep(0,2), rep(1,2)))
results_transcripts = stattest(bg_filtered, feature="transcript", meas="FPKM", getFC = TRUE, covariate="group")
results_transcripts <- data.frame(transcriptNames=transcriptNames(bg_filtered), results_transcripts)
results_transcripts <- results_transcripts[order(results_transcripts$qval),]#sort by qval
write.table(file = "ballgown_output/J1_and_J2яEarlyяH1_H2_and_H5_vs_control.tsv", results_transcripts, sep="\t", quote=FALSE)

my question is whether I'm missing anything, I don't know why these results are turning out to be so different from one another. Why is the knockdown transcript not showing as differentially expressed?

rna-seq R • 1.3k views
ADD COMMENTlink modified 18 months ago by Biostar ♦♦ 20 • written 19 months ago by dec986210
1

did you try with other meas (cov) for transcript?. Look at the raw counts for the knocked down gene between samples/groups. dec986

ADD REPLYlink modified 19 months ago • written 19 months ago by cpad011212k

yes, thanks, I looked at the stringtie output, and it appears that the problem is there. All key transcripts show 0 coverage. This is strange because featureCounts on the same bam files worked fine. I ran stringtie like stringtie $bam -G $gtf -A $tab -o $out -B 2> $err There is a wealth of options available for running stringtie, e.g. -c minimum reads per bp coverage (this still doesn't show the key transcript) among others, but which ones can I modify to detect the transcripts without ruining the output?

ADD REPLYlink modified 19 months ago • written 19 months ago by dec986210

the issue that started this post was that stringtie has to be run with the "-e" option, without which the important transcripts will not appear.

ADD REPLYlink written 19 months ago by dec986210

Maybe it does not come out as significant due to large variation between the replicates. You might also consider to switch tools, as ballgown does not seem to be maintained anymore. It also still uses FPKM (EDIT: uses FPKM by default, maybe one can also feed in other counts, I did never check), which is considered inferior for sample comparisons. As you probably do not need the transcript level abundances, why not using well-maintained gene level solutions like edgeR or DESeq2?

ADD REPLYlink modified 19 months ago • written 19 months ago by ATpoint30k

Ballgownn analysis accepts counts via meas

ADD REPLYlink modified 19 months ago • written 19 months ago by cpad011212k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1743 users visited in the last hour