Question: Help with the Nature Protocol

Hi all,

Last week I was trying to get through the Nature Protocols paper "Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown" (https://www.ncbi.nlm.nih.gov/pubmed/27560171). I downloaded the required software and tried to follow it with the data from the protocol. However, when I got to the Ballgown part in R, I noticed that no gene was differentially expressed. The best q value I got after sorting them was 0.5 or 0.6; I didn't get any q value below 0.05. I checked the earlier steps (HISAT and StringTie) and they look OK. Has anyone already tried this protocol with the data provided with it and gotten the same results?

Thank you

written 3.1 years ago by iwanicka1
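
Note on the q values mentioned in the question: a q value is a p-value adjusted for multiple testing. The sketch below assumes the common Benjamini-Hochberg (FDR) adjustment; the function name `bh_qvalues` and the p-values are hypothetical and not taken from the protocol. It shows how even the smallest p-value can end up with a large q value when thousands of features are tested and almost none have small p-values.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (q values), returned in input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each q value is the minimum
    # of p * m / rank over its own rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical example: one small p-value among 1000 unremarkable ones.
pvals = [0.001] + [0.5] * 999
print(min(bh_qvalues(pvals)))  # → 0.5
```

In this made-up case a raw p-value of 0.001 still gets a q value of 0.5, so no feature passes a 0.05 cutoff.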

Did you use your own data, or a public dataset?

written 3.1 years ago by Michael Dondrup

I used the same data as stated in that Nature protocol (data for chromosome X of Homo sapiens), available at ftp://ftp.ccb.jhu.edu/pub/RNA_Seq_protocol/chrX_data.tar.gz

written 3.1 years ago by iwanicka1

Ballgown requires 4 biological replicates (mentioned in the protocol). How many did you use? When I tried with 3, I ran into the same issue. For fewer than 4 replicates, they recommend other methods.

written 3.1 years ago by Satyajeet Khare

There are only 3 biological replicates in that protocol. Did you get to the end of it with the same results as reported (with only 9 transcripts differentially expressed)?

written 3.1 years ago by iwanicka1

You are right. There are only 3 replicates. However, here is what they say for n < 4.

"Note that Ballgown’s statistical test is a standard linear model based comparison. For small sample sizes (n < 4 per group) it is often better to perform regularization. This can be done using the limma [33] package in Bioconductor. Other regularized methods such as DESeq [23] and edgeR [20] can be applied to gene or exon counts, but are not appropriate for direct application to FPKM abundance estimates. The statistical test uses a cumulative upper quartile normalization [34]."

For fewer than 4 replicates they recommend regularization using limma. Did you try that?

modified 3.1 years ago • written 3.1 years ago by Satyajeet Khare
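
Note on what "regularization" means here: with 3 replicates per group, each feature's variance estimate is very noisy. A moderated test of the kind limma uses shrinks each feature's variance toward a prior value before computing the t-statistic, and gains degrees of freedom in the process. The Python sketch below only illustrates that idea. The function names are hypothetical, and `prior_var` and `prior_df` are supplied by hand here, whereas limma estimates them from all features by empirical Bayes.

```python
import math
from statistics import mean, variance


def pooled_variance(a, b):
    """Pooled sample variance of two groups."""
    na, nb = len(a), len(b)
    return ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)


def t_statistic(a, b, prior_var=None, prior_df=0.0):
    """Two-group t-statistic; optionally shrink the variance toward a prior.

    prior_var and prior_df are assumed inputs in this sketch (hypothetical),
    not values estimated from the data.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    s2 = pooled_variance(a, b)
    if prior_var is not None:
        # Weighted average of the prior variance and the observed variance.
        s2 = (prior_df * prior_var + df * s2) / (prior_df + df)
        df += prior_df
    t = (mean(b) - mean(a)) / math.sqrt(s2 * (1 / na + 1 / nb))
    return t, df


# Hypothetical log-expression values, 3 replicates per group.
control = [1.0, 2.0, 3.0]
treated = [4.0, 5.0, 6.0]
print(t_statistic(control, treated))                               # ordinary t-test, 4 df
print(t_statistic(control, treated, prior_var=3.0, prior_df=4.0))  # moderated, 8 df
```

The extra degrees of freedom are what help with small sample sizes. For real data, use limma itself rather than this sketch.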

I haven't tried that. I should! Thanks for the tip.

written 3.1 years ago by iwanicka1