Question: Normalization in RNA-seq
5.0 years ago by
Xin60 wrote:

Hiii dear friends.. 

I used the TopHat and Cufflinks suite to find differentially expressed genes.

I just followed the pipeline: QC -> TopHat -> Cufflinks -> Cuffmerge -> Cuffdiff -> CummeRbund

Then my boss asked me: "Did you do normalization?!" and "Did you use a t-test or Bayesian or other tests to normalize the data?" I said no.

What should I normalize and why??

If any of the Cufflinks tools does normalization, what kind of test does it use?

rna-seq • 2.0k views
modified 5.0 years ago by kanwarjag1.1k • written 5.0 years ago by Xin60

If you don't know the answer to "what should I normalize and why?" then you shouldn't be doing the analysis.

Also, I hope your boss didn't ask if you used a T test to normalize data. That'd be nonsensical.

written 5.0 years ago by Devon Ryan97k

The question he should have asked is "Did you read the CuffDiff paper or just look at the pipeline diagram?!"

written 5.0 years ago by Asaf8.4k

Or perhaps the PI should also read the widely used cuffdiff paper.

written 5.0 years ago by Chirag Nepal2.2k

I cannot resist commenting on Istvan's views. An increasing number of investigators are getting confused by RNA-seq analysis. I agree there are still a lot of challenges in RNA-seq analysis, but the load of commercial companies claiming that their tool can do everything is creating more chaos. We run into a lot of problems when a blind analysis is carried out, the lab then invests a lot of resources, and comes back saying that the results look weird or do not work. Sometimes, if one sample is causing trouble, a blind analysis will cause lots of problems. I think the day is not far off when some studies may need to be analyzed again. Having said all this, I think we have to be reasonably accommodating of simple questions about RNA-seq and encourage learners on this forum to go back and teach their PIs some good science.

modified 11 months ago by RamRS30k • written 5.0 years ago by kanwarjag1.1k

This is a good point - we just need to make sure to articulate this correctly.

RNA-Seq is a field that is rife with contradictions and lofty claims - time and again I am confused by what the RNA-Seq data shows, though we've run hundreds of analyses. Some analyses work with no effort whatsoever - others are a jumbled mess, and the results are a mess, and the tools should recognize this and warn about it, but they do not. They produce p-values like nobody's business. In each case, by that time the experiment is over, and tens of thousands of dollars and years of manpower have been put into it. It is too simplistic to say - well, you should not have done this or that, or you should have learned more about this or that before you even started the work. Once you dig into this deeper, it is less clear where responsibilities lie.

Bioinformatics publications always make a tool sound a lot better than it actually is. Are biologists actually responsible for deciphering what percent of that paper is actually valid?

The real change in bioinformatics needs to come from us - where we devise and publish software and protocols that actually work reliably - not just kind of work if one jumps through all kinds of hoops. The first step of that change comes from us, when we say yeah, it is awful that a tool can give you nonsensical results, and it is not your fault that it does not work as it claims to.

modified 11 months ago by RamRS30k • written 5.0 years ago by Istvan Albert ♦♦ 85k


You may want to have a look at this video to understand RNA-Seq normalisation and differential expression.

modified 5.0 years ago • written 5.0 years ago by Thibault D.690

To help yourself and the BioStars community, you should first show us what you found in pubmed/google/seqanswers/biostars/etc. searches about your question. Then you'll come up with "real" questions, e.g. why use RPKMs/TPMs/etc. Cheers.

modified 5.0 years ago • written 5.0 years ago by Israel Barrantes790
5.0 years ago by
Istvan Albert ♦♦ 85k
University Park, USA
Istvan Albert ♦♦ 85k wrote:

I'd like to urge everyone to be nicer and gentler to newcomers. Be supportive and a little more generous. You have managed to hurt the feelings of someone who came here for help.

RNA-Seq is a very complicated topic. The concept of normalization has been changed and reworked many times over. Most of the concepts and explanations that one would find via google searches are only partially right and some have been proven to be incorrect.

As for the original poster: if you used the TopHat pipeline then it has automatically applied a whole slew of normalizations and statistical tests that the original authors thought were appropriate.
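To make "what should I normalize and why" a bit more concrete, here is a minimal sketch in Python of the two simplest ideas: scaling by sequencing depth and by transcript length. Cufflinks reports expression as FPKM, but this toy code is not what Cufflinks or Cuffdiff runs internally; the function names and the numbers are made up for illustration only.

```python
def fpkm(counts, lengths_bp):
    """Fragments per kilobase of transcript per million mapped fragments.

    Divides each count by transcript length (in kb) and by the library
    size (in millions), so samples of different depth become comparable.
    """
    total = sum(counts)
    return [c * 1e9 / (total * length) for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then rescale
    so that the values of each sample sum to one million."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [r * 1e6 / total_rate for r in rates]


# Toy data (hypothetical): three transcripts measured in two libraries.
# Sample B is the same biology as sample A, sequenced twice as deeply.
lengths = [1000, 2000, 500]
sample_a = [100, 200, 50]
sample_b = [200, 400, 100]

fpkm_a = fpkm(sample_a, lengths)
fpkm_b = fpkm(sample_b, lengths)
tpm_a = tpm(sample_a, lengths)

# Raw counts differ by a factor of two between the samples,
# but the normalized values agree.
print(fpkm_a)
print(fpkm_b)
print(tpm_a)
```

Without this kind of scaling, every transcript in sample B would look "upregulated" simply because that library was sequenced more deeply. Note that normalization and statistical testing are separate steps: a t-test does not normalize anything, it is applied (if at all) after the values have been made comparable.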

modified 11 months ago by RamRS30k • written 5.0 years ago by Istvan Albert ♦♦ 85k