Question: Clarification For RNA-Seq Normalization
Assa Yeroslaviz (1.2k) wrote, 8.1 years ago:

Hi everybody,

I have read a lot over the last few days about the differing opinions on RNA-seq normalization methods. To be honest, I'm quite confused at the moment, so I would like to ask for your help in clarifying which normalization method to use in which situation.

I'm sure there is no straightforward answer to such a question, but I would really appreciate even contradictory opinions if they help explain the problem to other users as well.

As far as I understand, there is no "standard" normalization method.

We have one RNA-seq experiment with only one sample each for control and treatment. Although nothing can be called significant without replicates, I would like to understand how to work with RNA-seq data in general.

We would like to look into both differential expression and differences in splice variants between the two conditions. I have read opinions on how best to normalize the data for identifying differentially expressed genes and for identifying isoforms. Apparently these two goals should be analysed differently. The best example of that was the discussion between Simon and lpachter about when to normalize, and how, here:

I think it shows how controversial this can be. I was interested in this discussion, though it is quite old and a lot has probably changed since.

RPKM (reads per kilobase per million mapped reads) measures the relative level of gene expression between experiments, but apparently some people are against it because of certain biases that it can't compensate for. In the post above, Simon mentions DESeq (edgeR), which is supposed to work better for differential expression.
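To make the concern concrete, here is a minimal sketch with made-up counts (the gene names, lengths, and counts are all hypothetical). It computes RPKM and, for contrast, a simplified version of the median-of-ratios scaling idea described for DESeq. This is an illustration under those assumptions, not the package's actual code.

```python
import math
from statistics import median


def rpkm(counts, lengths_bp):
    """Reads per kilobase of gene length per million mapped reads."""
    total_reads = sum(counts.values())
    if total_reads == 0:
        raise ValueError("sample has no mapped reads")
    return {
        gene: count * 1e9 / (lengths_bp[gene] * total_reads)
        for gene, count in counts.items()
    }


def median_of_ratios_size_factors(samples):
    """Simplified median-of-ratios scaling factors, one per sample.

    For each gene, take the geometric mean of its counts across samples;
    a sample's factor is the median of count / geometric mean over genes.
    Genes with a zero count in any sample are skipped.
    """
    names = list(samples)
    genes = [g for g in samples[names[0]]
             if all(samples[s][g] > 0 for s in names)]
    if not genes:
        raise ValueError("no gene has non-zero counts in every sample")
    geo_mean = {
        g: math.exp(sum(math.log(samples[s][g]) for s in names) / len(names))
        for g in genes
    }
    return {
        s: median(samples[s][g] / geo_mean[g] for g in genes)
        for s in names
    }


# Hypothetical toy data: only g4 changes between the two conditions.
lengths = {"g1": 1000, "g2": 2000, "g3": 1500, "g4": 1000}
samples = {
    "control":   {"g1": 100, "g2": 200, "g3": 300, "g4": 400},
    "treatment": {"g1": 100, "g2": 200, "g3": 300, "g4": 4000},
}

rpkm_control = rpkm(samples["control"], lengths)
rpkm_treatment = rpkm(samples["treatment"], lengths)
size_factors = median_of_ratios_size_factors(samples)

for gene in lengths:
    print(gene, round(rpkm_control[gene]), round(rpkm_treatment[gene]))
print(size_factors)
```

In this toy case g1 to g3 have identical raw counts in both samples, yet their RPKM values drop in the treatment sample because the single strongly induced gene g4 inflates the total read count (1000 versus 4600 reads). The median-based factors stay at about 1 for both samples, because the median ignores the one outlying gene.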

So my questions are:

  1. Would it be better to normalize the data twice, separately for the two goals?

  2. Does it make sense to apply one normalization after the other?

  3. Can I rely on cuffdiff/cuffcompare to give me a good estimate of the splice variants, and on DESeq/DEGseq to give me a good estimate of the differentially expressed genes?

I would appreciate every comment or discussion.

Thanks A.


Could you clarify your second question a bit? What do you mean by "one time after the other"? Do you mean to ask whether it makes sense to apply two normalization methods sequentially?

written 8.1 years ago by Chris Evelo (10.0k)

Yes, exactly. I know from earlier microarray experiments that running two normalizations in sequence shifts the values further. That can be good, but not necessarily. So is it a good idea to run two sequential normalization procedures here, or is it better to run the two analyses completely separately from each other?

written 8.1 years ago by Assa Yeroslaviz (1.2k)
Ido Tamir (5.0k) wrote, 8.1 years ago:

On the question of how to combine different normalization methods, I can't give you an answer beyond saying that you will probably violate the input assumptions of the second method.

The discussion on normalization has moved on a lot, and it has been shown (cqn) that you can have, e.g., sample-specific (not gene-specific!) GC bias that you can correct for and for which RPKM or global scaling is not enough. cqn also briefly discusses, and gives references to, other normalization methods.

So first check whether you find biases in your data, e.g. a GC/RPKM relationship that differs between samples, and then decide whether you need to apply normalization.
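As a rough illustration of such a check, the sketch below fits a straight line to the per-gene log2 ratio between two samples against GC content. The gene names, GC fractions, counts, and the pseudocount of 0.5 are all made-up assumptions, and this is only a crude diagnostic, not the method cqn implements.

```python
import math


def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line through the points (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def gc_log_ratio_slope(gc_fraction, counts_a, counts_b, pseudocount=0.5):
    """Slope of log2(sample B / sample A) against gene GC fraction.

    A slope near zero suggests no sample-specific GC trend; a clearly
    non-zero slope suggests the two samples respond differently to GC.
    """
    genes = sorted(gc_fraction)
    x = [gc_fraction[g] for g in genes]
    y = [
        math.log2((counts_b[g] + pseudocount) / (counts_a[g] + pseudocount))
        for g in genes
    ]
    return least_squares_slope(x, y)


# Hypothetical toy data: the "biased" sample gains reads as GC rises.
gc = {"g1": 0.30, "g2": 0.40, "g3": 0.50, "g4": 0.60}
control = {"g1": 100, "g2": 100, "g3": 100, "g4": 100}
treatment_biased = {"g1": 50, "g2": 100, "g3": 200, "g4": 400}

biased_slope = gc_log_ratio_slope(gc, control, treatment_biased)
flat_slope = gc_log_ratio_slope(gc, control, control)
print(biased_slope, flat_slope)
```

With real data you would expect true expression changes to add scatter around the line, so a plot of the log ratio against GC content is usually more informative than the slope alone.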

And you need biological replicates.


Yes, I know about the replicates, but that is what I have to work with at the moment.

written 8.1 years ago by Assa Yeroslaviz (1.2k)