Question: Difference or ratio of two genes' expression: which to use for RPKM data?
14 months ago by Yijun Tian (Medical College of Wisconsin)
Yijun Tian wrote:

Hello everyone,

I am trying to find evidence for the regulatory effect of gene A on gene B in cDNA chip (i.e. hgu133a) and RNA-seq (i.e. TCGA RNA-seq, HiSeq) datasets. I have already found a high correlation between the two genes' mRNA levels (the correlation coefficient R for log2-transformed expression is at least 0.5 in all of the above datasets). My assumption is that if A is activated, the A/B ratio will be smaller across all samples within each dataset. I computed the ratio for the chip data (probeset intensity) and it worked well for survival prediction, but for the RPKM data I found that only the direct difference (Δ) between the A and B read counts predicts well. So do I have a reason to use Δ instead of the ratio for RPKM data? Can anyone recommend a relevant reference?

Thank you!

probeset rna-seq rpkm • 498 views
modified 9 months ago by Kevin Blighe • written 14 months ago by Yijun Tian

9 months ago by Kevin Blighe (Republic of Ireland)
Kevin Blighe wrote:

Difficult to answer. All that I know is that RPKM data is not ideal: the normalisation method that produces RPKM expression values was one of the first forms of normalisation developed for RNA-seq, but it has since been shown to be ineffective for cross-sample comparisons. Some have even questioned within-sample comparisons. With your HTSeq counts, I would re-process these using DESeq2, edgeR, or limma/voom.
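To illustrate why RPKM values are awkward to compare across samples, here is a minimal sketch in plain Python. The gene lengths and counts are made-up toy numbers, and the function names `rpkm` and `tpm` are only for illustration (they are not taken from any package). It shows that per-sample RPKM totals differ between samples, whereas TPM totals are fixed at one million by construction. Neither unit replaces a proper between-sample normalisation.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total = sum(counts)
    return [c * 1e9 / (total * l) for c, l in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalise first, then rescale to 1e6."""
    rates = [c / l for c, l in zip(counts, lengths_bp)]
    denom = sum(rates)
    return [r * 1e6 / denom for r in rates]


# Toy data (assumed values): three genes, two samples.
lengths_bp = [1000, 2000, 4000]
sample1 = [100, 200, 400]
sample2 = [100, 200, 4000]  # the third gene is much more abundant here

rpkm1, rpkm2 = rpkm(sample1, lengths_bp), rpkm(sample2, lengths_bp)
tpm1, tpm2 = tpm(sample1, lengths_bp), tpm(sample2, lengths_bp)

# RPKM totals differ between samples; TPM totals are both 1e6.
print(sum(rpkm1), sum(rpkm2))
print(sum(tpm1), sum(tpm2))
```

Note that genes 1 and 2 have identical raw counts in both samples, yet their RPKM values drop in sample 2 simply because gene 3 takes up a larger share of the library.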


modified 8 months ago • written 9 months ago by Kevin Blighe

Well, I am thinking the ratio might be a better way, since normalizing the RPKM or even TPM of gene A to that of gene B (both genes are clearly expressed and their variation across samples is equal) may give a more accurate evaluation of my prediction. The difference method seemed too crude for my question... I am now using the estimated counts from rsem-calculate-expression to calculate my ratio, and it worked well.

Thank you!
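One property that may be relevant to the ratio-versus-difference question: a within-sample ratio of two genes is unchanged when both values are multiplied by the same per-sample scaling factor, while their difference scales with that factor. A minimal sketch follows, with made-up expression values and hypothetical helper names (`log2_ratio`, `difference`); it makes no claim about which measure predicts survival better.

```python
import math


def log2_ratio(a, b):
    """log2(A/B), which equals log2(A) - log2(B)."""
    return math.log2(a / b)


def difference(a, b):
    """Plain difference A - B on the linear scale."""
    return a - b


# Toy expression values for genes A and B in one sample (assumed numbers).
a, b = 8.0, 2.0

# The same sample under a different scaling (e.g. a different library-size factor).
k = 2.0
a_scaled, b_scaled = a * k, b * k

print(log2_ratio(a, b), log2_ratio(a_scaled, b_scaled))  # ratio is unchanged
print(difference(a, b), difference(a_scaled, b_scaled))  # difference scales with k
```

On the log2 scale the ratio and the difference are the same quantity, so the choice mainly matters for values kept on the linear scale.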

written 8 months ago by Yijun Tian

Did you mean 'not ideal'?

written 9 months ago by russhh

lol - yes, you already know I am somewhat against RPKM. Will modify.

written 9 months ago by Kevin Blighe

An update (6th October 2018):

You should abandon RPKM / FPKM. They are not ideal where cross-sample differential expression analysis is your aim; indeed, they render samples incomparable via differential expression analysis:

Please read this: A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis

"The Total Count and RPKM [FPKM] normalization methods, both of which are still widely in use, are ineffective and should be definitively abandoned in the context of differential analysis."

Also, by Harold Pimentel: What the FPKM? A review of RNA-Seq expression units

"The first thing one should remember is that without between sample normalization (a topic for a later post), NONE of these units are comparable across experiments. This is a result of RNA-Seq being a relative measurement, not an absolute one."
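As a rough illustration of what between-sample normalization can look like, below is a simplified sketch of the median-of-ratios idea for estimating per-sample size factors from raw counts. It is written in plain Python with toy numbers; the function name `size_factors` is hypothetical, and this is not the DESeq2 implementation, only the general idea under simplifying assumptions (genes with a zero count in any sample are skipped).

```python
import math
from statistics import median


def size_factors(count_matrix):
    """Estimate one size factor per sample.

    count_matrix: list of genes, each a list of raw counts (one per sample).
    """
    n_samples = len(count_matrix[0])
    # Use only genes with non-zero counts in every sample.
    usable = [row for row in count_matrix if all(c > 0 for c in row)]
    # Log of the per-gene geometric mean across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Toy counts (assumed): sample 2 was sequenced twice as deeply as sample 1.
counts = [
    [10, 20],
    [100, 200],
    [50, 100],
    [0, 3],  # skipped: contains a zero
]

factors = size_factors(counts)
normalized = [[c / f for c, f in zip(row, factors)] for row in counts]
print(factors)
print(normalized[0])
```

After dividing by the size factors, the first three genes have equal values in both samples, which is what one would expect if the only difference between the samples is sequencing depth.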

written 8 months ago by Kevin Blighe