Question: To make difference or Ratio, it's a problem
affaid_tyj wrote:

Hello everyone,

I am trying to find evidence that gene A regulates gene B, using cDNA chip (e.g. hgu133a) and RNA-seq (e.g. TCGA-RNA HiSeq) datasets. I have already found a high correlation between the mRNA levels of the two genes (the correlation coefficient R for log2-transformed expression is at least 0.5 in all of the above datasets). My assumption is that if A is activated, the A/B ratio will be smaller across all samples within each dataset. For the chip data (probeset intensities) I used the ratio, and it worked well for survival prediction. For the RPKM data, however, I found that only the direct difference (Δ) between the A and B read counts predicts well. So do I have any justification for using Δ instead of the ratio for RPKM data? Can anyone recommend a relevant reference?

Thank you!

probeset rna-seq rpkm • 292 views
modified 6 weeks ago by Kevin Blighe • written 6 months ago by affaid_tyj
Kevin Blighe wrote:

Difficult to answer. All that I know is that RPKM data is not ideal: the normalisation method that produces RPKM expression values was one of the first developed for RNA-seq, but it has since been shown to be ineffective for cross-sample comparisons. Some have even questioned its use for within-sample comparisons. With your HTSeq counts, I would re-process these using DESeq2, edgeR, or limma/voom.
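To give a rough idea of what between-sample normalisation on raw counts involves, here is a minimal Python sketch of median-of-ratios size factors, the idea behind DESeq2's default size factor estimation. It is not the DESeq2 API (DESeq2 is an R package), and the count matrix and the function name `median_of_ratios_size_factors` are made up for illustration.

```python
import numpy as np

# Hypothetical raw counts: rows are genes, columns are samples.
# Sample 2 is sample 1 sequenced twice as deeply; sample 3 differs in one gene.
counts = np.array([
    [100.0, 200.0,  50.0],
    [300.0, 600.0, 150.0],
    [ 50.0, 100.0,  25.0],
    [ 80.0, 160.0, 400.0],
])

def median_of_ratios_size_factors(counts):
    """Per-sample size factors from the median ratio to a pseudo-reference.

    Illustrative helper only, not a function from any package.
    """
    # Use only genes with non-zero counts in every sample.
    keep = (counts > 0).all(axis=1)
    log_counts = np.log(counts[keep])
    # Pseudo-reference sample: per-gene geometric mean, on the log scale.
    log_geo_means = log_counts.mean(axis=1)
    # Size factor: median over genes of (sample / reference).
    log_ratios = log_counts - log_geo_means[:, None]
    return np.exp(np.median(log_ratios, axis=0))

size_factors = median_of_ratios_size_factors(counts)
normalised = counts / size_factors  # divides each sample (column) by its factor
```

In this toy matrix, the first two samples become identical after dividing by their size factors, because they differ only in sequencing depth. For a real analysis the packages named above should be used rather than a hand-rolled version.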

Kevin

modified 16 days ago • written 6 weeks ago by Kevin Blighe

Well, I think the ratio might be the better approach, since normalising the RPKM or even TPM of gene A to that of gene B (both genes are clearly expressed, and their variation across samples is equal) may give a more accurate evaluation of my prediction. The difference method seemed too crude for my question... I am now using the estimated counts from rsem-calculate-expression to calculate my ratio, and it works well.
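One possible reason the ratio behaves more consistently than the difference can be sketched in Python. Any per-sample scaling factor (for example sequencing depth) cancels in A/B but not in A - B. On the log2 scale, the ratio is simply the difference of the log-transformed values. All numbers below are made up for illustration.

```python
import numpy as np

# Hypothetical expression values for genes A and B in four samples.
a = np.array([10.0, 20.0, 40.0, 80.0])
b = np.array([5.0, 8.0, 10.0, 16.0])

# Hypothetical per-sample scaling factors, standing in for differences
# in sequencing depth or normalisation between samples.
scale = np.array([1.0, 2.0, 0.5, 3.0])

# The ratio is unchanged when both genes in a sample share a scaling factor.
ratio_raw = a / b
ratio_scaled = (a * scale) / (b * scale)
ratio_is_scale_invariant = np.allclose(ratio_raw, ratio_scaled)

# The raw difference is multiplied by the scaling factor, so it is not invariant.
diff_raw = a - b
diff_scaled = a * scale - b * scale
diff_is_scale_invariant = np.allclose(diff_raw, diff_scaled)

# log2(A / B) equals log2(A) - log2(B).
# Zero values would need a pseudocount, which is omitted here.
log_ratio = np.log2(a) - np.log2(b)
```

This only illustrates the arithmetic. It does not show which quantity predicts survival better in a given dataset.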

Thank you!

written 28 days ago by affaid_tyj

Did you mean 'not ideal'?

written 6 weeks ago by russhh

lol - yes, you already know I am somewhat against RPKM. Will modify.

written 6 weeks ago by Kevin Blighe

An update (6th October 2018):

You should abandon RPKM / FPKM. They are not ideal where cross-sample differential expression analysis is your aim; indeed, they render samples incomparable via differential expression analysis:

Please read this: A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis

"The Total Count and RPKM [FPKM] normalization methods, both of which are still widely in use, are ineffective and should be definitively abandoned in the context of differential analysis."

Also, by Harold Pimentel: What the FPKM? A review of RNA-Seq expression units

"The first thing one should remember is that without between sample normalization (a topic for a later post), NONE of these units are comparable across experiments. This is a result of RNA-Seq being a relative measurement, not an absolute one."
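As a rough illustration of the units discussed here, the following Python sketch computes RPKM and TPM from a made-up count matrix using their standard definitions. The counts, gene lengths, and function names are hypothetical. Neither unit is a substitute for between-sample normalisation.

```python
import numpy as np

# Hypothetical read counts: rows are genes, columns are samples.
counts = np.array([
    [100.0, 200.0],
    [300.0, 100.0],
    [600.0, 700.0],
])
# Hypothetical gene lengths in base pairs.
lengths_bp = np.array([1000.0, 2000.0, 3000.0])

def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    per_million = counts.sum(axis=0) / 1e6
    return counts / per_million / (lengths_bp[:, None] / 1e3)

def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalise first, then rescale."""
    rate = counts / (lengths_bp[:, None] / 1e3)
    return rate / rate.sum(axis=0) * 1e6

rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)

# TPM sums to one million in every sample; RPKM totals vary by sample.
rpkm_totals = rpkm_values.sum(axis=0)
tpm_totals = tpm_values.sum(axis=0)

# Within a sample, the ratio of two genes is the same in either unit,
# because the per-sample scaling constant cancels.
ratio_rpkm = rpkm_values[0] / rpkm_values[1]
ratio_tpm = tpm_values[0] / tpm_values[1]
```

Even though both samples here have the same total read count, their RPKM totals differ, which is one reason RPKM values are awkward to compare across samples.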

written 16 days ago by Kevin Blighe