Question: Normalising RPKMs from ribo zero and smarter kit
0
4.6 years ago by
ChIP490
Netherlands
ChIP490 wrote:

Hi all,

I am not sure how many of you have faced this problem. I have a set of RPKMs from an RNA-seq run that was performed using the Ribo-Zero method, and another set of RPKMs from an RNA-seq run that was performed using the SMARTer kit.

Because the two library preparation methods differ, a direct comparison is not meaningful. How can I normalise or scale these values so that I can compare them?

For one of the files I only have RPKMs and not the read counts, so it is a bit difficult.

Any ideas?

Thank you

rna-seq rpkm • 1.5k views
ADD COMMENTlink modified 4.6 years ago by geek_y9.6k • written 4.6 years ago by ChIP490
1

I think yes.

There will be a batch effect, and RPKM has variability issues that have been discussed several times on this forum. It is better to ignore RPKM and work with read counts per million instead. You can try quantile normalization to remove the batch effect arising from the different RNA-seq library preparations.
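
As a rough illustration of the quantile normalization mentioned above, here is a minimal pure-Python sketch. The function name `quantile_normalize` and the toy values are hypothetical, ties are broken by input order rather than averaged, and a real analysis would more likely use an established implementation from a statistics package.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of samples.

    `columns` is a list of equal-length lists, one list per sample,
    with genes in the same order in every sample. (Hypothetical helper.)
    """
    n_genes = len(columns[0])
    n_samples = len(columns)

    # Sort each sample independently, then average across samples at each rank.
    sorted_cols = [sorted(col) for col in columns]
    mean_quantiles = [
        sum(sc[rank] for sc in sorted_cols) / n_samples for rank in range(n_genes)
    ]

    normalized = []
    for col in columns:
        # Gene indices ordered by expression; sorted() is stable, so ties keep input order.
        order = sorted(range(n_genes), key=lambda i: col[i])
        out = [0.0] * n_genes
        for rank, gene_idx in enumerate(order):
            out[gene_idx] = mean_quantiles[rank]
        normalized.append(out)
    return normalized


# Toy data: three samples, four genes each (made-up numbers).
samples = [[5, 2, 3, 4], [4, 1, 6, 2], [3, 4, 6, 8]]
normalized = quantile_normalize(samples)
```

After normalization every sample has the same set of values (the rank-wise means), and only the assignment of those values to genes differs between samples.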

ADD REPLYlink written 4.6 years ago by Manvendra Singh2.0k

For one of the samples I only have RPKMs and not the tags. I have updated my question with this information. Any suggestions now?

ADD REPLYlink written 4.6 years ago by ChIP490

Am I correct in guessing that the thing you're interested in measuring is partitioned across the method batch-effect? If so, you might look into RUV-2, since you'll need to use control genes for normalization.

ADD REPLYlink written 4.6 years ago by Devon Ryan90k

Yes, it is a difference in the sample prep method for RNA-seq. What you are suggesting looks promising, but it is for microarrays, isn't it?

ADD REPLYlink written 4.6 years ago by ChIP490
2

The same method applies. They have a later paper that describes the method in an RNAseq context.

ADD REPLYlink written 4.6 years ago by Devon Ryan90k
1

In the end you are working with numbers that represent expression levels (which could be normalized by some endogenous control), so it doesn't matter whether they come from RNA-seq or microarray.

ADD REPLYlink written 4.6 years ago by Manvendra Singh2.0k
1

While this is true, it should be noted that the values derived from RNAseq aren't independent of each other (e.g., an increase in signal from gene A will lead to an apparent decrease in signal from gene B), which can affect how well some methods work. This is also part of the reason why RPKM stinks as a metric.
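
A toy calculation can make the dependence concrete. In this sketch the gene names, counts, and the shared length of 1000 bp are made up, and the total number of mapped reads is assumed to be simply the sum of the three counts.

```python
def rpkm(count, total_mapped_reads, length_bp):
    """RPKM = 10^9 * C / (N * L)."""
    return 1e9 * count / (total_mapped_reads * length_bp)


def rpkm_table(counts, length_bp=1000):
    # Assumption for this toy example: N is the sum of all gene counts.
    total = sum(counts.values())
    return {gene: rpkm(c, total, length_bp) for gene, c in counts.items()}


# Only gene A changes between the two samples; B and C have identical counts.
sample_1 = {"A": 100, "B": 100, "C": 100}
sample_2 = {"A": 400, "B": 100, "C": 100}

rpkm_1 = rpkm_table(sample_1)
rpkm_2 = rpkm_table(sample_2)

# Gene B's RPKM drops even though its read count did not change.
print(rpkm_1["B"], rpkm_2["B"])
```

Here the RPKM of gene B halves between the two samples purely because gene A takes up a larger share of the library.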

ADD REPLYlink written 4.6 years ago by Devon Ryan90k
0
4.6 years ago by
geek_y9.6k
Barcelona/CRG/London/Imperial
geek_y9.6k wrote:

Convert the RPKM back to read counts using the formula RPKM = (10^9 * C)/(N * L), where N is the total number of mapped reads, C is the number of reads mapped to the feature (gene/exon), and L is the length of the feature in base pairs; that is, C = (RPKM * N * L)/10^9. A simple Perl/Python script or AWK will do that. Once you have raw read counts, you can play around with different normalisation methods.
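
A minimal Python sketch of that back-conversion follows. It assumes N and L are actually known and that the RPKMs were computed with exactly the formula above; the function name and the example numbers are hypothetical.

```python
def rpkm_to_counts(rpkm, total_mapped_reads, length_bp):
    """Invert RPKM = 10^9 * C / (N * L) to recover C.

    Only valid if the RPKM was computed with this plain formula
    and N (total mapped reads) and L (feature length in bp) are known.
    """
    return rpkm * total_mapped_reads * length_bp / 1e9


# Made-up example: a 2000 bp gene in a library of 20 million mapped reads.
count = rpkm_to_counts(rpkm=12.5, total_mapped_reads=20_000_000, length_bp=2000)
print(count)
```

Values recovered this way may not be integers if the original counts were fractional or the RPKMs were rounded, so they should be treated as approximations of the raw counts.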

ADD COMMENTlink written 4.6 years ago by geek_y9.6k
1

This is not advisable for two reasons:

  1. If you really only have RPKM, you can't get the value of N.
  2. If the RPKMs are from Cufflinks, this formula doesn't apply because it uses some additional magic.
ADD REPLYlink modified 4.6 years ago • written 4.6 years ago by Michael Dondrup46k

We may need to run Cufflinks with the --no-effective-length-correction option. The effective length is something we do not know.

ADD REPLYlink written 4.6 years ago by geek_y9.6k
1

We also don't know what sort of fractional counts are being used, which will muck up the negative-binomial based methods if ChIP wanted to use them. Since ChIP mentioned not having raw data for one of the datasets, then we're stuck thinking in purely RPKM (with all of the problems that that entails).

ADD REPLYlink written 4.6 years ago by Devon Ryan90k

Hi!

I do not have read counts; that is the problem. Otherwise the method you (geek_y) are suggesting would have been useful. Secondly, I have these values from mmseq (log mu), and I have taken the antilog of them with base e.

ADD REPLYlink modified 4.6 years ago • written 4.6 years ago by ChIP490