Question: logCPM to RPKM
gtasource wrote:

I have logCPM values I'm trying to convert to RPKM. Doing a search, I found you can convert it by doing the following:

RPKM = 2^(logCPM-log2(geneLength))

However, this is giving me negative RPKM values, as most of the gene lengths in log2 form are larger than the logCPM values (unless my math is incorrect.)

Any help would be appreciated.

rna-seq • 440 views
modified 4 months ago by Devon Ryan • written 4 months ago by gtasource

Why would you want to go from one reasonably good data distribution (logCPM) to one pretty bad one (RPKM)? RPKM was the first normalisation method for single-end RNA-seq reads but it is not suitable for cross-sample differential expression.

written 4 months ago by Kevin Blighe

Kevin, I completely agree with you. However, a collaborator I am working with has specifically asked for RPKM values. I do believe they also understand the limits of using it compared to logCPM, etc.

written 4 months ago by gtasource

I understand - I've been in those situations. Did you obtain your formula from here: http://seqanswers.com/forums/showthread.php?t=59202 ?

I will 'nudge' Devon to see what he says.

written 4 months ago by Kevin Blighe

I did! Thanks, I appreciate it. I should also note that these logCPM values are coming directly from edgeR!

modified 4 months ago • written 4 months ago by gtasource
Devon Ryan (Freiburg, Germany) wrote:

You can't get negative RPKMs from that formula, since 2 raised to any real number is always greater than 0. Subtracting the logs is the same as dividing the CPM by the gene length and then taking the log of that. Yes, that log can be negative, but you're then reversing it with 2^.

BTW, the gene length should be in kilobases, in case you didn't already know that.

written 4 months ago by Devon Ryan
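
To make the point above concrete, here is a minimal Python sketch of the back-transformation, assuming gene lengths are supplied in base pairs and converted to kilobases first. The function name `logcpm_to_rpkm` and the example values are hypothetical. Note also that edgeR's log-CPM values usually include a small prior count, so a back-transformed result like this is only approximate.

```python
import math

def logcpm_to_rpkm(log_cpm, gene_length_bp):
    """Back-transform a log2-CPM value to RPKM (hypothetical helper)."""
    # Gene length must be in kilobases, not base pairs.
    length_kb = gene_length_bp / 1000.0
    # 2^(logCPM - log2(length_kb)) equals CPM / length_kb.
    return 2 ** (log_cpm - math.log2(length_kb))

# Illustrative values only: logCPM = 3 (CPM = 8), gene length = 2000 bp (2 kb).
print(logcpm_to_rpkm(3.0, 2000))   # 8 CPM / 2 kb

# The exponent here is negative (1 - 3 = -2), yet the result is still positive.
print(logcpm_to_rpkm(1.0, 8000))   # 2 CPM / 8 kb
```

The exponent can be negative whenever log2 of the length in kilobases exceeds the logCPM, but the final value stays above zero.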

Thanks so much. I realized the data that was labeled as logCPM wasn't accurate. I went ahead and pulled the raw count tables, and calculated everything manually. Thanks for your help!

written 4 months ago by gtasource
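
For readers taking the same route from a raw count table, the manual calculation might look roughly like the sketch below. It assumes the library size is simply the sum of the counts in the table (edgeR can instead use normalized effective library sizes), and the gene names, counts, and lengths are made up for illustration.

```python
def rpkm_from_counts(counts, lengths_bp):
    """Compute RPKM per gene from raw counts for one sample (hypothetical helper).

    counts:     dict mapping gene -> raw read count
    lengths_bp: dict mapping gene -> gene length in base pairs
    """
    # Assumption: library size is the total of all counts in this table.
    library_size = sum(counts.values())
    # RPKM = count * 1e9 / (library size * gene length in bp)
    return {
        gene: count * 1e9 / (library_size * lengths_bp[gene])
        for gene, count in counts.items()
    }

# Made-up example: one million reads in total.
counts = {"geneA": 100, "geneB": 999_900}
lengths_bp = {"geneA": 2000, "geneB": 1000}
rpkm = rpkm_from_counts(counts, lengths_bp)
print(rpkm["geneA"])  # 100 reads per million, over 2 kb
```

Here geneA has 100 reads per million mapped reads across a 2 kb gene, which works out to an RPKM of 50.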