I have logCPM values that I'm trying to convert to RPKM. After searching, I found that the conversion can be done as follows:
RPKM = 2^(logCPM-log2(geneLength))
However, this is giving me negative RPKM values, as most of the gene lengths in log2 form are larger than the logCPM values (unless my math is incorrect).
Any help would be appreciated.
Why would you want to go from one reasonably good data distribution (logCPM) to one pretty bad one (RPKM)? RPKM was the first normalisation method for single-end RNA-seq reads but it is not suitable for cross-sample differential expression.
Kevin, I completely agree with you. However, a collaborator I am working with has specifically asked for RPKM values. I believe they also understand its limitations compared to logCPM, etc.
I understand - I've been in those situations. Did you obtain your formula from here: http://seqanswers.com/forums/showthread.php?t=59202 ?
I will 'nudge' Devon to see what he says.
I did! Thanks, I appreciate it. I should also note that these logCPM values are coming directly from edgeR!
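For what it's worth, here is a minimal sketch of the arithmetic behind the formula in the question, written in plain Python rather than edgeR. Two points it tries to illustrate: 2 raised to any power is always positive, so a negative number can only be the exponent (that is, log2 RPKM), not RPKM itself; and RPKM is per kilobase, so lengths given in bp need to be divided by 1000 first. The function names are made up for illustration, gene lengths are assumed to be in bp, and the sketch ignores the prior count that edgeR adds when it computes logCPM, so the back-transformed values are only approximate, especially for low-count genes.

```python
import math


def log_rpkm_from_log_cpm(log_cpm: float, gene_length_bp: float) -> float:
    """Return log2(RPKM) from a log2(CPM) value and a gene length in bp.

    Hypothetical helper for illustration; assumes log_cpm is on the log2 scale.
    """
    if gene_length_bp <= 0:
        raise ValueError("gene length must be positive")
    # RPKM = CPM / (length in kb), so on the log2 scale:
    # log2(RPKM) = log2(CPM) - log2(length_bp / 1000)
    return log_cpm - math.log2(gene_length_bp / 1000.0)


def rpkm_from_log_cpm(log_cpm: float, gene_length_bp: float) -> float:
    """Return RPKM on the natural (unlogged) scale; always positive."""
    return 2.0 ** log_rpkm_from_log_cpm(log_cpm, gene_length_bp)


# Gene with CPM = 8 (logCPM = 3) and length 2000 bp (2 kb)
print(log_rpkm_from_log_cpm(3.0, 2000))  # 2.0
print(rpkm_from_log_cpm(3.0, 2000))      # 4.0

# Gene with CPM = 2 (logCPM = 1) and length 4000 bp (4 kb):
# the log2 value is negative, but the RPKM itself is still positive.
print(log_rpkm_from_log_cpm(1.0, 4000))  # -1.0
print(rpkm_from_log_cpm(1.0, 4000))      # 0.5
```

If the negative numbers appear before the 2^ step is applied, they are log2 RPKM values, which is expected for genes whose RPKM is below 1.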