For comparisons without replicates, I am using Kal's Z-test on proportions. Does it make more sense to run Kal's test on RPKM values or on raw counts, and why? I would think that RPKM is better because it is normalized by gene length: aren't we more interested in (transcripts_of_geneA/total_transcripts) than in (reads_of_geneA/total_reads)? I can see, however, that the two inputs can yield slightly different results in some cases.
I see conflicting information on each alternative. On the one hand, Kal's paper writes the equation as (n-specific mRNA reads/cell)/(N-total mRNA reads/cell); on the other hand, when I try to run the function on RPKM in CLC, it warns me that proportion tests are aimed at count data.
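To make the question concrete, here is a minimal Python sketch of how I understand the test when it is run on raw counts. It assumes the test is the standard pooled two-proportion Z-test. The function name `kal_z_test` and the example numbers are made up for illustration and are not taken from the paper or from CLC.

```python
import math


def kal_z_test(n1, N1, n2, N2):
    """Pooled two-proportion Z-test (assumed form of Kal's test).

    n1, n2: read counts for the gene in sample 1 and sample 2
    N1, N2: total read counts in sample 1 and sample 2
    Returns the Z statistic and a two-sided p-value.
    """
    p1 = n1 / N1
    p2 = n2 / N2
    # Pooled proportion under the null hypothesis p1 == p2
    p0 = (n1 + n2) / (N1 + N2)
    # Standard error of the difference in proportions
    se = math.sqrt(p0 * (1 - p0) * (1 / N1 + 1 / N2))
    z = (p1 - p2) / se
    # Two-sided p-value from the normal approximation
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: same proportions, but totals differ by a factor of 10
z_full, p_full = kal_z_test(100, 1_000_000, 200, 1_000_000)
z_scaled, p_scaled = kal_z_test(10, 100_000, 20, 100_000)
print(z_full, p_full)
print(z_scaled, p_scaled)
```

In this sketch the two calls use identical proportions, yet the statistic differs because the standard error depends on the absolute totals. If that is right, rescaling counts into RPKM would change the test's variance estimate, which may be what the CLC warning is about.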
Which is better?