Question: RNA Expression Transcript Level Collapse
3.8 years ago
arunnatrajanravi wrote:

Hi everyone,

I am sure this question has been raised before, but I could not find it on any forum, so a pointer to an existing thread would be welcome. My problem: I ran an RNA expression analysis with pyPRADA, which produces a gct file with the Ensembl transcript ID, HUGO symbol and RPKM. It is quite normal for one gene to have multiple transcript IDs, but I would like to collapse the RPKMs of all transcripts of the same gene to get the expression level of that gene. Is that possible, and is there software that does it?

Name    Description    RPKM
ENST00000456328    DDX11L1    0.0
ENST00000515242    DDX11L1    0.0
ENST00000518655    DDX11L1    0.0
ENST00000450305    DDX11L1    0.0
ENST00000423562    RP11-34P13.2    0.05727542191743851
ENST00000438504    RP11-34P13.2    0.05361339449882507
ENST00000541675    RP11-34P13.2    0.049743443727493286
ENST00000488147    RP11-34P13.2    0.029792414978146553
ENST00000538476    RP11-34P13.2    0.025426123291254044
ENST00000537342    RP11-34P13.2    0.008295455016195774
ENST00000430492    RP11-34P13.2    0.03160959482192993
ENST00000473358    RP11-34P13.3    0.0
ENST00000469289    RP11-34P13.3    0.01880820095539093
ENST00000408384    hsa-mir-1302-2    0.07291585206985474
ENST00000417324    FAM138A    0.0
ENST00000461467    FAM138A    0.0

Thank you,

3.8 years ago
Amitm wrote:


I haven't used pyPRADA, but there are many widely used programs for estimating RPKM values from aligned BAM files. You can start with Cufflinks or StringTie.

Both of them give gene-level estimates of RPKM (as well as transcript-level ones) and are fairly easy to set up.
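
If re-running the quantification is not practical, the collapse asked about in the question can also be sketched directly on the transcript table. The following is a minimal Python sketch, not pyPRADA functionality. It assumes a tab-delimited file with the single header row shown in the question (Name, Description, RPKM), and it assumes that transcript RPKMs can simply be summed per gene symbol, which should be checked against how pyPRADA computes its values. The function name collapse_to_gene is made up for illustration. A full GCT file normally starts with a version line and a dimensions line, which would have to be skipped first.

```python
import csv
import io
from collections import defaultdict


def collapse_to_gene(lines, delimiter="\t"):
    """Sum transcript-level RPKM values per gene symbol.

    `lines` is any iterable of text lines whose first line is the header
    `Name<TAB>Description<TAB>RPKM` (assumed layout, as in the question).
    """
    reader = csv.DictReader(lines, delimiter=delimiter)
    totals = defaultdict(float)
    for row in reader:
        # "Description" holds the gene symbol, "RPKM" the transcript value.
        totals[row["Description"]] += float(row["RPKM"])
    return dict(totals)


# A few rows copied from the question, joined with tabs for the example.
example = io.StringIO(
    "Name\tDescription\tRPKM\n"
    "ENST00000473358\tRP11-34P13.3\t0.0\n"
    "ENST00000469289\tRP11-34P13.3\t0.01880820095539093\n"
    "ENST00000408384\thsa-mir-1302-2\t0.07291585206985474\n"
    "ENST00000417324\tFAM138A\t0.0\n"
    "ENST00000461467\tFAM138A\t0.0\n"
)

gene_rpkm = collapse_to_gene(example)
for gene, value in gene_rpkm.items():
    print(gene, value)
```

Genes with a single transcript keep their value, and genes whose transcripts are all 0.0 stay at 0.0. For a real file, pass an open file handle instead of the in-memory example.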

You may also try count-based measures, which simply count the number of reads aligning within the given coordinates (from a BED file) of each gene, for example HTSeq and featureCounts. Both of these can be used to get gene-level summaries of expression.
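
Since the question is about RPKM rather than raw counts, it may help to see how the two relate. Below is a small Python sketch using the usual definition of RPKM (reads per kilobase of feature per million mapped reads). The function name and the example numbers are illustrative only, and which gene length to use (for example, the total exon length) is a choice the sketch leaves open.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads.

    RPKM = read_count * 1e9 / (feature_length_bp * total_mapped_reads)
    """
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads, 2,000 bp long, in a library of 20 million mapped reads.
value = rpkm(500, 2000, 20_000_000)
print(value)
```

With these made-up numbers the gene has 500 reads over 2 kb in a library of 20 million reads, that is 250 reads per kilobase divided by 20 (millions of reads).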


Hi Amitm, thank you for your answer.

Do you mean that StringTie gives RPKM? Normally, Cufflinks gives FPKM. I am interested in RPKM, not in count-based measures. I see that you have used a lot of software, so another question: should I normalize between samples when I use RPKM?

written 3.8 years ago by arunnatrajanravi
