Hi everyone,
I am sure this question has been asked before, but I could not find it on any forum, so it would be great if someone could point me to an existing answer. My problem is this: I have done an RNA expression analysis with pyPRADA. It produces a GCT file with the Ensembl transcript ID, the HUGO symbol and the RPKM. It is quite normal for the same gene to have multiple transcript IDs, but I would like to collapse the RPKMs of all transcripts from the same gene so that I get the expression level of that gene. Is this possible? Is there a way, or a piece of software, to do it? Here is the beginning of my file:
#1.2
169385    1
Name    Description    RPKM
ENST00000456328    DDX11L1    0.0
ENST00000515242    DDX11L1    0.0
ENST00000518655    DDX11L1    0.0
ENST00000450305    DDX11L1    0.0
ENST00000423562    RP11-34P13.2    0.05727542191743851
ENST00000438504    RP11-34P13.2    0.05361339449882507
ENST00000541675    RP11-34P13.2    0.049743443727493286
ENST00000488147    RP11-34P13.2    0.029792414978146553
ENST00000538476    RP11-34P13.2    0.025426123291254044
ENST00000537342    RP11-34P13.2    0.008295455016195774
ENST00000430492    RP11-34P13.2    0.03160959482192993
ENST00000473358    RP11-34P13.3    0.0
ENST00000469289    RP11-34P13.3    0.01880820095539093
ENST00000408384    hsa-mir-1302-2    0.07291585206985474
ENST00000417324    FAM138A    0.0
ENST00000461467    FAM138A    0.0
Thank you
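For illustration, here is a minimal Python sketch of one way to do the collapsing described above: sum the RPKM values of all transcripts that share the same symbol in the Description column. It assumes the GCT layout shown in the post (version line, dimensions line, column header, then one row per transcript) and that no field contains spaces. The function name `collapse_rpkm_by_gene` is made up for this example and is not part of pyPRADA. Note that summing transcript-level RPKMs is only an approximation of gene-level expression; it is not the same as recomputing RPKM from gene-level read counts and a gene length.

```python
from collections import defaultdict


def collapse_rpkm_by_gene(lines):
    """Sum transcript-level RPKM values per gene symbol from GCT-formatted lines.

    Hypothetical helper for illustration. Assumes whitespace-separated fields
    and that the second column (Description) holds the gene symbol.
    """
    rows = [line.split() for line in lines if line.strip()]
    # GCT layout: row 0 = version, row 1 = dimensions, row 2 = column header.
    header, data = rows[2], rows[3:]
    samples = header[2:]  # every column after Name and Description is a sample
    totals = defaultdict(lambda: [0.0] * len(samples))
    for _transcript_id, gene, *values in data:
        for i, value in enumerate(values):
            totals[gene][i] += float(value)
    return samples, dict(totals)


# A few rows copied from the file excerpt in the post.
example = """#1.2
5\t1
Name\tDescription\tRPKM
ENST00000456328\tDDX11L1\t0.0
ENST00000515242\tDDX11L1\t0.0
ENST00000473358\tRP11-34P13.3\t0.0
ENST00000469289\tRP11-34P13.3\t0.01880820095539093
ENST00000408384\thsa-mir-1302-2\t0.07291585206985474
"""

samples, gene_rpkm = collapse_rpkm_by_gene(example.splitlines())
print("Gene", *samples, sep="\t")
for gene, values in gene_rpkm.items():
    print(gene, *values, sep="\t")
```

For a real file, the lines can be read with `open("sample.gct")` (the file name is a placeholder) and passed to the same function. Because the per-sample columns are handled as a list, the sketch also works for GCT files with more than one sample column.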
Hi Amitm, thank you for your answer.
Do you mean that StringTie gives RPKM? Normally, Cufflinks gives FPKM. I am interested in RPKM, not in count-based measures. And from what I see, you have used quite a lot of different tools. Another question: should I normalize between samples when I use RPKM?