RNA Expression Transcript Level Collapse
8.2 years ago

Hi everyone,

I am sure this question has been asked before, but I couldn't find it on any forum, so I would appreciate it if someone could point me to an existing answer. My problem: I ran an RNA expression analysis with pyPRADA, which produces a GCT file containing the Ensembl transcript ID, HUGO symbol and RPKM. It is normal for one gene to have multiple transcript IDs, but I would like to collapse the RPKMs of all transcripts of the same gene to get the expression level of that gene. Is this possible? Is there a way, or a piece of software, to do it?

169385    1
Name    Description    RPKM
ENST00000456328    DDX11L1    0.0
ENST00000515242    DDX11L1    0.0
ENST00000518655    DDX11L1    0.0
ENST00000450305    DDX11L1    0.0
ENST00000423562    RP11-34P13.2    0.05727542191743851
ENST00000438504    RP11-34P13.2    0.05361339449882507
ENST00000541675    RP11-34P13.2    0.049743443727493286
ENST00000488147    RP11-34P13.2    0.029792414978146553
ENST00000538476    RP11-34P13.2    0.025426123291254044
ENST00000537342    RP11-34P13.2    0.008295455016195774
ENST00000430492    RP11-34P13.2    0.03160959482192993
ENST00000473358    RP11-34P13.3    0.0
ENST00000469289    RP11-34P13.3    0.01880820095539093
ENST00000408384    hsa-mir-1302-2    0.07291585206985474
ENST00000417324    FAM138A    0.0
ENST00000461467    FAM138A    0.0

Thank you

8.2 years ago
Amitm ★ 2.2k


I haven't used pyPRADA, but there are many widely used programs for estimating RPKM values from aligned BAMs. You can start with Cufflinks or StringTie.

Both of them give gene-level estimates of RPKM (as well as transcript-level ones) and are fairly easy to set up.

You may also try count-based measures, which just count the number of reads aligning within the coordinates of each gene (taken from an annotation file such as a GTF). HTSeq and featureCounts both work this way, and either can be used to get a gene-level summary of expression.
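
If re-running the quantification is not convenient, a rough alternative is to sum the transcript RPKMs per gene symbol directly from the GCT file. Below is a minimal Python sketch, assuming a single sample column and whitespace-separated fields as in the excerpt in the question; the function name `collapse_gct` is made up for illustration. Keep in mind that a plain sum is only an approximation if the tool counted shared reads toward several transcripts of the same gene.

```python
from collections import defaultdict


def collapse_gct(lines):
    """Sum transcript-level RPKM per gene symbol (the Description column)."""
    totals = defaultdict(float)
    in_table = False
    for line in lines:
        fields = line.split()
        if not in_table:
            # Skip the GCT preamble (version/dimension lines) up to the header row.
            in_table = fields[:2] == ["Name", "Description"]
            continue
        if len(fields) < 3:
            continue  # ignore blank or malformed rows
        gene, rpkm = fields[1], fields[2]
        totals[gene] += float(rpkm)
    return dict(totals)


# A few rows copied from the question, used here as a small self-contained input.
example = """169385 1
Name Description RPKM
ENST00000473358 RP11-34P13.3 0.0
ENST00000469289 RP11-34P13.3 0.01880820095539093
ENST00000417324 FAM138A 0.0
ENST00000461467 FAM138A 0.0
""".splitlines()

gene_rpkm = collapse_gct(example)
for gene, value in gene_rpkm.items():
    print(gene, value)
```

For a real file you would pass an open file handle instead of `example`, since the function only iterates over lines.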


Hi Amitm, thank you for your answer.

Do you mean that StringTie gives RPKM? Normally, Cufflinks gives FPKM. I am interested in RPKM, not in count-based measures. And since I see you have used a lot of software, another question: should I normalize between samples when I use RPKM?

