TCGA gene models and exon/isoform level expression data
22 months ago
dolevrahat ▴ 30


I have downloaded mRNA expression data from TCGA-BLCA using TCGAbiolinks. I noticed that the expression quantification (FPKM/counts) is given at the gene level.

I have two questions in this respect:

  1. How can I tell exactly which exons are included in the gene models whose expression levels are reported in the data? I had a look at the reference GTF file from the GDC (GDC.h38 GENCODE v22 GTF) and noticed that it includes multiple transcripts per gene.

  2. Is it possible to get expression levels at the exon or transcript level? I understand from this post that exon counts are available from the TCGA legacy data portal, and this does still work, but I was wondering whether they can also be obtained from the GDC.
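
To make the first question concrete, here is a minimal sketch of how I have been looking at the GTF: it collects every `exon` record per `gene_id` and merges overlapping exons from different transcripts into one union. The function names `exons_by_gene` and `merge_intervals` and the toy GTF lines are made up for illustration, and it assumes standard tab-separated GTF columns with a `gene_id "..."` attribute. Whether the gene-level values actually correspond to this union of exons is exactly what I am unsure about.

```python
import re
from collections import defaultdict


def exons_by_gene(gtf_lines):
    """Collect (chrom, start, end) exon intervals per gene_id from GTF lines."""
    exons = defaultdict(set)
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # keep only exon features
        match = re.search(r'gene_id "([^"]+)"', fields[8])
        if match:
            exons[match.group(1)].add((fields[0], int(fields[3]), int(fields[4])))
    return exons


def merge_intervals(intervals):
    """Merge overlapping or adjacent intervals (1-based, inclusive coordinates)."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2] + 1:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (chrom, merged[-1][1], max(merged[-1][2], end))
        else:
            merged.append((chrom, start, end))
    return merged


# Toy records (hypothetical gene "G1" with two transcripts), not real annotation.
toy_gtf = [
    "##description: toy example\n",
    'chr1\tTOY\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\tTOY\texon\t150\t300\t.\t+\t.\tgene_id "G1"; transcript_id "T2";\n',
    'chr1\tTOY\texon\t500\t600\t.\t+\t.\tgene_id "G1"; transcript_id "T2";\n',
]

union_exons = merge_intervals(exons_by_gene(toy_gtf)["G1"])
print(union_exons)
```

With the real file, the lines would come from opening the downloaded GTF (decompressed if needed) and passing the file object to `exons_by_gene`.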

Thanks in advance!

RNA-Seq TCGAbiolinks gene models TCGA • 501 views

