Question

Gene level from transcript level expression estimate.

6.5 years ago

mforde84 ★ 1.4k

Would it be appropriate to simply sum FPKM/RPKM/TPM-normalized values across transcripts to generate a gene-level estimate? Each transcript accounts for a certain percentage of a gene's total expression, so the sum over all associated transcripts should simply be the gene-level expression. Am I correct, or am I missing something?

RNA-Seq • 2.5k views

updated 6.4 years ago by eu.sanisa • 0 • written 6.5 years ago by mforde84 ★ 1.4k

I'd think the answer is no, depending on how 'gene level estimate' is defined. Since transcripts associated with a gene can share exons, I'd think summing counts across overlapping transcripts from the same gene would lead to multiple counting. Or did you mean summing reads over all exons associated with a gene locus?

6.5 years ago by Ahill ★ 1.9k

score 0 · Answer 1 · 2017-11-24

If by transcripts you mean isoforms, then I think it is OK. I was wondering the same, so I took a look at Trinity's align_and_estimate_abundance.pl. There is an option to give a "transcripts to gene map" and obtain a count matrix at the gene level. As I understand from the script, this uses either kallisto_trans_to_gene_results.pl or eXpress_trans_to_gene_results.pl. The latter contains the statement: "compute it based on the formula: FPKM_gene = FPKM_isoA + FPKM_IsoB + ... "
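
For illustration, the formula quoted above amounts to a group-by-and-sum over a transcript-to-gene map. Here is a minimal Python sketch of that idea. It is not the Trinity script itself; the function name `sum_to_gene_level`, the isoform IDs, and the abundance values are all made up for the example, and it assumes every transcript appears in the map.

```python
from collections import defaultdict


def sum_to_gene_level(transcript_abundance, transcript_to_gene):
    """Sum per-transcript abundances (e.g. FPKM or TPM) into per-gene totals.

    Hypothetical helper: assumes every transcript ID is present in
    transcript_to_gene (a missing ID raises KeyError).
    """
    gene_abundance = defaultdict(float)
    for transcript_id, value in transcript_abundance.items():
        gene_id = transcript_to_gene[transcript_id]
        gene_abundance[gene_id] += value
    return dict(gene_abundance)


# Made-up example values, not real data.
transcript_tpm = {
    "geneA_iso1": 10.0,
    "geneA_iso2": 2.5,
    "geneB_iso1": 7.0,
}
transcript_to_gene = {
    "geneA_iso1": "geneA",
    "geneA_iso2": "geneA",
    "geneB_iso1": "geneB",
}

gene_tpm = sum_to_gene_level(transcript_tpm, transcript_to_gene)
print(gene_tpm)
```

Here geneA ends up as the sum of its two isoforms and geneB keeps the value of its single isoform, which is the FPKM_gene = FPKM_isoA + FPKM_IsoB + ... pattern applied per gene.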