Question

How to summarize the expression of a gene when having expression data from different transcripts of that gene

2

17 months ago

jeni ▴ 90

Hi!

Imagine I have normalized expression values for two different transcripts of the same gene, but I am interested in studying expression at the gene level. What is the best way to get this? Would it be correct to calculate the mean of the expression levels of those two transcripts? What if one transcript is not expressed and the other is?

Thanks!

rnaseq expression • 864 views

updated 17 months ago by i.sudbery 20k • written 17 months ago by jeni ▴ 90

score 0 · Answer 1 · 2023-02-08

0

17 months ago

ATpoint 84k

Typically gene-level summarization is done by summing the counts of the transcripts per gene. Check the tximport package over at Bioconductor.

0

tximport does something a bit more clever than simply summing up (it factors transcript length into consideration).

If you're given TPMs, you can simply sum those up.
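As a rough illustration of summing rather than averaging, here is a minimal Python sketch. The transcript IDs, gene IDs, TPM values, and the helper name `sum_tpm_by_gene` are all made up for the example. It also shows the case from the question: a transcript that is not expressed simply contributes zero to the gene total, whereas taking the mean would halve the gene's value.

```python
from collections import defaultdict


def sum_tpm_by_gene(tx_tpm, tx2gene):
    """Sum transcript-level TPMs into gene-level TPMs.

    tx_tpm:  dict mapping transcript ID -> TPM (hypothetical input format)
    tx2gene: dict mapping transcript ID -> gene ID
    """
    gene_tpm = defaultdict(float)
    for tx, tpm in tx_tpm.items():
        gene_tpm[tx2gene[tx]] += tpm
    return dict(gene_tpm)


# Made-up values: geneA has one expressed and one silent transcript.
tx_tpm = {"tx1": 12.5, "tx2": 0.0, "tx3": 4.0}
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

gene_tpm = sum_tpm_by_gene(tx_tpm, tx2gene)
print(gene_tpm)
```

With these values geneA ends up at 12.5 TPM, the same as its single expressed transcript.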

17 months ago by dsull ★ 6.4k

0

tximport does calculate a weighted effective transcript length, but it doesn't use that in its calculation of gene-level expression: both counts and TPMs are just summed. The effective gene length (which is sum(tx_expression*tx_length/gene_expression)) is then used as an offset in differential testing platforms.
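To make that formula concrete, here is a small Python sketch of the arithmetic. It is not tximport's code; the function name `gene_level_summary`, the IDs, and the numbers are invented for illustration, and it simply skips genes with zero total abundance, which is a simplification.

```python
def gene_level_summary(tx_abundance, tx_length, tx2gene):
    """Return (gene abundance, abundance-weighted gene length).

    Gene abundance is the plain sum over the gene's transcripts.
    Gene length is sum(tx_abundance * tx_length) / gene_abundance.
    All inputs are dicts keyed by transcript ID (hypothetical format).
    """
    gene_abundance = {}
    weighted_length_sum = {}
    for tx, abundance in tx_abundance.items():
        gene = tx2gene[tx]
        gene_abundance[gene] = gene_abundance.get(gene, 0.0) + abundance
        weighted_length_sum[gene] = (
            weighted_length_sum.get(gene, 0.0) + abundance * tx_length[tx]
        )
    # Simplification: genes with zero total abundance get no length here.
    gene_length = {
        gene: weighted_length_sum[gene] / total
        for gene, total in gene_abundance.items()
        if total > 0
    }
    return gene_abundance, gene_length


# Made-up example: two transcripts of one gene.
tx_abundance = {"tx1": 30.0, "tx2": 10.0}
tx_length = {"tx1": 1000.0, "tx2": 2000.0}
tx2gene = {"tx1": "geneA", "tx2": "geneA"}

gene_abundance, gene_length = gene_level_summary(tx_abundance, tx_length, tx2gene)
print(gene_abundance, gene_length)
```

Here the gene abundance is 30 + 10 = 40, and the weighted length is (30*1000 + 10*2000) / 40 = 1250, closer to the length of the more abundant transcript.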

17 months ago by i.sudbery 20k