Question: Transcript vs. Gene TPM counts?
9 months ago by
mike-zx210 wrote:

Working with some GTEx portal data right now and I've noticed that at GTEx's downloads page there are both a "Gene TPMs" and a "Transcript TPMs" file. My question is how exactly do these files differ from each other in terms of the steps for obtaining such files? I guess another way to phrase it would be why are there two files like this if RNA-Seq is supposed to output reads for transcripts in general? I would expect only 1 file with all the transcripts from GTEx instead of one that makes a distinction of gene vs. transcript... I'm obviously missing out on something rather elemental here but I don't know what it is.

Another minor question is whether it is safe to assume that the data in these files are normalized. As I understand it, the data being TPMs implies the read counts were normalized in the process of converting them to TPMs, but I'm not 100% sure about this. Thanks for any help.

9 months ago by
European Union
kristoffer.vittingseerup3.3k wrote:

1) Gene expression is obtained by summing the expression of all transcripts belonging to the same gene. 2) Yes, TPMs are normalised values, but you might still need to perform an inter-library normalization. You can read more about that here.
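To make point 1 concrete, here is a minimal sketch of the summing step. It is not GTEx's actual pipeline, and the IDs and TPM values (`TX_A1`, `GENE_A`, etc.) are made up for illustration; the function name `gene_tpm_from_transcripts` is likewise hypothetical.

```python
from collections import defaultdict


def gene_tpm_from_transcripts(transcript_tpm, transcript_to_gene):
    """Sum transcript-level TPMs into one TPM value per gene."""
    gene_tpm = defaultdict(float)
    for transcript_id, tpm in transcript_tpm.items():
        gene_id = transcript_to_gene[transcript_id]
        gene_tpm[gene_id] += tpm
    return dict(gene_tpm)


# Hypothetical values for a single sample.
transcript_tpm = {"TX_A1": 12.5, "TX_A2": 7.5, "TX_B1": 30.0}
transcript_to_gene = {"TX_A1": "GENE_A", "TX_A2": "GENE_A", "TX_B1": "GENE_B"}

gene_tpm = gene_tpm_from_transcripts(transcript_tpm, transcript_to_gene)
print(gene_tpm)  # {'GENE_A': 20.0, 'GENE_B': 30.0}
```

So the "Transcript TPMs" file would have one row per transcript, and the "Gene TPMs" file one row per gene holding the sum.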

9 months ago by
MatthewP660 wrote:

Hey, RNA-seq can output both gene and transcript read counts. TPM is normalized data, yes.
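For what "normalized" means here, a rough sketch of the usual TPM calculation may help: divide each count by the feature length, then rescale so the values in a sample sum to one million. The counts and lengths below are invented, and real pipelines typically use effective lengths estimated by the quantifier, which this sketch does not attempt.

```python
def counts_to_tpm(counts, lengths_kb):
    """Convert read counts to TPM for one sample.

    counts: feature id -> read count
    lengths_kb: feature id -> feature length in kilobases
    """
    # Step 1: length-normalize (reads per kilobase).
    rates = {feature: counts[feature] / lengths_kb[feature] for feature in counts}
    # Step 2: depth-normalize so the sample sums to one million.
    total = sum(rates.values())
    return {feature: rate / total * 1e6 for feature, rate in rates.items()}


# Hypothetical counts and lengths for a single sample.
counts = {"TX_A1": 100, "TX_A2": 300, "TX_B1": 600}
lengths_kb = {"TX_A1": 1.0, "TX_A2": 3.0, "TX_B1": 2.0}

tpm = counts_to_tpm(counts, lengths_kb)
print(tpm)
print(sum(tpm.values()))
```

Because every sample is forced to the same total, TPM corrects for length and sequencing depth within a sample, but it does not account for differences in RNA composition between samples.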


What exactly is being measured in "gene counts" though?

written 9 months ago by mike-zx210

Mike, I am very sorry if I am being pedantic and what I say below is too simplistic.

Here, the word "transcript" does not mean the mRNA product of the gene. The "Gene" and the "Transcript" are those defined in the gene definition file (GTF, or GFF/GFF3). For example, the HOXA1 gene in human has two transcripts according to Ensembl. So if GTEx used the Ensembl gene definitions, the "transcript TPM" file will have two values for it while the "gene TPM" file will have only one.
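As a small illustration of how the annotation file ties transcripts to genes, the sketch below reads the `gene_id` and `transcript_id` attributes from GTF-style lines. The three lines are made up (placeholder IDs, coordinates, and source), not real Ensembl records, and `transcript_to_gene_map` is a hypothetical helper name.

```python
import re

# GTF attributes look like: key "value"; key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def transcript_to_gene_map(gtf_lines):
    """Map each transcript_id to its gene_id using 'transcript' records."""
    mapping = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # only transcript records carry the pair we need
        attrs = dict(ATTR_RE.findall(fields[8]))
        mapping[attrs["transcript_id"]] = attrs["gene_id"]
    return mapping


# Made-up annotation: one gene with two transcripts.
gtf_lines = [
    'chr7\texample\tgene\t100\t900\t.\t-\t.\tgene_id "GENE_A";',
    'chr7\texample\ttranscript\t100\t900\t.\t-\t.\tgene_id "GENE_A"; transcript_id "TX_A1";',
    'chr7\texample\ttranscript\t100\t700\t.\t-\t.\tgene_id "GENE_A"; transcript_id "TX_A2";',
]

tx2gene = transcript_to_gene_map(gtf_lines)
print(tx2gene)  # {'TX_A1': 'GENE_A', 'TX_A2': 'GENE_A'}
```

With an annotation like this, a transcript-level table would carry two rows (`TX_A1`, `TX_A2`) and a gene-level table a single row (`GENE_A`).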

written 8 months ago by vj430

Not pedantic at all, this actually makes a lot of sense. Thank you.

written 7 months ago by mike-zx210

This is super helpful and exactly the answer I was looking for, thank you!!

written 4 weeks ago by Danielle B10

Reads mapped to the gene.

written 9 months ago by shoujun.gu310