Question: How do you normalize Transcripts Per Million (TPM) to compare between samples?
2.5 years ago by ZheFrench (France)
ZheFrench wrote:

UPDATE:

The question was initially "TPM (Transcripts Per Million): can gene length, transcript length, or both be used?"

I changed the title because I found the answers to my first questions myself (so proud ^^). The post got a lot of views in a short time but few answers, and I still don't have the last word on the subject, so I keep updating it. I think it can be useful to others.

My post was initially about:

Can we calculate TPM directly from raw read counts (from STAR output, for example)?

"Divide the read counts by the length of each gene in kilobases. This gives you reads per kilobase (RPK). Count up all the RPK values in a sample and divide this number by 1,000,000. This is your “per million” scaling factor. Divide the RPK values by the “per million” scaling factor. This gives you TPM."

Do you use the total length of the gene, or just the sum of the exon lengths?

UPDATE: the sum of the exon lengths.
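
The recipe quoted above, with gene length taken as the summed exon length, can be sketched in a few lines of Python. This is only a minimal illustration: the function name `tpm`, the input lists, and the toy numbers are hypothetical and not taken from any tool mentioned in this thread.

```python
def tpm(counts, exon_lengths_bp):
    """Convert raw read counts to TPM.

    counts          -- raw read counts, one per gene (hypothetical input)
    exon_lengths_bp -- summed exon length of each gene, in base pairs
    """
    if len(counts) != len(exon_lengths_bp):
        raise ValueError("counts and exon_lengths_bp must have the same length")

    # Step 1: reads per kilobase (RPK) = count / (length in kilobases)
    rpk = [c / (length / 1000.0) for c, length in zip(counts, exon_lengths_bp)]

    # Step 2: "per million" scaling factor = sum of all RPK values / 1,000,000
    scaling = sum(rpk) / 1_000_000
    if scaling == 0:
        # No reads at all in this sample: every TPM is zero
        return [0.0] * len(rpk)

    # Step 3: TPM = RPK / scaling factor
    return [r / scaling for r in rpk]


# Toy data (made up): three genes whose counts are proportional to length
counts = [100, 200, 300]
lengths = [1000, 2000, 3000]
values = tpm(counts, lengths)
```

With these toy numbers all three genes end up with the same TPM, because their counts scale exactly with their length, and the TPM values of a sample always sum to one million by construction.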

I remember sometimes seeing transcript length used instead, but apparently only when you align to the transcriptome. Is that true?

For example, Salmon and kallisto give TPM values using pseudo-alignment methods.

I don't know whether it's correct to compute TPM from a genome alignment. The "Transcripts Per Million" unit makes more sense when you (pseudo-)align to a transcriptome, no?

Said differently, TPM values from pseudo-alignment (kallisto, Salmon) can't be compared with ones computed from a genome alignment. We need to know how someone produced their TPM values before comparing.

UPDATE: The transcriptome looks nicer. OK, I'll use Salmon and not try to recalculate TPM myself (by the way, the -g [ --geneMap ] argument will do the work for me).
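
For anyone wondering what going from transcript-level to gene-level values amounts to conceptually: given a transcript-to-gene map, the TPM values of all transcripts of a gene are summed. The sketch below only illustrates that idea and is not Salmon's actual implementation; the function name `gene_level_tpm`, the transcript IDs, and the numbers are all hypothetical.

```python
from collections import defaultdict


def gene_level_tpm(transcript_tpm, tx2gene):
    """Sum per-transcript TPM values into per-gene TPM values.

    transcript_tpm -- dict: transcript ID -> TPM (hypothetical input)
    tx2gene        -- dict: transcript ID -> gene ID (hypothetical map)
    """
    gene_tpm = defaultdict(float)
    for tx, value in transcript_tpm.items():
        gene = tx2gene.get(tx)
        if gene is None:
            # Transcript missing from the map: skipped in this sketch
            continue
        gene_tpm[gene] += value
    return dict(gene_tpm)


# Toy data (made up): two transcripts of geneA, one of geneB
transcript_tpm = {"tx1": 10.0, "tx2": 5.0, "tx3": 2.5}
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
genes = gene_level_tpm(transcript_tpm, tx2gene)
```

Because TPM is already a proportion of the whole sample, summing over a gene's transcripts keeps the per-sample total unchanged, as long as every transcript is present in the map.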

Refs:

https://groups.google.com/forum/#!topic/rsem-users/jJaeaSRG1eo

http://www.rna-seqblog.com/rpkm-fpkm-and-tpm-clearly-explained/

http://www.arrayserver.com/wiki/index.php?title=Omicsoft_RPKM/FPKM/Count_values

Calculating TPM from featureCounts output

https://gist.github.com/slowkow/c6ab0348747f86e2748b

https://haroldpimentel.wordpress.com/2014/05/08/what-the-fpkm-a-review-rna-seq-expression-units/

UPDATE - TPM normalisation quest: how do I do that properly without using Sleuth!? :/

https://www.biostars.org/p/143458/#157303

https://groups.google.com/forum/#!topic/sailfish-users/jBf9SGiH1AM

https://f1000research.com/articles/4-1521/v1

Written 2.5 years ago by ZheFrench (modified 2.5 years ago)

When I was confused by the different quantification measures, I used to read the explanations on this site:

https://haroldpimentel.wordpress.com/2014/05/08/what-the-fpkm-a-review-rna-seq-expression-units/

I hope it helps you a bit with your thinking!

Oops, I just saw that you have already read it... Sorry :(

Reply written 2.5 years ago by Roxane Boyer (modified 2.5 years ago)