edit: All the information below is correct, but unimportant. If all your tumor samples are from one source (or several; I did not look carefully into the sites you cited), and all your normal samples are from another, you cannot calculate meaningful fold-changes between them. Mathematically you can, but you won't know whether the differences are due to tumor vs. normal tissue (what you want) or to differences in lab protocols, sample prep kits, sequencing technology, and so forth.
Besides, if you do not understand what a sample is in the RNAseq experiments you want to analyse, you should read more papers (1) / tutorials (2, 3, 4, 5) / courses.
Sorry to be so blunt, hope it doesn't put you off.
See this answer for a formula, this link for a video (youngsters nowadays seem to prefer video to text), this repo for a Python script, and here for an R script.
A very useful link for this kind of stuff is https://www.google.com
I found this equation.
TPM = FPKM / (sum of FPKM over all genes/transcripts) * 10^6
Does "the sum of FPKM over all genes" mean the sum of all FPKM values in one sample, or across all samples?
TPM (like FPKM) is a within-sample normalization, so it is the sum of the FPKM values from one sample.
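To make the within-sample point concrete, here is a minimal Python sketch of the conversion. The function name `fpkm_to_tpm`, the nested-dict layout, and the transcript names and values are all made up for illustration; they are not from your file. The only thing that matters is that the denominator is computed separately for each sample.

```python
def fpkm_to_tpm(fpkm_by_sample):
    """Convert {sample: {transcript: FPKM}} to TPM.

    The denominator is the sum of FPKM values within ONE sample,
    never across samples.
    """
    tpm_by_sample = {}
    for sample, fpkm in fpkm_by_sample.items():
        total = sum(fpkm.values())  # per-sample sum over all transcripts
        if total == 0:
            # Nothing expressed in this sample: avoid dividing by zero.
            tpm_by_sample[sample] = {t: 0.0 for t in fpkm}
        else:
            tpm_by_sample[sample] = {t: v / total * 1e6 for t, v in fpkm.items()}
    return tpm_by_sample


# Hypothetical values: two samples, three transcripts each.
example = {
    "liver": {"tx_a": 80.0, "tx_b": 15.0, "tx_c": 5.0},
    "lung": {"tx_a": 1.5, "tx_b": 0.5, "tx_c": 2.0},
}

tpm = fpkm_to_tpm(example)
for sample, values in tpm.items():
    print(sample, values)
```

After the conversion, the TPM values of each sample should add up to one million (up to floating-point rounding), which is a quick sanity check that the sum was taken per sample.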
edit: I think you got the formula wrong, shouldn't it be:
Sorry, that was a poorly worded question.
I meant: does "the sum of RPKM over all genes" mean the sum of all RPKM values for one transcript, rather than for all transcripts?
The text file I have contains many lines like the one below, each giving the RPKM values for one transcript.
For example:
entrez_gene_id: ENSG00000169136 ensembl_gene_id hgnc_symbol: ATF5 transcript:NM_012068 transcript_length:2259 adipose: 2.016 colon: 1.685 heart: 2.113 hypothalmus: 1.057 kidney: 2.347 liver: 78.892 lung :1.5 ovary:1.948 skeletalmuscle: 1.093 spleen: 1.911 testes:2.485
Do I add all these numbers up for the sum of all RPKM values?
Imagine you have only one sample, and you have the FPKM values for each transcript from this sample. Then the TPM for transcript i is:
TPM_i = FPKM_i / (sum of FPKM_j over all transcripts j in that sample) * 10^6
If you have several samples s, then for each sample you sum the FPKM values from that particular sample, not from other samples or from all samples.
Sorry, I am new to this.
What exactly do you mean by sample?