Hi guys, I am planning to perform a pan-cancer gene expression analysis across several cancer types. However, I found that the TCGA data portal has been replaced by the GDC. After carefully checking the harmonized data in the GDC, I am now wondering which file I should use for gene expression analysis: FPKM or FPKM-UQ? What are the differences between the two file types? Previously, I used the files with the suffix "rsem.genes.normalized_results" for gene expression analysis. Is FPKM the same as the "*.rsem.genes.normalized_results" file? If so, when should we use FPKM-UQ? Any help would be really appreciated. Thanks
This link might also be helpful:
RPKM (reads per kilobase per million mapped reads)
Upper Quartile (UQ)
See this link:
and this paragraph inside:
"Quantification of Ribo-seq and QTI-seq. Reads per kilobase per million reads (RPKM) value was calculated to quantify the ribosome occupancy of mRNA for CHX profiling (ref 20). A window centering the predicted TIS codon (−1, +4) was summarized to represent the abundance of translation initiation signal. To facilitate the comparison between different experimental conditions, we applied upper quartile (UQ) normalization to each predicted TIS codon on the basis of the population of total QTI-seq read count of each individual mRNA (ref 35). The fold changes of translational signal between two experimental conditions for both LTM and CHX profiling data were normalized by fold changes of RNA-seq FPKM values of the corresponding mRNAs".
"In statistics and the theory of probability, quantiles are cutpoints dividing the range of a probability distribution into contiguous intervals with equal probabilities, or dividing the observations in a sample in the same way. There is one less quantile than the number of groups created. Thus quartiles are the three cut points that will divide a dataset into four equal-size groups (cf. depicted example). Common quantiles have special names: for instance quartile, decile (creating 10 groups: see below for more). The groups created are termed halves, thirds, quarters, etc., though sometimes the terms for the quantile are used for the groups created, rather than for the cut points." (Wikipedia)
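To make the FPKM vs FPKM-UQ difference concrete, here is a small Python sketch. It assumes the formulas as I understand them from the GDC description: FPKM divides each gene's read count by the total read count and the gene length, while FPKM-UQ replaces the total with the 75th percentile (upper quartile) of the per-gene read counts. The gene names, counts and lengths are made up, and the helper functions (percentile, fpkm, fpkm_uq) are my own illustration, not GDC code. The real pipeline restricts the denominator to protein-coding genes, which this toy example ignores.

```python
def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between sorted values."""
    xs = sorted(values)
    if not xs:
        raise ValueError("percentile of an empty list is undefined")
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def fpkm(counts, lengths):
    """Assumed formula: count * 1e9 / (total count in sample * gene length in bp)."""
    total = sum(counts.values())
    return {g: c * 1e9 / (total * lengths[g]) for g, c in counts.items()}


def fpkm_uq(counts, lengths):
    """Assumed formula: count * 1e9 / (75th percentile count * gene length in bp)."""
    uq = percentile(list(counts.values()), 75)
    return {g: c * 1e9 / (uq * lengths[g]) for g, c in counts.items()}


# Hypothetical toy sample: five genes, all 1000 bp long.
lengths = {"A": 1000, "B": 1000, "C": 1000, "D": 1000, "E": 1000}
sample1 = {"A": 100, "B": 200, "C": 300, "D": 400, "E": 1000}
# Same sample, except that gene E is ten times more highly expressed.
sample2 = {"A": 100, "B": 200, "C": 300, "D": 400, "E": 10000}

for name, sample in (("sample1", sample1), ("sample2", sample2)):
    print(name, "FPKM(A) =", fpkm(sample, lengths)["A"],
          "FPKM-UQ(A) =", fpkm_uq(sample, lengths)["A"])
```

In this toy case the count for gene A is identical in both samples, but its FPKM drops in sample2 because the single highly expressed gene E inflates the total. The upper quartile is the same in both samples, so FPKM-UQ for gene A does not change. That robustness to a few very highly expressed genes is the usual argument for the UQ variant when comparing across samples.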