7.6 years ago
kelly.wang135
Hi, I am new to RNA-sequencing data and have one question regarding normalization.
I could download RNA-sequencing data as both read counts and RPKM. I thought I could use RPKM for eQTL analysis, but I am a little confused because of this sentence on the following page of the GTEx website.
"The RPKM values that are downloadable have not been normalized or corrected for any covariates." https://gtexportal.org/home/documentationPage#staticTextLabMethods
Does this mean I should normalize the RPKM values or the read counts before eQTL analysis?
Thanks for your help!
RPKM, FPKM, and TPM are not used for any statistics. Period. They are normalized expression values, closer to absolute expression, that are used for visualization. If you want to do anything statistical, use the raw read counts and apply the normalization methods that we usually perform in any RNA-Seq analysis.
This is not true: Cufflinks uses FPKM for differential expression testing. Whether FPKM is a good measure is another discussion; the short answer is that it is not.
That said, GTEx used read counts for the eQTL analysis, you can follow their protocol here.
I might not have read it carefully, but the link says the transcript quantification is done by Flux Simulator or Cufflinks. Cufflinks does not perform differential expression testing; that is Cuffdiff2. Cufflinks does the transcript quantification.
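For readers following the thread, here is a minimal sketch of how RPKM and TPM relate to raw read counts, using the standard textbook definitions. The function names and the toy counts and gene lengths are made up for illustration; this is not the GTEx pipeline. It only does within-sample scaling, with no between-sample normalization or covariate correction.

```python
def rpkm(counts, lengths_bp):
    """Reads Per Kilobase of transcript per Million mapped reads.

    counts     -- raw read counts per gene for ONE sample
    lengths_bp -- gene (or transcript) lengths in base pairs
    """
    total_reads = sum(counts)
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return [c * 1e9 / (length * total_reads)
            for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts Per Million: length-normalize first, then rescale to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    rate_sum = sum(rates)
    return [r * 1e6 / rate_sum for r in rates]


# Hypothetical toy sample: three genes, illustrative numbers only.
toy_counts = [100, 200, 700]
toy_lengths = [1000, 2000, 1000]

toy_rpkm = rpkm(toy_counts, toy_lengths)
toy_tpm = tpm(toy_counts, toy_lengths)
```

Both quantities are computed per sample from that sample's own counts, which is why they can be recomputed from the raw read counts, while the reverse (recovering counts from RPKM alone) needs the library size and gene lengths.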