TCGA RNAseqV2 upper quartile normalization with x1000 adjustment factor
5.7 years ago
CHANG ▴ 40

1. This post says that the TCGA RNAseqV2 rsem.genes.normalized_results values are calculated as follows: "For gene level estimates you divide all "raw_count" values by the 75th percentile of the column (after removing zeros) and multiply that by 1000."

What are the reasons for multiplying by 1000?
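To make sure I am reading the quoted description correctly, here is a minimal sketch of the step as I understand it. The function name `upper_quartile_normalize`, the `scale` parameter, and the toy counts are my own assumptions for illustration; this is not the actual TCGA pipeline code, and `np.percentile` uses linear interpolation by default, which may differ from what the pipeline does.

```python
import numpy as np

def upper_quartile_normalize(raw_counts, scale=1000.0):
    """Divide counts by the 75th percentile of the nonzero counts, then scale.

    Hypothetical helper following the quoted description; not TCGA's code.
    """
    counts = np.asarray(raw_counts, dtype=float)
    nonzero = counts[counts > 0]          # "after removing zeros"
    if nonzero.size == 0:
        raise ValueError("sample has no nonzero counts")
    q75 = np.percentile(nonzero, 75)      # upper quartile of the column
    return counts / q75 * scale

# Toy raw_count column for one sample (made-up numbers).
raw = [0, 10, 20, 30, 40, 0]
normalized = upper_quartile_normalize(raw)
print(normalized)
```

By construction, the 75th percentile of the nonzero normalized values is exactly 1000 in every sample, so the factor seems to just set the scale on which the upper quartile sits. Whether that is the intended reason is what I am asking.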

2. To avoid problems with zero counts during log2 transformation, people typically add 1 to the read counts. Is this done before the upper quartile normalization step? If we add 1 after normalization, I think it wouldn't make sense, because some normalized read counts can be really small (e.g. 0.0001), so log2(0.0001) versus log2(1.0001) would be a huge difference.

Or do people typically add 1 only to the (normalized) counts that are 0 before log2 transformation?
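To illustrate the two options I am comparing, here is a small sketch on made-up normalized values. The array contents and variable names are assumptions for illustration only, not values from a real TCGA file.

```python
import numpy as np

# Made-up normalized values, sorted in increasing order.
normalized = np.array([0.0, 0.0001, 1.0, 1000.0])

# Option A: add 1 to every value, then take log2.
log_all = np.log2(normalized + 1.0)

# Option B: replace only the exact zeros with 1, then take log2.
zeros_replaced = np.where(normalized == 0, 1.0, normalized)
log_zeros_only = np.log2(zeros_replaced)

print("add 1 to all:   ", log_all)
print("add 1 to zeros: ", log_zeros_only)
```

With option A the order of the values is preserved and everything stays non-negative. With option B a zero count maps to 0 while 0.0001 maps to about -13.29, so a gene with no reads ends up above a gene with a small positive value, which is part of why I am unsure which convention people actually use.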



RNA-Seq • 2.8k views

