TCGA RNAseqV2 upper quartile normalization with x1000 adjustment factor
8.5 years ago
CHANG ▴ 40
  1. According to this post, the TCGA RNAseqV2 rsem.genes.normalized_results values are calculated as follows: "For gene level estimates you divide all "raw_count" values by the 75th percentile of the column (after removing zeros) and multiply that by 1000." What is the reason for multiplying by 1000?
  2. To avoid problems with zero counts during log2 transformation, people typically add 1 to the read counts. Is this done before the upper quartile normalization step? I am thinking that adding 1 after normalization wouldn't make sense, as some normalized read counts can be really small (e.g. 0.0001), so log2(0.0001) versus log2(1.0001) would be a huge difference.

Or do people typically add 1 only to the (normalized) counts that are 0 before log2 transformation?
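
To make the question concrete, here is a minimal sketch of the calculation as the quoted description reads, followed by a log2(x + 1) step applied after normalization. It is only my reading of that description, not the actual TCGA pipeline code. The function names, the linear-interpolation percentile, the toy counts, and the pseudocount of 1 are all assumptions made for illustration.

```python
import math


def percentile(values, q):
    """Percentile q (0-100) of a non-empty list, using linear interpolation.

    Assumption: the real pipeline may use a different percentile definition.
    """
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def upper_quartile_normalize(raw_counts, scale=1000.0):
    """Divide each count by the 75th percentile of the non-zero counts,
    then multiply by `scale` (1000 in the quoted description)."""
    nonzero = [c for c in raw_counts if c > 0]
    if not nonzero:
        raise ValueError("all counts are zero; upper quartile is undefined")
    uq = percentile(nonzero, 75)
    return [c / uq * scale for c in raw_counts]


def log2_with_pseudocount(values, pseudocount=1.0):
    """log2(x + pseudocount); the pseudocount of 1 is an assumed convention."""
    return [math.log2(v + pseudocount) for v in values]


# Toy raw counts for one sample (hypothetical values).
raw = [0, 10, 20, 30, 40, 50]
normalized = upper_quartile_normalize(raw)
logged = log2_with_pseudocount(normalized)

print(normalized)
print(logged)
```

In this sketch the factor of 1000 is a constant multiplier, so ratios between genes within a sample are the same with or without it. It only changes how large the normalized values are relative to a pseudocount of 1 that is added afterwards.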

RNA-Seq • 3.7k views