How to interpret "pseudo count" in gene expression data handling context?
4.0 years ago
n,n ▴ 360

I am reading a paper with a passage describing how the gene expression data were pre-processed before the experiments were run. The passage states: "After conversion to a base-2 logarithm with a pseudo count of 0.125, batch normalization using ComBat was applied".

What exactly is a pseudo count? My initial understanding is that you add 0.125 to every value in the gene expression matrix and then take the logarithm, so that you never take the logarithm of 0 (which is undefined). This is only my intuition, though, so I would like to know whether it is correct and whether there are other reasons why pseudo counts are used.

RNA-Seq normalization • 6.1k views
4.0 years ago
dsull ★ 5.8k

Your understanding is correct.

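I personally like log2(x+1): with a pseudocount of 1, a zero maps to exactly 0 and every non-negative value maps to a non-negative number, so you don't have to deal with negative numbers. With a pseudocount of 0.125, a zero becomes log2(0.125) = -3 instead.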
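To make the difference concrete, here is a minimal sketch comparing the two pseudocounts. The helper name `log2_pseudo` and the expression values are made up for illustration; this is not the paper's actual pipeline, only the "add a constant, then take log2" step described in the question.

```python
import math

def log2_pseudo(values, pseudocount):
    """Return log2(v + pseudocount) for each non-negative value (hypothetical helper)."""
    return [math.log2(v + pseudocount) for v in values]

# Made-up expression values, including a zero that plain log2 could not handle.
counts = [0, 1, 7, 1023]

with_eighth = log2_pseudo(counts, 0.125)  # pseudocount quoted from the paper
with_one = log2_pseudo(counts, 1)         # the log2(x+1) variant

print(with_eighth[0])  # -3.0, so a zero becomes negative
print(with_one)        # [0.0, 1.0, 3.0, 10.0], all non-negative
```

The smaller pseudocount distorts low values less, but zeros and other very small values end up negative after the transform; a pseudocount of 1 keeps everything at or above 0.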
