How to compute log2ratio of IP/Input when Input has zero reads?
0
0
Entering edit mode
6.0 years ago
Zee_S ▴ 60

Dear Biostars community,

I have assigned the reads in my IP.bam and Input.bam to gene coordinates of interest provided in a bed file. I have done this using bedtools coverageBED.

Now, I have the normalized read counts per specified gene bin for IP and Input. I would like to extract those genes which are enriched in my IP by calculating a log2ratio of reads_in_IP/reads_in_Input.

But the problem is that some gene coordinates (or gene bins) have zero reads assigned in the input, which makes it not possible to calculate a log2 ratio.

if I add a pseudo count of +1, this will significantly change the value of my log2ratios. then, if I were to use a threshold log2ratio = 1.5 to filter IP targeted genes, it will also extract "fake" genes that are only being selected because of the pseudo count.

for example, if for one gene bin, IP=7 reads and Input=0 reads, if I add pseudo count of 1 to both, I get, IP=8 reads, and Input=1 read. then log2ratio=3. given a threshold for enrichment as log2ratio=1.5, this bin would be selected as targeted in IP, when actually it might not be!

How can I correct for this????

I realize i will encounter this problem even if I use another method to assign the reads (such as bedinteresect, BEDOPS or samtools)

Can someone please advice me how to solve this?

Many thanks for your help.

normalization log2ratio bedtools coveragebed Input • 1.7k views
ADD COMMENT
0
Entering edit mode

This problem has been around since the days of microarrays. Consider these threads for inspiration.
A: Statistical question (Deseq, Cuffdis) when one condition is zero?
Dealing With Null Expression Values

ADD REPLY
0
Entering edit mode

From a purely mathematical point of view, you could approximate the log with the inverse hyperbolic sine (asinh) function e.g.

pseudoLog2 <- function(x) { asinh(x)/log(2) }
ADD REPLY

Login before adding your answer.

Traffic: 1467 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6