Question

ENCODE peak files p-value


8.8 years ago

atsalaki ▴ 20

I downloaded peak files from the ENCODE ChIP-seq matrix for the cell line MCF7 and the TF CTCF. They have a q-value column, which is the FDR, I assumed ..... The p-value column is missing (all values are -1 because they are not provided). Is it correct to calculate the missing p-values with the approach below, or does it have no meaning for identifying the binding sites of this TF?

To calculate the p-value, we first need the estimate of pi0 that was used to create the q-values, which is the maximum q-value or something very, very close to it: pi0 = max(qvalues). Then we compute m0, the estimated number of true nulls: m0 = length(qvalues) * pi0. Finally, we multiply each q-value by the proportion of true nulls expected to be under it (the inverse of how the q-value is obtained from the p-value): qvalues * rank(qvalues) / m0.

So the p-value for all the peak files downloaded from the ENCODE ChIP-seq experiment matrix was calculated using the formula below:

p-value = q-value * rank(q-value) / (max(q-value) * length(q-value))

and was then normalized to the range (0-1) by applying the log() mathematical function, because ENCODE initially reports it on a -log10 scale:

p-value = 1/(10^log(p-value))

I am trying to understand the above approach but I am having difficulties. Can someone give me a clue? Thanks in advance.
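To make the steps concrete, here is a minimal Python sketch of the calculation as I understand it. Several things in it are my assumptions and not from ENCODE: the function names are hypothetical, the input values are made up, the q-values are converted from the -log10 scale to the 0-1 scale before the formula is applied, and tied q-values get the rank "number of q-values less than or equal to this one". Since q-values are monotonized when they are computed, I would expect this to give at best approximate p-values, not the original ones.

```python
from bisect import bisect_right


def neg_log10_to_prob(values):
    """Convert -log10-scaled scores to the 0-1 scale: x -> 10^(-x)."""
    return [10.0 ** (-v) for v in values]


def approx_pvalues_from_qvalues(qvalues):
    """Apply the formula from the post: p = q * rank(q) / (max(q) * n).

    Assumes qvalues are already on the 0-1 scale and not all zero.
    """
    n = len(qvalues)
    pi0 = max(qvalues)   # estimate of the proportion of true nulls
    m0 = n * pi0         # estimated number of true nulls
    if m0 == 0:
        raise ValueError("all q-values are zero; pi0 cannot be estimated")

    sorted_q = sorted(qvalues)
    pvalues = []
    for q in qvalues:
        # Assumption: rank = number of q-values <= q (ties share the top rank).
        rank = bisect_right(sorted_q, q)
        pvalues.append(q * rank / m0)
    return pvalues


# Made-up -log10(q) scores standing in for a q-value column.
neg_log10_q = [5.0, 3.0, 2.0, 1.0]
q = neg_log10_to_prob(neg_log10_q)
p = approx_pvalues_from_qvalues(q)

for score, q_i, p_i in zip(neg_log10_q, q, p):
    print(f"-log10(q)={score:.1f}  q={q_i:.2e}  approx p={p_i:.2e}")
```

In this sketch the peak with the largest q-value always ends up with p = 1, because its q-value equals pi0 and its rank equals the number of peaks.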

ChIP-Seq p-value q-value • 2.1k views
