I know how MACS2 uses the Poisson distribution to compute the p-value for the enrichment of a signal peak (observed) compared to the local control signal (expected). My question is: can we compare the p-values of different experiments with different sequencing depths?

An experiment with higher sequencing depth will have not just a higher peak signal but also higher noise, so the p-value will differ, as I demonstrate below with two cases.

Case 1) Observed = 40, expected = 30, p-value = 0.03230957

But let's increase the sequencing depth by 10x in case 2. The observed and expected signals will then also increase roughly 10x, and the p-value becomes much more significant.

Case 2) Observed = 400, expected = 300, p-value = 0.00000001639443
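The two cases can be reproduced with a short sketch. It assumes the p-value is the Poisson upper tail P(X > observed) with the expected count as the rate, which appears to match the numbers quoted above. The helper name `poisson_sf` is made up for illustration and is not MACS2 code.

```python
import math

def poisson_sf(k, lam):
    """P(X > k) for X ~ Poisson(lam), summed directly over the upper tail."""
    total = 0.0
    j = k + 1
    while True:
        # Poisson pmf computed in log space to avoid overflow in lam**j / j!
        term = math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))
        total += term
        # Past the mode the terms shrink monotonically; stop once negligible.
        if j > lam and term <= total * 1e-17:
            break
        j += 1
    return total

# (observed, expected) pairs taken from the two cases in the question
cases = {"case 1": (40, 30), "case 2": (400, 300)}

for name, (observed, expected) in cases.items():
    fold = observed / expected           # identical in both cases
    p = poisson_sf(observed, expected)   # changes with depth
    print(f"{name}: fold = {fold:.2f}, p = {p:.3e}")
```

Both cases have the same observed/expected ratio (about 1.33), yet the tail probability shrinks by several orders of magnitude, which is the depth dependence described above.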

In my research, I need to compare different ChIP-seq experiments with different sequencing depths (ranging from 100 million to 10 billion). How do you think I can compare the signal enrichment between two ChIP-seq experiments with different sequencing depths?

The p-values depend on both the depth and the local signal-to-noise ratio, so I generally would not compare them. If you really want to make statements about differences between peaks, use a dedicated framework such as limma for a differential analysis, for example comparing treatment groups.

Can you elaborate on what exactly you need? Please give details.

I am interested in a meta-analysis of two proteins: protein A and protein B. I want to find their binding patterns at sites across the genome. I have collected their ChIP-seq data from all types of cell lines and merged them, so it's a meta-analysis. The total sequencing depth for protein A and protein B differs significantly: protein A has a depth of 600 million, whereas protein B has a depth of 2000 million.

I have noticed that as I increase the sequencing depth to around 500 million, even a small, slightly elevated noise peak gets a significant enrichment p-value, for the same reason I described in my question.

Unlike what you suggested, I am not interested in any kind of treatment groups or context. I am interested in a general, collective overview of all contexts, so I merge the ChIP-seq data from all contexts.

Merging data is not a meta-analysis. A meta-analysis means that you, for example, take the statistics from many groups and then apply a method that checks which genomic sites consistently rank highly in the individual analyses, RobustRankAggregation for example. Alternatively, it combines p-values (not discussing whether this makes sense here) into a consensus p-value using methods such as Stouffer's or Fisher's.
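To make the p-value combination idea concrete, here is a minimal sketch of Fisher's and Stouffer's methods using only the Python standard library. The input p-values are hypothetical placeholders for one genomic site tested in three separate experiments, and both methods assume the individual tests are independent. Whether that holds for ChIP-seq data from related cell lines is a separate question.

```python
import math
from statistics import NormalDist

def fisher_combine(pvalues):
    """Fisher's method: -2 * sum(ln p) is chi-square with 2k df under the null."""
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # test statistic divided by 2
    # Closed-form chi-square survival function, valid for even df = 2k:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total

def stouffer_combine(pvalues):
    """Stouffer's method (unweighted): sum of z-scores divided by sqrt(k)."""
    nd = NormalDist()
    # One-sided p-value -> z-score; a small p gives a large positive z.
    z = sum(-nd.inv_cdf(p) for p in pvalues) / math.sqrt(len(pvalues))
    return nd.cdf(-z)

# Hypothetical per-experiment p-values for a single site (not real data)
site_pvalues = [0.01, 0.04, 0.20]

print("Fisher:  ", fisher_combine(site_pvalues))
print("Stouffer:", stouffer_combine(site_pvalues))
```

Each experiment is analysed on its own first, so the combined value reflects agreement across experiments rather than the total read count of a merged dataset.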