
Hi,

I've got bisulfite-sequencing data for two differentiation stages. The raw data was mapped using Bismark. For each CpG site, the methylation ratio was marked as "A/B" (A methylated reads vs B unmethylated reads for this site).

Now I want to compare the overall methylation of a certain region between the two stages. Suppose the region contains 3 CpG sites, and the methylated/unmethylated read counts at each site are 45/60, 4/10, 5/15 in stage 1 and 14/65, 3/9, 7/20 in stage 2.

I think there are two ways to calculate the overall methylation rate of the region:

**Method 1: total number of methylated reads / total number of reads within the region**

methylation rate of stage 1 = (45+4+5)/(45+60+4+10+5+15) = **54/139** = 0.39

methylation rate of stage 2 = (14+3+7)/(14+65+3+9+7+20) = **24/118** = 0.20

**Method 2: average the methylation rate of each CpG site within the region**

methylation rate of stage 1 = (45/(45+60) + 4/(4+10) + 5/(5+15))/3 = 0.32

methylation rate of stage 2 = (14/(14+65) + 3/(3+9) + 7/(7+20))/3 = 0.23
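To make the two calculations concrete, here is a minimal Python sketch of both methods using the example counts above. The function and variable names (`pooled_rate`, `mean_site_rate`, `stage1`, `stage2`) are my own and not part of Bismark or any other tool.

```python
# (methylated, unmethylated) read counts per CpG site, from the example above
stage1 = [(45, 60), (4, 10), (5, 15)]
stage2 = [(14, 65), (3, 9), (7, 20)]


def pooled_rate(sites):
    """Method 1: total methylated reads / total reads in the region."""
    methylated = sum(m for m, _ in sites)
    total = sum(m + u for m, u in sites)
    return methylated / total


def mean_site_rate(sites):
    """Method 2: unweighted mean of the per-site methylation rates."""
    return sum(m / (m + u) for m, u in sites) / len(sites)


for name, sites in (("stage 1", stage1), ("stage 2", stage2)):
    print(f"{name}: method 1 = {pooled_rate(sites):.2f}, "
          f"method 2 = {mean_site_rate(sites):.2f}")
```

Method 1 weights each site by its coverage, while method 2 gives every site equal weight regardless of how many reads cover it.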

For **method 1**, I can then calculate the confidence interval and the P value to test whether **54 out of 139** is significantly different from **24 out of 118**.
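What I have in mind for method 1 is roughly the following sketch: a two-sided two-proportion z-test plus a Wilson score interval, written with the standard library only. The helper names are hypothetical, and the test assumes every read is an independent observation, which may not hold for reads from the same site or fragment. Fisher's exact test on the 2x2 table would be an alternative.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for equal proportions, using the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value


def wilson_interval(x, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = x / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Pooled counts from method 1: methylated reads, total reads
z, p_value = two_proportion_z_test(54, 139, 24, 118)
print(f"z = {z:.2f}, P = {p_value:.4f}")
print("stage 1 CI:", wilson_interval(54, 139))
print("stage 2 CI:", wilson_interval(24, 118))
```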

However, the methylation rate calculated by **method 1** is biased toward the sites with higher read coverage.

**Method 2** seems to be a more robust indicator of the actual methylation rate of the region.

But I don't know which statistical test I should use with method 2 to assess significance.

Please help.

Thanks in advance.