I am analyzing human WGBS data using Bismark and the methylKit package. I found a discrepancy in % methylation between the two and am trying to understand the results.
The Bismark report shows ~75% methylation in the CpG context, as shown in the graph below.
But when I look at % CpG methylation per base using methylKit's getMethylationStats(), I see very low methylation levels, as shown in the graph below.
Can someone help me understand why I see this discrepancy and how to interpret the results? If there are high levels of methylated Cs, shouldn't we expect high methylation at individual CpG bases as well? Am I interpreting the graph the right way? Or is this an issue with methylKit, given that the total numbers of cytosines counted are not even close? The total number of Cs analyzed by Bismark is about 1e+10, whereas the largest count in the methylKit plot is on the 6e+08 scale.
Thanks, Pooja
I think the plots show two different things. Bismark's report looks at all the methylation calls, irrespective of their actual location in the genome. I think (and you should confirm this by reading methylKit's manual) that methylKit summarizes the calls per CpG locus. In other words, if you sequence the same locus multiple times, Bismark counts each of those calls separately, while methylKit collapses them into a single value for that locus. That would also explain why the totals differ: one is a count of calls, the other a count of loci.
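To make the difference concrete, here is a small toy sketch in Python. It is not how Bismark or methylKit actually compute anything, and the coverage and methylation counts are invented for illustration. It only shows that a percentage pooled over all calls and a set of per-locus percentages can look very different when coverage is uneven.

```python
# Toy data: (coverage, methylated_calls) for five hypothetical CpG loci.
# The numbers are invented purely for illustration.
loci = [
    (100, 90),  # deeply covered, highly methylated
    (100, 80),  # deeply covered, highly methylated
    (10, 1),    # shallow coverage, lowly methylated
    (10, 0),
    (10, 2),
]

# "Pooled" view: every call counts once, regardless of which locus it came from.
total_calls = sum(cov for cov, _ in loci)
total_meth = sum(meth for _, meth in loci)
pooled_pct = 100.0 * total_meth / total_calls

# "Per-locus" view: one percentage per CpG position, each locus counts once.
per_locus_pct = [100.0 * meth / cov for cov, meth in loci]
mean_per_locus_pct = sum(per_locus_pct) / len(per_locus_pct)
n_low_loci = sum(1 for pct in per_locus_pct if pct < 50)

print(f"calls counted:       {total_calls}")
print(f"loci counted:        {len(loci)}")
print(f"pooled % methylation: {pooled_pct:.1f}")
print(f"per-locus %:          {per_locus_pct}")
print(f"mean per-locus %:     {mean_per_locus_pct:.1f}")
print(f"loci below 50%:       {n_low_loci} of {len(loci)}")
```

In this toy case the pooled figure is dominated by the two deeply covered loci, while a histogram of per-locus values would show most loci at the low end. The number of calls is also much larger than the number of loci, which is the same kind of gap as between the two totals in the question.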