Question: Any reason for higher reported levels of non-CG methylation when analysing non-directional WGBS libraries with Bismark?
9 months ago, benbio wrote:

Hi, just FYI: I've also posted a variation of this question on SEQanswers, and I'll update both posts with any responses.


I've been analysing some WGBS libraries for a couple of different insect species, using Bismark for alignment. For context, methylation levels in most insects are low to begin with, and non-CG methylation is often considered to be noise.

I have both directional and non-directional libraries for each species; the latter were aligned with Bismark's non-directional option (--non_directional). The non-directional libraries consistently show higher levels of non-CG methylation, as estimated by the splitting report that Bismark generates during methylation extraction. Example reports follow (same species, both control groups):

Directional:

Final Cytosine Methylation Report
=================================
Total number of C's analysed:   1441325271

Total methylated C's in CpG context:    10947321
Total methylated C's in CHG context:    1518883
Total methylated C's in CHH context:    7907753

Total C to T conversions in CpG context:    260501288
Total C to T conversions in CHG context:    203098923
Total C to T conversions in CHH context:    957351103

C methylated in CpG context:    4.0%
C methylated in CHG context:    0.7%
C methylated in CHH context:    0.8%

Non-directional:

Final Cytosine Methylation Report
=================================
Total number of C's analysed:   1671579979

Total methylated C's in CpG context:    15071917
Total methylated C's in CHG context:    3393242
Total methylated C's in CHH context:    17650398

Total C to T conversions in CpG context:    360463445
Total C to T conversions in CHG context:    264082345
Total C to T conversions in CHH context:    1010918632

C methylated in CpG context:    4.0%
C methylated in CHG context:    1.3%
C methylated in CHH context:    1.7%
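
For reference, the rounded percentages in both reports are consistent with methylated / (methylated + converted) calls per context. The following is a minimal Python sketch that recomputes them to more decimal places, so the size of the non-CG difference is easier to see. The counts are copied from the reports above; the function and variable names are illustrative only and are not part of Bismark.

```python
# Counts copied from the two splitting reports above.
# All names here are illustrative, not part of Bismark.
REPORTS = {
    "directional": {
        "total": 1441325271,
        "meth": {"CpG": 10947321, "CHG": 1518883, "CHH": 7907753},
        "unmeth": {"CpG": 260501288, "CHG": 203098923, "CHH": 957351103},
    },
    "non_directional": {
        "total": 1671579979,
        "meth": {"CpG": 15071917, "CHG": 3393242, "CHH": 17650398},
        "unmeth": {"CpG": 360463445, "CHG": 264082345, "CHH": 1010918632},
    },
}


def percent_methylated(meth, unmeth):
    """Percentage of calls in one context that were read as methylated."""
    return 100.0 * meth / (meth + unmeth)


def summarise(report):
    """Return {context: percent methylated} for one report."""
    return {
        ctx: percent_methylated(report["meth"][ctx], report["unmeth"][ctx])
        for ctx in report["meth"]
    }


for name, report in REPORTS.items():
    # Sanity check: the per-context calls should add up to the reported total.
    calls = sum(report["meth"].values()) + sum(report["unmeth"].values())
    assert calls == report["total"]
    for ctx, pct in summarise(report).items():
        print(f"{name:16s} {ctx}: {pct:.3f}%")
```

With more decimal places, the CpG levels differ by only about 0.02 percentage points, while CHG and CHH rise from roughly 0.74% to 1.27% and from roughly 0.82% to 1.72%.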

This pattern holds in all samples and in both species I'm studying. Before I conclude that it is a quirk of these specific datasets, is anyone aware of a reason, related to the nature of non-directional libraries, why non-CG methylation might be reported at a higher level?


It could be:

  1. A difference in bisulfite conversion efficiency
  2. Overall differences in error rate between these sequencing runs. If you can obtain the PhiX control reads from those runs, you could check the error rates precisely
  3. Something related to the alignment search space being twice as large for non-directional libraries. Perhaps try to replicate the result with another aligner such as BISCUIT
written 9 months ago by mark.ziemann

Thanks for your suggestions, Mark. Point 3 is the kind of thing I suspected might be the reason, since the CpG context is unaffected: it is reported at similar levels in both directional and non-directional libraries, whereas I assume issues such as 1 and 2 would affect all contexts. I'll try your suggestion of using alternative aligners. Thanks again.
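
To make that reasoning concrete, here is a rough Python sketch of the per-context shift between the two example libraries, in percentage points. It uses the counts from the reports in the question; the names are illustrative only. It assumes that a uniform source of false methylation calls (such as incomplete conversion) would shift every context by a broadly similar absolute amount, which is a simplification and not something the reports themselves establish.

```python
# (methylated, converted) calls per context, copied from the reports in the question.
DIRECTIONAL = {
    "CpG": (10947321, 260501288),
    "CHG": (1518883, 203098923),
    "CHH": (7907753, 957351103),
}
NON_DIRECTIONAL = {
    "CpG": (15071917, 360463445),
    "CHG": (3393242, 264082345),
    "CHH": (17650398, 1010918632),
}


def percent_methylated(meth, unmeth):
    """Percentage of calls in one context that were read as methylated."""
    return 100.0 * meth / (meth + unmeth)


def shift_in_points(a, b):
    """Per-context difference (b minus a) in percentage points."""
    return {
        ctx: percent_methylated(*b[ctx]) - percent_methylated(*a[ctx])
        for ctx in a
    }


shifts = shift_in_points(DIRECTIONAL, NON_DIRECTIONAL)
for ctx, delta in shifts.items():
    print(f"{ctx}: {delta:+.2f} percentage points")
```

For these two libraries the CHG and CHH levels rise by roughly 0.5 and 0.9 percentage points, while CpG barely moves. This is only a comparison of two samples, so it illustrates the pattern without identifying its cause.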

written 9 months ago by benbio