I have RRBS FASTQ files and used Bismark for methylation calling. The resulting M-bias plot is shown below. The methylation rate at the first three bases of the 5' end is quite high. The methylation counts and rates for the first four positions are listed below.
My questions are:
- Is the observed high methylation rate due to end-repair bias?
- The literature mentions that a high methylation rate at the 5' end is common, but how much is too much?
- The first three bases of RRBS reads are either CGG or TGG, depending on their methylation state. Is it a good idea to trim the first 3 bases? If so, doesn't removing the C (which retains the original genomic methylation state) influence downstream analysis?
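One way to check the CGG/TGG assumption on the raw reads is to tally the first three bases of each read. This is a minimal sketch, assuming plain four-line FASTQ records; `read_sequences` and `prefix_counts` are hypothetical helper names, not part of Bismark, and the three example reads are made up for illustration.

```python
from collections import Counter
from itertools import islice


def read_sequences(fastq_lines):
    """Yield the sequence line (2nd of every 4 lines) from FASTQ text lines."""
    return (line.strip() for line in islice(fastq_lines, 1, None, 4))


def prefix_counts(sequences, k=3):
    """Count the first k bases of each read, skipping reads shorter than k."""
    return Counter(seq[:k].upper() for seq in sequences if len(seq) >= k)


# Made-up reads for illustration only (not real data).
example = [
    "@r1", "CGGTTTAGGT", "+", "IIIIIIIIII",
    "@r2", "TGGATTTTGA", "+", "IIIIIIIIII",
    "@r3", "CGGAATTTGG", "+", "IIIIIIIIII",
]

counts = prefix_counts(read_sequences(example))
for prefix, n in counts.most_common():
    print(prefix, n)
```

On a real file, the same functions can be fed an open file handle instead of the `example` list.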
CpG context

| position | count methylated | count unmethylated | % methylation | coverage |
|---|---|---|---|---|
| 1 | 5000734 | 2489532 | 66.76 | 7490266 |
| 2 | 430 | 206 | 67.61 | 636 |
| 3 | 190 | 131 | 59.19 | 321 |
| 4 | 34174 | 79253 | 30.13 | 113427 |
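For reference, the percentages in the table follow from the counts as methylated / (methylated + unmethylated), and coverage is the sum of the two counts. The sketch below recomputes both from the four rows above; the names `rows` and `methylation_rate` are illustrative only.

```python
# (position, count methylated, count unmethylated) copied from the table above.
rows = [
    (1, 5000734, 2489532),
    (2, 430, 206),
    (3, 190, 131),
    (4, 34174, 79253),
]


def methylation_rate(methylated, unmethylated):
    """Return (% methylation, coverage) for one read position."""
    coverage = methylated + unmethylated
    return 100.0 * methylated / coverage, coverage


results = {pos: methylation_rate(m, u) for pos, m, u in rows}

for pos, (rate, coverage) in results.items():
    print(f"position {pos}: {rate:.2f}% methylated, coverage {coverage}")
```

Printing coverage next to the rate makes it easy to see that positions 2 and 3 are backed by only a few hundred calls, while position 1 is backed by several million.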