Whole-genome bisulfite sequencing (WGBS) and RRBS have 2 bp coordinates in their output; how can both claim single-base-pair resolution?
23 months ago

I have WGBS and RRBS data in which each coordinate pair in the file appears to span 2 base pairs.

In RRBS, each entry always has a C at the second position, which means the first position should be discarded and the second kept.

But in WGBS the 2 bp sequence can be anything. How do I know which exact base/position the methylation level refers to, given that bisulfite sequencing is supposed to give single-base-pair resolution?

Here is a short sample:

WGBS file (Mus musculus C57BL/6 forebrain embryo (10.5 days), methylation state at CpG, mapping assembly mm10):

chr17   3000203 3000204 41  17 (TC)

chr17   3000204 3000205 67  3 (CG)

chr17   3000349 3000350 68  28 (GC)

chr17   3000350 3000351 74  27 (CG)

chr17   3000416 3000417 64  25 (GC)

chr17   3000417 3000418 62  39 (CG)

chr17   3000465 3000466 66  29 (TC)

chr17   3000466 3000467 61  38 (CG)

chr17   3000558 3000559 78  40 (CC)

chr17   3000559 3000560 90  40 (CG)

chr17   3002121 3002122 82  11 (GC)

chr17   3002122 3002123 80  25 (CG)

chr17   3002158 3002159 73  11 (CC)

chr17   3002159 3002160 84  25 (CG)

chr17   3002457 3002458 39  23 (AC)

chr17   3002458 3002459 43  23 (CG)

chr17   3002719 3002720 63  27 (TC)


RRBS file (RRBS on human T-47D, mapping assembly hg19):

chr1    10785   10786   99  76 (CC)

chr1    10788   10789   100 76 (GC)

chr1    10794   10795   100 76 (GC)

chr1    10810   10811   97  76 (GC)

chr1    10812   10813   97  76 (GC)

chr1    10815   10816   99  76 (TC)

chr1    137976  137977  91  94 (CC)

chr1    137985  137986  89  94 (CC)

chr1    713375  713376  17  12 (CC)

chr1    731153  731154  100 2 (CC)

You made my day! I had this in mind, but to me 0-based or 1-based meant something like string indexing, where for 1-based I would just shift forward by one position and that's it. But the link you sent tells another story: two coordinates are actually used to represent one base pair (strange)!

But thanks a lot for your help.

So the technical issue is solved (I know which coordinate to keep, which to discard, and exactly which coordinate the methylation level falls on). But I am a bit confused: shouldn't bisulfite sequencing report only Cs, since methylation is only on C? Here the base seems to be any of A, T, G or C, especially in the WGBS data. Do you have any idea?

23 months ago
ATpoint 50k

BED files are 0-based with half-open intervals, so each row in your output describes a single base, not two. See "Cheat Sheet For One-Based Vs Zero-Based Coordinate Systems".
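
To make the coordinate convention concrete, here is a minimal Python sketch of how a row from the sample can be read. The function names `bed_to_one_based` and `called_base` are made up for illustration. It also assumes that the 2 bp strings in parentheses were fetched by treating both columns as 1-based inclusive positions, which the post does not state explicitly.

```python
def bed_to_one_based(line):
    """Parse one BED-like row and return (chrom, 1-based position).

    BED intervals are 0-based and half-open: [start, end).
    A row with end == start + 1 therefore covers exactly one base,
    and the 1-based coordinate of that base equals the BED end column.
    """
    fields = line.split()
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    if end - start != 1:
        raise ValueError(f"expected a single-base interval, got {end - start} bp")
    return chrom, end


def called_base(two_bp):
    """Return the base a row refers to, given the 2 bp string from the post.

    Assumption (not confirmed by the post): the 2 bp string was fetched by
    reading start and end as 1-based inclusive positions, so its second
    letter is the base at the 1-based position `end`.
    """
    return two_bp.strip("()")[1]


# Two rows copied from the WGBS sample in the question.
rows = [
    "chr17 3000203 3000204 41 17 (TC)",
    "chr17 3000204 3000205 67 3 (CG)",
]
for row in rows:
    chrom, pos = bed_to_one_based(row)
    print(chrom, pos, called_base(row.split()[-1]))
```

Under that assumption the second letter is the relevant base in both files. In the WGBS sample it is always C or G, and a G there would be consistent with a cytosine on the reverse strand (the partner of the G in a forward-strand CpG), though the file itself does not say which strand each row came from.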