I have a BED file containing methylation levels at certain coordinates, generated from BS-Seq data of human spleen cells. Here is a small part of its content:
chr1 10468 id-20250951 0.773585
chr1 10469 id-20250952 0.773585
chr1 10470 id-20250953 0.750000
chr1 10471 id-20250954 0.750000
chr1 10483 id-20250955 0.918033
chr1 10484 id-20250956 0.918033
chr1 10488 id-20250957 0.830769
chr1 10489 id-20250958 0.830769
chr1 10492 id-20250959 0.805556
chr1 10493 id-20250960 0.805556
chr1 10496 id-20250961 0.896104
chr1 10497 id-20250962 0.896104
I need to calculate the methylation levels within 20-nucleotide bins along a certain part of the genome. Let's consider the first six entries (coordinates 10468 to 10484) for our first 20 nt bin: how is the methylation level of the bin defined? Do I have to sum up the first six methylation values and divide by 20 or by 6?
I also heard from a friend that it might be defined as the sum of the first six methylation values divided by the total number of cytosines within the 20 nt bin.
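To make the three candidates concrete, here is a small Python sketch of the arithmetic for the first bin, using the six values listed above. The variable and function names are my own, and the cytosine count passed in at the end is only a made-up placeholder: the real number would have to be counted from the reference sequence, which I have not done here.

```python
# Methylation values of the first six entries (positions 10468 to 10484).
values = [0.773585, 0.773585, 0.750000, 0.750000, 0.918033, 0.918033]
bin_size = 20  # width of the bin in nucleotides

total = sum(values)

# Candidate 1: divide by the number of positions that have a value.
mean_over_covered = total / len(values)

# Candidate 2: divide by the bin width.
mean_over_bin = total / bin_size

# Candidate 3: divide by the number of cytosines in the bin.
def mean_over_cytosines(total, n_cytosines):
    """n_cytosines must be counted from the reference sequence."""
    return total / n_cytosines

print(f"sum of values:        {total:.6f}")
print(f"divided by 6:         {mean_over_covered:.6f}")
print(f"divided by 20:        {mean_over_bin:.6f}")
# 8 is a hypothetical cytosine count, used only to show the calculation.
print(f"divided by 8 (guess): {mean_over_cytosines(total, 8):.6f}")
```

Dividing by 6 gives about 0.814 and dividing by 20 gives about 0.244, so the choice of denominator changes the result considerably.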
Can someone please shed light on this?