Mapping WGBS probes to CGs (TCGA bed files)
8.4 years ago
sands3 • 0

I am currently working with WGBS methylation data from TCGA. These BED files appear to have been generated with the tool BisSNP and contain the chromosome name, start coordinate, end coordinate, methylation value in percent, coverage, strand, etc. According to the documentation of the RnBeads package, the coordinates are 0-based and span the first and last coordinate of a site (i.e. end - start = 1 for a CpG). Sites on the negative strand are shifted by +1.

I tried to map the two coordinates of each site in the BED file to the genome, and they do not seem to match CGs. If I shift all coordinates by +1, however, most of them (but not all) do match CGs. The problem is that my WGBS files contain no information in the strand column, so I cannot shift coordinates depending on whether a site is on the positive or negative strand.

Has anyone faced the same problem?
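
For illustration, the check described above can be sketched as follows. This is only a minimal sketch on a toy sequence: `toy_seq`, `toy_starts` and both function names are hypothetical, and a real run would load the chromosome sequence from the reference FASTA and the start column from the BED file.

```python
def dinucleotide_at(seq, start, offset=0):
    """Return the two bases at 0-based positions start+offset and start+offset+1."""
    i = start + offset
    if i < 0 or i + 2 > len(seq):
        return ""  # shifted position falls outside the sequence
    return seq[i:i + 2].upper()


def count_cg_matches(seq, starts, offsets=(-1, 0, 1)):
    """For each candidate shift, count how many start positions land on a CG."""
    return {
        offset: sum(dinucleotide_at(seq, s, offset) == "CG" for s in starts)
        for offset in offsets
    }


# Toy reference (hypothetical): CG dinucleotides begin at indices 2, 6 and 10.
toy_seq = "AACGTTCGAACG"
# Toy start coordinates (hypothetical), each one base upstream of a CG.
toy_starts = [1, 5, 9]

result = count_cg_matches(toy_seq, toy_starts)
print(result)  # {-1: 0, 0: 0, 1: 3}
```

The shift with the highest count is the one most consistent with the reference, which in this toy case is +1.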

TCGA WGBS methylation • 2.7k views
8.4 years ago
Tej Sowpati ▴ 250

It depends on how you retrieve the corresponding sequence, i.e. whether the tool you use to query the genome expects 0-based coordinates or not. In your case, however, it looks like shifting by +1 is the correct approach. Do all of the sites match a cytosine when you shift them by +1? I ask because bisulfite sequencing can also detect methylation in non-CpG contexts.
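
To illustrate the point about the retrieval method, here is a small sketch of how the same pair of numbers picks out different bases under two common conventions. The function names and the toy sequence are hypothetical and not taken from any particular library.

```python
def fetch_zero_based(seq, start, end):
    """0-based, end-exclusive interval (the convention of Python slices and BED)."""
    return seq[start:end]


def fetch_one_based(seq, start, end):
    """1-based, end-inclusive interval (the convention of e.g. GFF and VCF)."""
    return seq[start - 1:end]


toy_seq = "AACGTT"  # hypothetical reference; the CG sits at 0-based indices 2-3

as_zero_based = fetch_zero_based(toy_seq, 2, 4)          # "CG"
as_one_based = fetch_one_based(toy_seq, 2, 4)            # "ACG"
as_one_based_shifted = fetch_one_based(toy_seq, 3, 4)    # "CG"

print(as_zero_based, as_one_based, as_one_based_shifted)
```

If the retrieval tool is 1-based while the file is 0-based, the start has to be moved by +1 to land on the same base, which would be consistent with what you observe.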

8.4 years ago
sands3 • 0

Yes, it seems to me as well that shifting by +1 is the right approach. In only a few isolated cases the shifted position does not match a C but another nucleotide (in these cases there are no Cs nearby, so shifting by +1 or not makes no difference).
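
Since the strand column is empty, one possible way to separate these cases is to look at the reference base at each shifted position. The sketch below is an assumption on my part, not something the files or BisSNP document: it treats a C as a plus-strand cytosine and a G as a minus-strand cytosine, and it uses a hypothetical toy sequence and function name.

```python
def classify_site(seq, pos):
    """Guess strand and context from the reference base at 0-based index pos.

    Assumption: C means a plus-strand cytosine, G means a minus-strand
    cytosine (the complement of C). Anything else is reported as unmatched.
    """
    if not 0 <= pos < len(seq):
        return (None, "out of range")
    base = seq[pos].upper()
    if base == "C":
        next_base = seq[pos + 1:pos + 2].upper()
        return ("+", "CpG" if next_base == "G" else "non-CpG")
    if base == "G":
        prev_base = seq[pos - 1:pos].upper() if pos > 0 else ""
        return ("-", "CpG" if prev_base == "C" else "non-CpG")
    return (None, "no cytosine")


toy_seq = "AACGTTCA"  # hypothetical reference: CG at indices 2-3, lone C at 6

calls = [classify_site(toy_seq, p) for p in (2, 3, 6, 4)]
print(calls)
```

Sites reported as "no cytosine" would correspond to the isolated mismatches mentioned above.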
