Question: Annotation of Methylation data in GDC Portal
Asked 2.3 years ago by noorpratap.singh (University of Maryland):

I was looking specifically at the rowData of the SummarizedExperiment object obtained after downloading the data with the TCGAbiolinks package. This data.frame has a column called 'Feature type', which contains the values S_Shore, N_Shore, CGI, N_Shelf and S_Shelf. However, I also see a lot of "." (dots) in this column. Does that imply these sites belong to Open Sea, since all the other categories are present, or are they unknown?

modified 2.3 years ago • written 2.3 years ago by noorpratap.singh

They appear to be sites that fall outside of the following classification:

The position of the CpG site in reference to the island:

 - Island
 - N_Shore or S_Shore (0-2 kb upstream or downstream from CGI)
 - N_Shelf or S_Shelf (2-4 kb upstream or downstream from CGI)

So, if the site is more than 4 kb from the island, it will be labeled with ".".

[source: https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/Methylation_LO_Pipeline/]
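
As a rough illustration of the distance rule above, here is a minimal Python sketch. It is not how the GDC pipeline or the array manifest computes the annotation; the function name `classify_cpg`, the single-island setup, the coordinates, and the treatment of the exact 2 kb and 4 kb boundaries are all assumptions made for the example. It also assumes that "N" means upstream (lower coordinate) and "S" means downstream of the island.

```python
def classify_cpg(pos, island_start, island_end):
    """Classify a CpG position relative to one CpG island (hypothetical helper).

    Assumes plain genomic coordinates in bp and a single island; boundary
    handling (<= 2000, <= 4000) is an assumption, not taken from the GDC docs.
    """
    if island_start <= pos <= island_end:
        return "Island"
    if pos < island_start:
        side, dist = "N", island_start - pos   # upstream of the island
    else:
        side, dist = "S", pos - island_end     # downstream of the island
    if dist <= 2000:
        return f"{side}_Shore"
    if dist <= 4000:
        return f"{side}_Shelf"
    return "."                                 # more than 4 kb away


# Toy island spanning positions 10000-11000 (made-up coordinates)
labels = [classify_cpg(p, 10000, 11000) for p in (10500, 9000, 14000, 20000)]
```

With these made-up coordinates, the four positions fall in the island, the north shore, the south shelf, and outside all categories, respectively.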

modified 2.3 years ago • written 2.3 years ago by Kevin Blighe

Thanks for the reply. I knew this, but I often come across "Open Sea" in papers, which is why I asked.

written 2.3 years ago by noorpratap.singh

I guess that you could call all of those 'open sea' sites. I have not seen this term used widely, but I noted it in this publication: Validation of a DNA methylation microarray for 450,000 CpG sites in the human genome.

Provided you also clearly define it in your methods, I would not necessarily see any major issue with it.
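
If you do go that route, one way to make the choice explicit is to relabel the dots in code, so the definition is visible in your analysis as well as in your methods. A minimal sketch in Python follows; the label "OpenSea", the helper name `relabel_open_sea`, and the example values are assumptions for illustration, not part of TCGAbiolinks or the GDC annotation.

```python
from collections import Counter


def relabel_open_sea(feature_types, label="OpenSea"):
    """Replace "." entries with an explicit label (hypothetical helper)."""
    return [label if ft == "." else ft for ft in feature_types]


# Made-up values standing in for the 'Feature type' column
feature_types = ["Island", "N_Shore", ".", "S_Shelf", "."]
relabelled = relabel_open_sea(feature_types)

# Tally the categories after relabelling
counts = Counter(relabelled)
```

Keeping the relabelling in one small function means the "more than 4 kb from an island" definition only has to be stated, and changed, in one place.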

modified 2.3 years ago • written 2.3 years ago by Kevin Blighe