TCGA Methylation Data and Gene Mapping
10 weeks ago
James ▴ 30

I am looking into the TCGA methylation data and want to understand how to parse it and, ideally, map the measured beta values to single HUGO symbols.

My issues are as follows:

1) Some Stable Entity IDs have multiple gene names listed. For example, the breast cancer (BRCA) data contains a row with these values:

Stable Entity ID | Name | Description | Transcript ID

cg00008493 | KIAA1409;COX8C | Body;5'UTR | NM_020818;NM_182971

2) Many Stable Entity IDs map to the same gene. For example, in the attached image, multiple Stable Entity IDs map to the same gene (DLX5).

For a research project I'd love to associate each gene with a specific methylation value. Put differently, for each patient I want to create a vector where each entry corresponds to a methylation value for a given gene. Is there a principled way to do this?

Methylation Cancer TCGA • 341 views
10 weeks ago
Basti ★ 1.5k

A CpG may be annotated to more than one gene simply because gene regions overlap on the genome.
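To make the multi-gene rows easier to work with, here is a minimal Python sketch that splits one annotation row into one record per gene entry. The function name `expand_annotation` is made up for illustration, and the sketch assumes the semicolon-separated fields are position-aligned (the first gene name goes with the first region and the first transcript), which is worth checking against your annotation file.

```python
def expand_annotation(probe_id, names, regions, transcripts):
    """Split semicolon-delimited annotation fields into one record per entry.

    Assumption: the i-th gene name, region and transcript belong together.
    """
    genes = names.split(";")
    region_list = regions.split(";")
    transcript_list = transcripts.split(";")
    return [
        {"probe": probe_id, "gene": gene, "region": region, "transcript": transcript}
        for gene, region, transcript in zip(genes, region_list, transcript_list)
    ]


# The row quoted in the question.
records = expand_annotation(
    "cg00008493", "KIAA1409;COX8C", "Body;5'UTR", "NM_020818;NM_182971"
)
for record in records:
    print(record)
```

Each record then ties the probe to exactly one gene, which is the shape you need before doing any per-gene summary.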

If you want to associate each gene with a methylation value, you could take the average methylation of all CpGs annotated to that gene. I am personally not convinced this would be useful: not all CpGs across a gene have a functional implication, and most of them are stable between individuals, so you will likely obtain nearly the same mean methylation for all individuals.
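If you still want to try the averaging approach, it could look roughly like the sketch below for a single patient. This is only an illustration under stated assumptions: `gene_level_betas` is a made-up helper, the probe IDs `probe_a` to `probe_c` and all beta values are invented, missing measurements are represented as `None`, and a probe annotated to several genes contributes its value to each of them.

```python
from collections import defaultdict
from statistics import mean


def gene_level_betas(probe_to_genes, betas):
    """Average beta values over all probes annotated to each gene.

    probe_to_genes: dict mapping probe ID -> "GENE1;GENE2" annotation string
    betas: dict mapping probe ID -> beta value, or None when missing
    """
    per_gene = defaultdict(list)
    for probe, gene_field in probe_to_genes.items():
        beta = betas.get(probe)
        if beta is None:
            continue  # skip probes without a measurement
        # set() so a gene listed twice for one probe is counted only once
        for gene in set(gene_field.split(";")):
            if gene:
                per_gene[gene].append(beta)
    return {gene: mean(values) for gene, values in per_gene.items()}


# Hypothetical annotation and beta values for one patient.
probe_to_genes = {
    "cg00008493": "KIAA1409;COX8C",
    "probe_a": "DLX5",
    "probe_b": "DLX5",
    "probe_c": "DLX5",
}
betas = {"cg00008493": 0.875, "probe_a": 0.25, "probe_b": 0.75, "probe_c": None}

gene_vector = gene_level_betas(probe_to_genes, betas)
print(gene_vector)
```

Running this once per patient gives one gene-indexed vector per patient. Restricting the probes to a particular region (for example only promoter-annotated CpGs) before averaging is one possible variation, given the concern above that not all CpGs in a gene are informative.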

