I have a question regarding TCGA's methylation data. I downloaded it via the "Data Matrix": 98 samples (49 tumor/normal tissue pairs), all Level 3 and all from Illumina's HumanMethylation450 chip. All samples are prostate cancer. Looking at the data, about 25% of the gene symbols are missing. I therefore tried to reannotate them by getting the symbols from UCSC's public MySQL server (genome-mysql.cse.ucsc.edu). However, most of the positions don't match hg18 (the build mentioned on TCGA's wiki: https://wiki.nci.nih.gov/display/TCGA/DNA+methylation), but I do get matches when using the hg19 database.
In numbers: I have XXXX positions, of which 119652 don't have a symbol; after reannotation (hg19),
Could it be that the data is annotated with hg19 instead of hg18?
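A minimal sketch of how such a build comparison could look (not the exact code used for the numbers above). It assumes the gene intervals have already been pulled from a refGene-style table (columns chrom, txStart, txEnd, name2) for each build. All coordinates, gene names and function names below are made-up placeholders for illustration only.

```python
from collections import defaultdict


def build_index(genes):
    """Group (chrom, start, end, symbol) rows by chromosome.

    Intervals are taken as 0-based, half-open (UCSC table convention).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, symbol in genes:
        by_chrom[chrom].append((start, end, symbol))
    return by_chrom


def symbols_at(index, chrom, pos):
    """Return the sorted gene symbols whose interval contains 1-based `pos`."""
    zero_based = pos - 1
    return sorted(
        {sym for start, end, sym in index.get(chrom, []) if start <= zero_based < end}
    )


def match_rate(index, probes):
    """Fraction of (chrom, pos) probes that fall inside at least one gene."""
    if not probes:
        return 0.0
    hits = sum(1 for chrom, pos in probes if symbols_at(index, chrom, pos))
    return hits / len(probes)


# Hypothetical toy data: the same two genes at different coordinates per build.
toy_hg18 = [("chr1", 1000, 2000, "GENE_A"), ("chr2", 5000, 6000, "GENE_B")]
toy_hg19 = [("chr1", 1500, 2500, "GENE_A"), ("chr2", 5800, 6800, "GENE_B")]

# Hypothetical probe positions (1-based), standing in for the Level 3 file.
probes = [("chr1", 2100), ("chr2", 6500), ("chr1", 1800)]

rate_hg18 = match_rate(build_index(toy_hg18), probes)
rate_hg19 = match_rate(build_index(toy_hg19), probes)
print(f"hg18 match rate: {rate_hg18:.2f}, hg19 match rate: {rate_hg19:.2f}")
```

If the positions in the file are really hg19 coordinates, the match rate against the hg19 intervals should be clearly higher than against hg18. The linear scan is only meant for a small test; for the full 450k probe set a sorted or binned lookup would be more practical.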
I was also wondering how to interpret "Some data have been masked (including known SNPs)". If some data points are masked, in what way are they masked? Has the position been changed and the symbol removed?
As it stands, the methylation data points with symbols seem more or less useless.
Maybe someone has an idea or has experience with TCGA's methylation data.
With all the best,
Perfect, this is what I was looking for. Thank you very much!