Methylation data duplicated for same gene
6.5 years ago
mms140130 ▴ 60

Hi,

I downloaded methylation data with the R package TCGA2STAT as follows:

library(TCGA2STAT)
methyl <- getTCGA(disease="BRCA", data.type="Methylation", type="27K")

The result has two components. methyl$dat holds the methylation values:

head(methyl$dat[,1:3])
           TCGA-01-0628-11A-01D-0383-05 TCGA-01-0630-11A-01D-0383-05 
cg00000292                   0.79940858                   0.62039417                   
cg00002426                   0.33900444                   0.18030460                   
cg00003994                   0.02811930                   0.03607298                   
cg00005847                   0.60116497                   0.64955777                   
cg00006414                           NA                           NA                           
cg00007981                   0.01881682                   0.01803597

and methyl$cpgs holds the gene annotation for each probe:

            Gene_Symbol Chromosome Genomic_Coordinate
cg00000292        ATP2A1         16           28890100
cg00002426         SLMAP          3           57743543
cg00003994         MEOX2          7           15725862
cg00005847         HOXD3          2          177029073
cg00006414 ZNF425;ZNF398          7          148822837
cg00007981         PANX1         11           93862594

The problem is that the same gene appears more than once, with different methylation values:

A2ML1   0.85332099  0.422268191 0.28015569  0.61441715  0.231997855
A2ML1   0.691462014 0.420426417 0.195839615 0.575344397 0.151897964
A4GALT  0.066524012 0.041965822 0.100817531 0.17217131  0.117686942
A4GALT  0.432681922 0.182219229 0.618835095 0.26247578  0.671077877
A4GNT   0.86171353  0.821129689 0.814334155 0.67838202  0.874795198

I want to fit a multiple linear regression, geneexp_i = alpha + beta1 * CNV_i + beta2 * METH_i + error, where i indexes the gene, so each gene should be counted once with no duplicates. In the methylation data, however, the same gene is listed several times with different methylation values.

What can I do?

gene R • 1.4k views

The rows come from different probes that target different CpG sites, so they will show different methylation levels. What are the actual probe IDs behind the A2ML1 and A4GALT rows in your bottom table?
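
One way to look up the probe IDs behind each repeated gene symbol is sketched below. It uses a small toy data frame in place of methyl$cpgs, assuming the same layout as in the post (probe IDs as row names, a Gene_Symbol column). The A4GNT probe ID is a made-up placeholder, and the names cpgs, probes_per_gene, multi_probe_genes and probe_ids are only illustrative.

```r
# Toy stand-in for methyl$cpgs (assumed layout: probe IDs as row names).
# "cg_hypothetical_1" is a placeholder ID, not a real probe.
cpgs <- data.frame(
  Gene_Symbol = c("A2ML1", "A2ML1", "A4GALT", "A4GALT", "A4GNT"),
  row.names = c("cg27653134", "cg03490200", "cg07393322", "cg09744051",
                "cg_hypothetical_1"),
  stringsAsFactors = FALSE
)

# Count probes per gene symbol and keep the symbols hit by more than one probe
probes_per_gene <- table(cpgs$Gene_Symbol)
multi_probe_genes <- names(probes_per_gene[probes_per_gene > 1])

# List the probe IDs behind each multi-probe gene
probe_ids <- split(rownames(cpgs), cpgs$Gene_Symbol)[multi_probe_genes]
print(probe_ids)
```

On the real annotation, entries such as "ZNF425;ZNF398" map one probe to several genes, so they would probably need to be split on ";" first.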


For A2ML1 I have cg27653134 and cg03490200; for A4GALT they are cg07393322 and cg09744051.
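
Since each gene has several probes, one possible way to get a single methylation value per gene is to average its probes before fitting the model. The sketch below does that on simulated data only; averaging is just one option (picking a single probe per gene, for example one in the promoter, is another), and the thread does not say which is right here. All values, the sample names and the variables cnv and expr are made up for illustration, and the model is fitted for one gene across samples, which is one reading of the regression in the question.

```r
set.seed(1)  # reproducible toy data

# Toy methylation matrix: rows are probes, columns are samples (simulated values)
probe_gene <- c(cg27653134 = "A2ML1", cg03490200 = "A2ML1",
                cg07393322 = "A4GALT", cg09744051 = "A4GALT")
samples <- paste0("S", 1:20)
meth_probe <- matrix(runif(length(probe_gene) * length(samples)),
                     nrow = length(probe_gene),
                     dimnames = list(names(probe_gene), samples))

# Collapse to one row per gene by averaging the probes of each gene.
# rowsum() adds rows by group; dividing by the probe count gives the mean.
# This form assumes no NA values; real data would need them handled first.
gene_sum <- rowsum(meth_probe, group = probe_gene)
n_probes <- as.vector(table(probe_gene)[rownames(gene_sum)])
meth_gene <- gene_sum / n_probes

# Simulated copy-number and expression values for one gene across the samples
cnv  <- rnorm(length(samples))
expr <- 1 + 0.5 * cnv - 2 * meth_gene["A2ML1", ] +
  rnorm(length(samples), sd = 0.1)

# Fit expression ~ CNV + methylation for that gene
model_data <- data.frame(expr = expr, cnv = cnv, meth = meth_gene["A2ML1", ])
fit <- lm(expr ~ cnv + meth, data = model_data)
print(coef(fit))
```

The real methyl$dat contains NA entries (cg00006414 in the output above), so the averaging step would need something like rowsum(..., na.rm = TRUE) with per-gene counts of non-missing values, or the affected probes dropped beforehand.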
