Is there an informative subset of 450k methylation probes?
9.1 years ago
rmccloskey ▴ 240

I'm working on an analysis which includes 450k methylation data. There are so many probes that analysing the whole data set is becoming a problem in terms of time and memory. I'm sure that nearby methylation sites are highly correlated, so is there some kind of informative subset of the whole probe set I could use to reduce computational costs without losing too much information? I'm aware that it's possible to do this myself using clustering or something similar, but I was hoping it had been done already.
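As a rough illustration of the do-it-yourself route mentioned above, here is a minimal sketch of one possible approach: a greedy pass over the probes of a single chromosome, sorted by position, that drops any probe sitting close to the last kept probe and strongly correlated with it. The function name, the `max_gap` and `r_threshold` defaults, and the input layout (a probes-by-samples matrix of beta values) are all assumptions made for the example, not an established method or a published probe subset.

```python
import numpy as np

def prune_correlated_neighbors(beta, positions, max_gap=1000, r_threshold=0.8):
    """Greedily drop probes that are near, and highly correlated with,
    the most recently kept probe.

    beta        : (n_probes, n_samples) array of beta values for ONE chromosome,
                  rows sorted by genomic position (assumed layout)
    positions   : (n_probes,) genomic coordinates, ascending
    max_gap     : hypothetical distance cutoff in base pairs
    r_threshold : hypothetical |Pearson r| cutoff
    Returns the row indices of the probes that are kept.
    """
    beta = np.asarray(beta, dtype=float)
    positions = np.asarray(positions)
    if beta.shape[0] == 0:
        return np.array([], dtype=int)

    kept = [0]  # always keep the first probe
    for i in range(1, beta.shape[0]):
        last = kept[-1]
        if positions[i] - positions[last] <= max_gap:
            r = np.corrcoef(beta[last], beta[i])[0, 1]
            # r is NaN when a probe has zero variance; such probes are kept
            if np.isfinite(r) and abs(r) >= r_threshold:
                continue  # redundant with the last kept probe
        kept.append(i)
    return np.array(kept)
```

Run per chromosome, this only ever compares a probe with one neighbour, so it needs no more memory than the matrix itself. How much information it discards depends entirely on the two thresholds, which would need checking against the actual data.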

methylation 450k illumina • 2.2k views
9.1 years ago

While the correlation of nearby CpG sites is an assumption made in the probe design, I think it is best to take advantage of as much information as possible.

I (and others) have done some work on defining differentially methylated regions from 450k data. I have some analysis templates for a couple of programs here:

How many samples do you need to analyze? For small cell line datasets, I think the above tools should be OK for most desktops (but I agree that large patient cohorts may need to be run on a more powerful Linux cluster).


I have 389 samples. I agree that it's best to use as much information as possible, but I'm already running on a cluster and am still having memory issues.
