Data in GEO 450k series matrices
6.0 years ago
dh ▴ 30

What methylation values are usually supplied in NCBI GEO 450k series matrices?

I am going to automatically download a large number of GSE series matrices (~80) and use them in my analysis. Before doing so, I want to double-check that I understand correctly what data these matrices contain.

I know that they contain sample information, contact information, etc. But what about the methylation data? Are the values already normalised and batch corrected, or does this vary between studies?

geo • 2.2k views
6.0 years ago

Yes, the methylation values are included in the GSE. How they were processed varies between studies: in general they will have been normalized, but the normalization method differs from GSE to GSE.
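One way to see what each study reports about its own processing is to read the `!Sample_data_processing` lines in the header of the series matrix file. The following is a minimal sketch, assuming the file has already been downloaded locally (plain or gzip-compressed); the function names are illustrative only, and the entries are free text written by the submitters, so they still have to be read and interpreted by hand.

```python
import gzip
from collections import Counter


def open_text(path):
    """Open a plain or gzip-compressed series matrix file as text."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")


def data_processing_notes(path):
    """Count the distinct '!Sample_data_processing' entries in one file."""
    notes = Counter()
    with open_text(path) as handle:
        for line in handle:
            if line.startswith("!series_matrix_table_begin"):
                break  # the header is over; the value table follows
            if line.startswith("!Sample_data_processing"):
                # One tab-separated, quoted entry per sample.
                fields = line.rstrip("\r\n").split("\t")[1:]
                cleaned = (field.strip('"') for field in fields)
                notes.update(text for text in cleaned if text)
    return notes
```

Running this over all downloaded files and printing the counters gives a quick overview of which normalization each submitter claims to have applied.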


Sean, thank you for the answer!:)

One more quick question: in this recent draft by Kasper D. Hansen, the authors analysed 450k data. For two datasets they downloaded the IDAT files and normalised them. For one dataset they did not apply normalisation, because only methylated and unmethylated intensities were available (Materials and Methods, "Processing of the methylation data").

I assume K. Hansen knows what he is doing, but will this work for every dataset? For example, if I can download only the methylated/unmethylated intensities, can I skip normalisation and carry on with the analysis?
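For context, when only methylated and unmethylated intensities are available, beta values and M-values can be computed directly from them. The sketch below uses the commonly quoted definitions; the offset of 100 and the alpha of 1 are conventional defaults taken here as assumptions, not values from the draft mentioned above. Note that this only converts intensities and performs no normalisation of any kind.

```python
import math


def beta_value(meth, unmeth, offset=100.0):
    """Beta = M / (M + U + offset); the offset stabilises low-intensity probes."""
    return meth / (meth + unmeth + offset)


def m_value(meth, unmeth, alpha=1.0):
    """M-value = log2((M + alpha) / (U + alpha))."""
    return math.log2((meth + alpha) / (unmeth + alpha))
```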


You'll need to evaluate the need for normalization on a case-by-case basis.  Typically, values deposited in GEO have been normalized, but the effectiveness of that normalization for your needs may vary.
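A first step in such a case-by-case check is to look at the scale of the deposited values. The following is a rough heuristic sketch, assuming the usual series matrix layout (a table between `!series_matrix_table_begin` and `!series_matrix_table_end`, with probe IDs in the first column); the function names and category labels are made up for illustration, and the result says nothing about whether the normalization is adequate.

```python
def table_values(lines, max_rows=10000):
    """Collect numeric cells from the first max_rows rows of the value table."""
    values, in_table, rows = [], False, 0
    for line in lines:
        if line.startswith("!series_matrix_table_begin"):
            in_table = True
            continue
        if line.startswith("!series_matrix_table_end"):
            break
        if not in_table or line.startswith(('"ID_REF"', "ID_REF")):
            continue  # still in the header, or on the column-name row
        for cell in line.rstrip("\r\n").split("\t")[1:]:
            try:
                values.append(float(cell.strip('"')))
            except ValueError:
                pass  # empty cell or a marker such as NA / null
        rows += 1
        if rows >= max_rows:
            break
    return values


def guess_value_scale(values):
    """Guess the value scale from the observed range (heuristic only)."""
    finite = [v for v in values if v == v]  # drops NaN
    if not finite:
        return "empty"
    lowest, highest = min(finite), max(finite)
    if lowest >= 0.0 and highest <= 1.0:
        return "beta-like"
    if lowest < 0.0:
        return "M-value-like"
    return "intensity-like or other"
```

Passing an open file handle as `lines` works, since the function only iterates over text lines.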

5.9 years ago
Shicheng Guo ★ 8.7k

1. Normalization methods vary from GSE to GSE.

2. The number of probes differs: some series contain raw data, while others have excluded the probes on the sex chromosomes.

3. Not every GSE has had background normalization applied.

4. Not every GSE has had batch effects removed.

5. Not every GSE has had color-bias adjustment applied.

Therefore, be careful when you use GEO datasets.
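The difference in probe counts is easy to check before combining studies. The sketch below assumes the usual series matrix layout, with probe IDs in the first column of the value table; the function names and the study labels in the usage note are hypothetical. It reports only how many probes each study has and how many are common to all of them. Deciding which missing probes lie on the sex chromosomes would additionally require an array annotation, which is not covered here.

```python
def probe_ids(lines):
    """Return the set of probe IDs found in the value table."""
    ids, in_table = set(), False
    for line in lines:
        if line.startswith("!series_matrix_table_begin"):
            in_table = True
            continue
        if line.startswith("!series_matrix_table_end"):
            break
        if in_table:
            probe = line.split("\t", 1)[0].strip().strip('"')
            if probe and probe != "ID_REF":
                ids.add(probe)
    return ids


def summarise_probe_overlap(probes_by_study):
    """Count probes per study and probes shared by every study."""
    if not probes_by_study:
        return {"shared": 0, "per_study": {}}
    shared = set.intersection(*probes_by_study.values())
    per_study = {
        name: {"total": len(ids), "not_shared": len(ids - shared)}
        for name, ids in probes_by_study.items()
    }
    return {"shared": len(shared), "per_study": per_study}
```

For example, `summarise_probe_overlap({"GSE_A": probe_ids(file_a), "GSE_B": probe_ids(file_b)})` would show at a glance whether one of the two studies has dropped probes.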
