Question: Reading methylation without IDAT files downloaded from GEO
8 months ago by
Hello All,

I am working on a GEO series (GSE) methylation dataset that I downloaded from the GEO site. The download includes three files: 1) GSE_Matrix_methylated_signal_intensities.txt.gz, 2) GSE_Matrix_unmethylated_signal_intensities.txt.gz, and 3) GSE_RAW.tar. Since no IDAT files are provided for this dataset, I am looking for a way to process the meth and unmeth files. Both files contain raw intensities for each CpG across the samples, but they do not list detection p-values for each CpG. Is there a way to process this data? In the past I have used minfi for this, but I have always had the IDAT files.

I appreciate all help and comments.

8 months ago by
Charles Warden (6.9k)
I remember separately providing .idat files for at least one project (outside of GEO). While the detection p-values may be helpful for filtering certain probes in certain samples, I think the submitters should have provided percent methylation / beta values (although in that particular project, I think the GEO format did request separate signal and detection p-value columns).

However, I can see some more recent 450k datasets where you could download .idat files in a GEO supplemental .tar file, so you could try contacting the authors about those files.

You can also update GEO submissions, so I will ask about updating the .idat files for the dataset that I am thinking of, and provide an updated comment. However, you may end up finding a solution for this particular dataset sooner by contacting the corresponding author (and/or by calculating the percent methylation from the separate methylated and unmethylated signals; you have to make sure you don't divide by zero, but one explanation for the beta value is provided here).
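As a rough illustration of the beta value calculation mentioned above, here is a minimal Python sketch. It assumes the usual definition beta = M / (M + U + offset), with an offset of 100 as the commonly used stabilizing constant. The function name `beta_values`, the clamping of negative intensities to zero, and the toy numbers are my own assumptions for illustration and are not taken from this dataset. It only shows the arithmetic, so you would still need to read the two matrices and match probes and samples yourself.

```python
def beta_values(meth, unmeth, offset=100.0):
    """Compute beta = M / (M + U + offset) for paired intensities.

    meth, unmeth: equal-length sequences of methylated / unmethylated
    signal intensities for the same probes in one sample.
    offset: stabilizing constant (100 is a common choice; assumption).
    """
    if len(meth) != len(unmeth):
        raise ValueError("meth and unmeth must have the same length")

    betas = []
    for m, u in zip(meth, unmeth):
        # Clamp negative (e.g. background-corrected) values to zero.
        m = max(float(m), 0.0)
        u = max(float(u), 0.0)
        denom = m + u + offset
        # With offset=0 and both signals at zero, avoid dividing by zero.
        betas.append(m / denom if denom > 0 else float("nan"))
    return betas


# Toy intensities for three hypothetical probes in one sample.
meth = [1500.0, 0.0, 300.0]
unmeth = [500.0, 0.0, 2700.0]
print(beta_values(meth, unmeth))
```

With a positive offset the denominator can never be zero, which is the main reason the offset is usually included. Setting `offset=0` gives the plain ratio and returns NaN for probes where both signals are zero.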

FYI, I have confirmed that you can add .idat files to an earlier project.

Thank you! I will reach out to the person who uploaded the data and ask for the IDAT files.

