Question: Processing Raw Illumina Data From GEO
Andre (Germany/Ulm) wrote, 5.6 years ago:

Hello,

These are my first steps in processing Illumina data (HumanMethylation450). I'm trying to process the raw data from http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE37066

library(GEOquery)
# Download the supplementary files for the series into ./GSE37066/
getGEOSuppFiles(GEO = "GSE37066")

After extracting the raw files, I see that, besides the chip description, there are raw AB intensities. I would like to use the lumi or methylumi package to read the actual data and perform background correction, normalization, etc., and then compare the results. However, when executing

require(lumi)
example.lumiMethy <- methylumiR("GSE37066/GSE37066_raw_AB_intensities.txt", sep="\t")

I get an error: "Error in if (dattypes$original[i] != "") { : argument is of length zero". As far as I understand, the file GSE37066_raw_AB_intensities.txt was not created by BeadStudio, right? Is there a method I can use to convert this data to a "MethyLumiSet"? Or is the provided raw data not enough? What am I missing?

Charles Warden (Duarte, CA) wrote, 5.6 years ago:

The contents of GSE37066_RAW are all annotation files. They are not directly useful for analysis.

It is possible that you could make use of the signal intensities, but I would just use the beta values and detection p-values from the primary data matrix:

ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE37nnn/GSE37066/matrix/GSE37066_series_matrix.txt.gz
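
On the first option above (making use of the signal intensities), here is a minimal sketch of how a beta value is usually derived from a pair of A/B intensities. It assumes that signal A is the unmethylated channel and signal B the methylated one, and it uses the conventional offset of 100; both should be checked against the header of the actual file. The column names, probe IDs, and numbers below are made up for illustration and are not taken from GSE37066_raw_AB_intensities.txt.

```r
# Assumption: signal A = unmethylated, signal B = methylated.
# Verify this against the header of the real file before relying on it.
beta_from_ab <- function(signal_a, signal_b, offset = 100) {
  # Clamp negative intensities to zero so beta stays within [0, 1]
  u <- pmax(signal_a, 0)
  m <- pmax(signal_b, 0)
  # beta = methylated / (methylated + unmethylated + offset)
  m / (m + u + offset)
}

# Toy intensities for three hypothetical probes in one sample
toy <- data.frame(
  ID_REF   = c("probe_1", "probe_2", "probe_3"),
  Signal_A = c(900, 100, 5000),
  Signal_B = c(0, 1800, 4900),
  stringsAsFactors = FALSE
)
toy$Beta <- beta_from_ab(toy$Signal_A, toy$Signal_B)
print(toy)
```

This only reproduces the beta calculation; it does not do background correction or normalization, and it does not build a MethyLumiSet.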

I'm not certain about lumi in particular, but you should be able to reformat this matrix to make a data table suitable for analysis. For example, that is how I analyze public 450k data using COHCAP:

http://sourceforge.net/projects/cohcap/

I'm not sure about your level of bioinformatics experience. The reformatting can definitely be done with a Perl script, and it can probably be done within R as well. If you don't have much coding experience, I know Partek has a GEO downloader application that will make a data matrix for you (even if you don't have an institutional license, you could use a demo license for this feature). Exporting a tab-delimited or comma-separated text file should allow an easy import into R:

http://www.partek.com/?q=partekgs
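
For the R route mentioned above, a rough sketch of the reformatting in base R follows. It assumes the usual GEO series-matrix layout, where every metadata line starts with "!" and the data table begins with an ID_REF column; the helper name read_series_matrix_table and the toy file contents are hypothetical, so swap in the path of the downloaded GSE37066_series_matrix.txt.gz (read.delim can normally read a .gz file directly).

```r
# Hypothetical helper: pull the data table out of a GEO series-matrix file.
read_series_matrix_table <- function(path) {
  # Metadata lines start with "!" (including the table begin/end markers),
  # so treating "!" as a comment character leaves only the data table.
  tab <- read.delim(path, comment.char = "!", check.names = FALSE,
                    stringsAsFactors = FALSE)
  # First column (ID_REF) holds the probe IDs; the rest are per-sample values
  values <- as.matrix(tab[, -1, drop = FALSE])
  rownames(values) <- tab[[1]]
  values
}

# Toy file that mimics the layout (made-up probes, samples, and values)
toy_path <- tempfile(fileext = ".txt")
writeLines(c(
  '!Series_title\t"toy example"',
  '!series_matrix_table_begin',
  '"ID_REF"\t"sample_1"\t"sample_2"',
  '"probe_1"\t0.12\t0.85',
  '"probe_2"\t0.40\t0.43',
  '!series_matrix_table_end'
), toy_path)

betas <- read_series_matrix_table(toy_path)
print(betas)

# Export a plain tab-delimited table that other tools can import
out_path <- sub("\\.txt$", "_table.txt", toy_path)
write.table(data.frame(ID_REF = rownames(betas), betas, check.names = FALSE),
            file = out_path, sep = "\t", quote = FALSE, row.names = FALSE)
```

If GEOquery is already installed, getGEO("GSE37066", GSEMatrix = TRUE) is another way to get the same matrix as an ExpressionSet without parsing the file by hand.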


Thank you! If I understand this right, the raw data is just incomplete in this case, but the working values are provided with the matrix.

(reply by Andre, 5.6 years ago)

Yeah - GEO won't let you upload the raw .idat files for reasons of space. Therefore, you have to work with some sort of processed data.

(reply by Charles Warden, 5.6 years ago)