I am planning to work with several Illumina 450K methylation datasets from the Gene Expression Omnibus (GEO) database. I want to use an R package to do normalization and other QC steps, as well as to remove batch effects. I’ve tried lumi, methylumi and minfi.
The problem is that I get errors when reading the files available in GEO with lumi/methylumi/minfi. If I understand correctly, the input file for these packages is the GenomeStudio output file (the Final Report). However, neither this file nor the original idat files are available in GEO.
The files in the GEO database are:
1- one matrix with beta values for all individuals (series matrix)
2- one file with methylated and unmethylated probe signal intensities (in some cases p-values too)
3- RAW data containing: manifest_header_descriptions, csv, bpm files
My questions are:
1- How can I convert the files in GEO into the input file that lumi/methylumi/minfi need for the QC steps? Any package preferences?
2- In case I have to build the input file myself, where can I find a template of the GenomeStudio output file (including the COLOR_CHANNEL column)?
3- How can I combine different GEO datasets to perform a joint QC assessment?
Thank you for your help!