I am working with a cohort that was profiled on both Illumina HumanMethylation 450K and EPIC arrays. I am using the ChAMP pipeline to load the data because of its extensive probe filtering step. To merge the two array types I use the minfi package. As a first step, I convert the beta-matrix from ChAMP to a GenomicRatioSet and then create a virtual array of a given type.
For casting a 450K virtual array:
ratioSet450K <- makeGenomicRatioSetFromMatrix(myLoad_450K$beta,
                                              array = "IlluminaHumanMethylation450k",
                                              annotation = "ilmn12.hg19",
                                              what = "Beta")
ratioSetEPIC <- makeGenomicRatioSetFromMatrix(myLoad_EPIC$beta,
                                              array = "IlluminaHumanMethylationEPIC",
                                              annotation = "ilm10b4.hg19",
                                              what = "Beta")
# conservative merging
ratioSetMerged450K <- combineArrays(ratioSet450K, ratioSetEPIC,
                                    outType = "IlluminaHumanMethylation450k",
                                    verbose = TRUE)
However, when I set outType = "IlluminaHumanMethylationEPIC", as described in the manual, I get the same virtual array as above. The function therefore appears to take the intersection of the cg-probes on the two arrays. But the manual explicitly states:
This function combines data from the two different array types and outputs a data object of the user-specified type. Essentially, this new object will be like (for example) an EPIC array with many probes missing.
I assumed the missing probes would be filled with NAs. Does anybody have a solution for this? For the moment I am using dplyr's left_join to work around the problem, but it is not a very elegant workaround.
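To make the expected behaviour concrete, here is a minimal sketch of the NA padding in base R. It is not my actual left_join code and does not touch minfi: the two toy matrices stand in for myLoad_450K$beta and myLoad_EPIC$beta, the probe IDs and values are made up, and pad_to_probes is a hypothetical helper name.

```r
# Toy beta-matrices: rows are probes, columns are samples (all values made up).
beta_450k <- matrix(c(0.1, 0.2, 0.3, 0.4), nrow = 2,
                    dimnames = list(c("cg01", "cg02"), c("s1", "s2")))
beta_epic <- matrix(c(0.5, 0.6, 0.7, 0.8, 0.9, 1.0), nrow = 3,
                    dimnames = list(c("cg02", "cg03", "cg04"), c("s3", "s4")))

# Hypothetical helper: expand a matrix to a given probe set,
# leaving NA wherever the probe is absent from the input.
pad_to_probes <- function(beta, probes) {
  out <- matrix(NA_real_, nrow = length(probes), ncol = ncol(beta),
                dimnames = list(probes, colnames(beta)))
  shared <- intersect(probes, rownames(beta))
  out[shared, ] <- beta[shared, , drop = FALSE]
  out
}

# Keep every EPIC probe; the 450K samples get NA for EPIC-only probes.
epic_probes <- rownames(beta_epic)
merged <- cbind(pad_to_probes(beta_450k, epic_probes),
                pad_to_probes(beta_epic, epic_probes))
```

In this toy case merged has one row per EPIC probe and one column per sample from either array. Whether a matrix padded this way behaves sensibly in downstream minfi functions is something I have not verified.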