I have used minfi for pre-processing and normalization, and I have some questions about minfi's probe filtering (detection p-value, bead count, SNPs). As far as I know, minfi removes SNP probes by default, and there is also a way to remove probes with a high detection p-value, but how can I remove probes with certain bead counts? There is no function for bead-count filtering in the minfi tutorial. My second problem is normalization. I need both between-array and within-array normalization, which are supported by preprocessSWAN() and preprocessFunnorm(), but the inputs and outputs of these normalization functions are not consistent: whether I run SWAN before or after Funnorm, the output of one is not a valid input for the other. How can I perform both normalizations?
There are definitely some issues with what you're asking; I'll try to address them one by one.
Minfi will not, by default, remove probes with SNPs in the CpG, probe sequence or SBE. You'll need to use the dropLociWithSnps() function, whose maf argument specifies your minor allele frequency cutoff.
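As a rough sketch of the SNP step: dropLociWithSnps() works on objects that have been mapped to the genome, such as the GenomicRatioSet returned by preprocessFunnorm(). The example below assumes the minfiData example package is installed and uses its small RGsetEx.sub object in place of your own data; the maf value of 0.05 is only an illustrative cutoff, not a recommendation.

```r
library(minfi)
library(minfiData)  # assumption: example data package, stands in for your own IDATs

# preprocessFunnorm() returns a GenomicRatioSet, which dropLociWithSnps() accepts
grset <- preprocessFunnorm(RGsetEx.sub)

# Drop probes with a SNP at the CpG site or the single-base extension (SBE) site
# whose minor allele frequency is at least the chosen cutoff
grset_nosnp <- dropLociWithSnps(grset, snps = c("CpG", "SBE"), maf = 0.05)

# Compare probe counts before and after filtering
c(before = nrow(grset), after = nrow(grset_nosnp))
```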
For detection p-value filtering, the older versions of the minfi guide do include some clues as to how to do it. The idea is to identify the failing probes prior to normalisation, and remove them post-normalisation. Here's an example where raw_idat is the raw data read using read.metharray.exp(); it removes probes whose detection p-value is > 0.01 in more than 50% of samples:
    library(minfi)

    # Flag probes whose detection p-value is > 0.01 in more than 50% of samples
    lumi_dpval <- detectionP(raw_idat, type = "m+u")
    lumi_failed <- lumi_dpval > 0.01
    lumi_dpval_remove <- names(which(rowMeans(lumi_failed) > 0.5))
    rm(lumi_dpval, lumi_failed); gc()

    # Normalise, then drop the flagged probes
    set.seed(73)
    norm_data <- preprocessFunnorm(raw_idat, bgCorr = TRUE, dyeCorr = TRUE, verbose = TRUE)
    # Logical indexing is safe even when no probe was flagged
    keep <- !(rownames(norm_data) %in% lumi_dpval_remove)
    norm_data_f <- norm_data[keep, ]
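To make the threshold logic concrete, here is a toy version in base R only. The probe names, sample names and p-values are made up for illustration; the point is that rowMeans() on the logical matrix gives the fraction of samples in which each probe failed, and a probe failing in exactly half the samples is kept because the comparison is strict.

```r
# Toy detection p-value matrix: 4 probes x 4 samples (made-up values)
detp <- matrix(
  c(0.001, 0.002, 0.001, 0.003,   # cg_a: passes in every sample
    0.200, 0.300, 0.001, 0.500,   # cg_b: fails in 3 of 4 samples
    0.050, 0.001, 0.020, 0.001,   # cg_c: fails in 2 of 4 samples
    0.900, 0.800, 0.700, 0.600),  # cg_d: fails in every sample
  nrow = 4, byrow = TRUE,
  dimnames = list(c("cg_a", "cg_b", "cg_c", "cg_d"), paste0("s", 1:4))
)

failed      <- detp > 0.01             # TRUE where a probe fails in a sample
frac_failed <- rowMeans(failed)        # fraction of samples failed, per probe
to_remove   <- names(which(frac_failed > 0.5))

# Keep everything that was not flagged; logical indexing also works
# when to_remove is empty, unlike negative integer indexing
detp_kept <- detp[!(rownames(detp) %in% to_remove), , drop = FALSE]
print(to_remove)
print(rownames(detp_kept))
```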
In terms of normalisation, you use a single method; do not combine them except in very specific circumstances. I believe that preprocessFunnorm() does something similar to SWAN, but with extra steps to regress out technical variation based on control probes. A note on reproducibility: the minfi documentation states that preprocessSWAN() uses a random subset of probes, so call set.seed() before running it. As far as I know, preprocessFunnorm() does not rely on random sampling, but setting the seed first, as in my example above, does no harm.
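The mismatch you ran into can be seen by checking what each function returns. This sketch again assumes the minfiData example package and its RGsetEx.sub object as a stand-in for your data; the seed value is arbitrary.

```r
library(minfi)
library(minfiData)  # assumption: example data package, stands in for your own IDATs

# Both functions take the raw RGChannelSet as input
set.seed(73)        # arbitrary seed, set for reproducibility
mset_swan     <- preprocessSWAN(RGsetEx.sub)
grset_funnorm <- preprocessFunnorm(RGsetEx.sub)

# The outputs are different classes, and neither is an RGChannelSet,
# so the result of one cannot simply be passed to the other
class(RGsetEx.sub)
class(mset_swan)
class(grset_funnorm)
```

Because both methods start from the raw intensities, the usual approach is to pick one and run it on the RGChannelSet rather than chaining them.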
If you're still convinced that you should be using both preprocessFunnorm() and preprocessSWAN(), then please expand on why.