I'm using R to get expression values from raw microarray data (downloaded from GEO). Now that I have the expression values, I'd like to know which probes I should remove. Looking around, I saw people using the nsFilter method. I need to analyze the expression values using a data mining approach (I'm not a biologist). Is it enough to remove all the control probes? Should I keep all the other probes, even the ones with low variance? What is a typical approach?
At the end of the day, it depends completely on your goals (N.B., saying "...a data mining approach" is rather like saying that you'll go from point A to point B with a "movement approach"). In any case, you'll likely want to get rid of probes whose signal is not significantly higher than that seen in control probes, as these likely represent background noise.
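As a rough illustration of that background filter, here is a minimal base R sketch. Everything in it is an assumption made for the example: the expression matrix is simulated, the `NEG_` prefix used to mark negative control probes is hypothetical (check your platform's annotation for how controls are actually labelled), and the 95th percentile cutoff and "at least half the samples" rule are arbitrary choices you should tune to your goals.

```r
# Simulated log2 expression matrix: rows are probes, columns are samples.
# All values are made up purely for illustration.
set.seed(1)
n_samples <- 6
expr <- rbind(
  matrix(rnorm(5 * n_samples,  mean = 4, sd = 0.3), nrow = 5),   # negative controls
  matrix(rnorm(10 * n_samples, mean = 4, sd = 0.3), nrow = 10),  # background-level probes
  matrix(rnorm(10 * n_samples, mean = 9, sd = 0.5), nrow = 10)   # clearly expressed probes
)
rownames(expr) <- c(paste0("NEG_", 1:5), paste0("bg", 1:10), paste0("gene", 1:10))
colnames(expr) <- paste0("sample", seq_len(n_samples))

# Assumption: negative control probes are identifiable by a "NEG_" name prefix.
is_control <- grepl("^NEG_", rownames(expr))

# Background threshold: 95th percentile of all control probe intensities.
threshold <- quantile(expr[is_control, ], probs = 0.95)

# Count, for each probe, the samples in which it exceeds the threshold.
n_above <- rowSums(expr > threshold)

# Keep non-control probes that are above background in at least half the samples.
keep <- !is_control & n_above >= ncol(expr) / 2
filtered <- expr[keep, , drop = FALSE]

dim(filtered)
```

With real data you would replace the simulated `expr` with your own matrix of expression values. Whether to additionally drop low-variance probes (which is what a variance filter such as the one in nsFilter does) again depends on what you plan to do downstream.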