Question: What to do with outliers from Microarray quality control?
peter pfand wrote, 3.4 years ago:

Dear community,

I am new to microarrays and have just run quality control with arrayQualityMetrics from Bioconductor on a dataset of 18 samples from my lab. Based on the distance between arrays and the boxplot distributions, I get 3 outliers before normalization and 2 after. What should I do with these outliers?

Thanks in advance

svlachavas wrote, 3.4 years ago:

Dear Peter,

Quality control is generally performed regardless of the downstream goals, but how you act on it often depends on the downstream analysis you want to perform. arrayQualityMetrics and other packages such as array QC are excellent; how you interpret their output, however, varies. You mentioned that you get 3 outlier samples before normalization and 2 after. I will assume that your primary goal is some kind of differential expression analysis. In that case, although I don't know your comparison or your group sizes, 18 samples in total is a more or less average sample size. So:

  • First, check whether these specific samples are flagged as outliers in a substantial number of the diagnostic plots, and not in just one or two of them.


  • Second, after normalization, check with specific EDA plots: a boxplot (which you have probably already made), a hierarchical clustering of your samples, the wonderful plots from limma (such as plotMDS and plotMA), and/or a PCA focused on the comparisons that matter to you. See whether any sample or group of samples is a "consistent outlier". Keep in mind that this is a rather ad hoc procedure: once you start removing "outlier samples", another sample may show up as an outlier, and so on.


  • Thus, in my opinion, you should use these exploratory plots to confirm basic data quality and to spot any odd effects (such as batch effects). If you then decide that you have "consistently unreliable samples", down-weight them rather than removing them immediately.
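To make the "flagged by several metrics, then down-weighted" idea concrete, here is a rough sketch in plain Python. It is not what arrayQualityMetrics or limma actually compute: the toy data, the two scores, and the `1 / (1 + flags)` weighting are all assumptions chosen for illustration. In R, limma's `arrayWeights` estimates array weights from the data itself and would be the more principled route.

```python
from statistics import mean, median, quantiles

# Toy log2 expression values (hypothetical): sample -> values for the same four probes.
expr = {
    "s1": [5.0, 6.0, 7.0, 8.0],
    "s2": [5.1, 6.1, 7.1, 8.1],
    "s3": [4.9, 5.9, 6.9, 7.9],
    "s4": [5.0, 6.1, 7.0, 8.1],
    "s5": [5.1, 6.0, 7.1, 8.0],
    "s6": [8.0, 9.0, 10.0, 11.0],  # deliberately shifted array
}

def distance_score(expr):
    """Sum of mean absolute differences between each array and all others."""
    return {
        a: sum(mean(abs(x - y) for x, y in zip(expr[a], expr[b]))
               for b in expr if b != a)
        for a in expr
    }

def median_shift_score(expr):
    """Absolute shift of each array's median from the median of all medians
    (a simple stand-in for a boxplot-based criterion)."""
    medians = {s: median(v) for s, v in expr.items()}
    centre = median(medians.values())
    return {s: abs(m - centre) for s, m in medians.items()}

def flagged(scores):
    """Samples whose score exceeds the upper boxplot fence, Q3 + 1.5 * IQR."""
    q1, _, q3 = quantiles(scores.values(), n=4)
    fence = q3 + 1.5 * (q3 - q1)
    return {s for s, v in scores.items() if v > fence}

# One set of flagged samples per metric.
flag_sets = [flagged(distance_score(expr)), flagged(median_shift_score(expr))]

# How many metrics flag each sample: the "consistent outlier" check.
flag_counts = {s: sum(s in f for f in flag_sets) for s in expr}

# Assumed down-weighting scheme: keep every sample, shrink its weight per flag.
weights = {s: 1.0 / (1 + n) for s, n in flag_counts.items()}

print(flag_counts)
print(weights)
```

A sample flagged by only one of the metrics would keep a weight of 0.5 here, which matches the advice above: do not discard an array on the strength of a single plot.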


On the other hand, if you want to perform another kind of analysis, such as reconstructing a co-expression network or something similar, and you detect such samples based on multiple metrics, then yes, you could remove them before continuing.
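One reason removal is more defensible for co-expression work is that correlation is sensitive to a single extreme sample. The sketch below uses made-up values for two hypothetical genes and an assumed table of per-sample flag counts (how many QC metrics flagged each sample); it only illustrates the effect and is not a network reconstruction method.

```python
from math import sqrt

# Hypothetical log2 expression of two genes across six samples.
gene_a = {"s1": 5.0, "s2": 5.2, "s3": 4.8, "s4": 5.1, "s5": 4.9, "s6": 9.0}
gene_b = {"s1": 6.1, "s2": 5.9, "s3": 6.0, "s4": 5.8, "s5": 6.2, "s6": 10.0}

# Assumed QC summary: number of metrics that flagged each sample.
flag_counts = {"s1": 0, "s2": 0, "s3": 0, "s4": 0, "s5": 0, "s6": 2}
MIN_FLAGS_TO_DROP = 2  # assumption: drop only samples flagged by multiple metrics

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def coexpression(samples):
    """Correlation of the two genes over the given samples."""
    return pearson([gene_a[s] for s in samples], [gene_b[s] for s in samples])

all_samples = list(gene_a)
keep = [s for s in all_samples if flag_counts[s] < MIN_FLAGS_TO_DROP]

r_all = coexpression(all_samples)  # includes the flagged array
r_kept = coexpression(keep)        # flagged array removed

print(f"all samples: r = {r_all:.2f}")
print(f"after removal: r = {r_kept:.2f}")
```

With the flagged array included, the two genes look strongly co-expressed; without it, the apparent relationship disappears, which is the kind of artifact that would otherwise end up as an edge in the network.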


