Question

Should I aggregate probeset expression by mean instead of considering them as independent instances?

9.8 years ago

Juan Cordero ▴ 140

Dear community,

I am a newbie in microarray data analysis, and after reading a growing number of tutorials and forum threads I am getting confused.

To find differentially expressed genes after normalization, many people run the downstream analyses treating probes as if they were independent instances. Some even filter the data to keep only the most informative probe within a given probeset (e.g. the one with the highest t-statistic). However, I've been told that different probes within the same probeset tend to show widely variable expression values, and that it therefore makes sense to aggregate all probes belonging to a probeset and take their mean as the expression of the gene they map to. If this last argument is true, should I disregard all the tutorials that tackle the problem on a probe-wise basis?

I am mainly analysing 3'UTR microarrays.
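
To make the two options concrete, here is a minimal sketch on toy data. It is written in plain Python rather than R only to keep it dependency-free. The probe IDs, gene names, expression values, and function names are all made up for illustration, and the Welch t-statistic is just one possible way to rank probes, not necessarily what any given tutorial uses.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance

# Toy data (hypothetical): log2 expression per probe across 6 samples.
# The first three samples are controls, the last three are treated.
expr = {
    "probe_a1": [5.0, 5.2, 4.8, 7.0, 7.2, 6.8],
    "probe_a2": [6.0, 6.1, 5.9, 6.2, 6.0, 6.1],
    "probe_b1": [8.0, 8.1, 7.9, 8.0, 8.2, 7.8],
}
probe_to_gene = {"probe_a1": "GENE_A", "probe_a2": "GENE_A", "probe_b1": "GENE_B"}
is_treated = [False, False, False, True, True, True]


def welch_t(values, is_treated):
    """Welch t-statistic (treated minus control) for one expression row."""
    treated = [v for v, t in zip(values, is_treated) if t]
    control = [v for v, t in zip(values, is_treated) if not t]
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return (mean(treated) - mean(control)) / se


def aggregate_by_mean(expr, probe_to_gene):
    """Option 1: average all probes mapping to a gene, sample by sample."""
    rows = defaultdict(list)
    for probe, values in expr.items():
        rows[probe_to_gene[probe]].append(values)
    return {gene: [mean(col) for col in zip(*probes)] for gene, probes in rows.items()}


def pick_best_probe(expr, probe_to_gene, is_treated):
    """Option 2: keep, per gene, the single probe with the largest |t|."""
    best = {}
    for probe, values in expr.items():
        gene = probe_to_gene[probe]
        score = abs(welch_t(values, is_treated))
        if gene not in best or score > best[gene][1]:
            best[gene] = (probe, score)
    return {gene: probe for gene, (probe, _) in best.items()}


gene_means = aggregate_by_mean(expr, probe_to_gene)
best_probes = pick_best_probe(expr, probe_to_gene, is_treated)
```

In this toy example the two options disagree for `GENE_A`: averaging dilutes the clear shift seen in `probe_a1` with the nearly flat `probe_a2`, while the best-probe filter keeps `probe_a1` alone.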

Microarray-data Bioconductor R • 2.6k views

Updated 2.8 years ago by Ram 45k

Ram · Accepted Answer · 2015-10-08

Have you seen this page? It is short but touches on several problems with summarizing multiple probes. Similar questions have come up several times here on Biostars, e.g., here and here.

In summary, there are several opinions, and microarrays will disappear before this question is settled (and for several people, microarrays disappearing IS the answer to this question). Finally, for a more or less recent approach to this problem, see this paper.