Question: How to map probeset associated statistics to gene statistics in microarray differential expression analysis?
moxu wrote (2.7 years ago):

For microarray data, differential expression analysis is done for each probeset. The problem is that one gene is typically mapped to multiple probesets. Since, for most if not all practical purposes, we are interested in differential expression at the gene level rather than the probeset level, I am wondering what the best way is to map a probeset-level analysis to a gene-level analysis. For instance, there are multiple probesets for a gene, and each probeset has its own p-value, fold-change, etc. When we map the probesets to the corresponding gene, should we take the probeset with the smallest p-value and use its statistics for the gene? Or the median p-value? The mean? Something else?

Thanks in advance!

Tags: rna-seq, R, gene
Kevin Blighe wrote (2.7 years ago):

NB - added July 31, 2020: see also C: Human Exon array probeset to gene-level expression

----

For microarray analysis, one key parameter of the rma() function (as implemented in the oligo package) relates to your question during RMA normalisation: target

Summarise probe-level expression to genes (or exons):

rma(MyCELfiles, background=TRUE, normalize=TRUE, target="core")

Functionality of this depends on the array type. If you have a 'Gene' array, then expression is summarised to genes. If you have an 'Exon' array, then it will be summarised to Exons.

Summarise at probe-set level:

rma(MyCELfiles, background=TRUE, normalize=TRUE, target="probeset")

---------------------------------------

Two further options are available for 'Exon' arrays:

  • target = "full"
  • target = "extended"

If you still cannot obtain the correct level of summarisation with these, then just summarise the probesets of each gene by their mean expression via the aggregate() function.
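As a rough illustration of the aggregate() route, here is a minimal sketch on toy data. The matrix `expr`, the lookup vector `probe2gene`, and the probeset and gene names are all made up for the example; in practice the mapping would come from the annotation package for your array.

```r
# Toy probeset-level expression matrix (rows = probesets, columns = samples).
# All names and values are hypothetical.
expr <- matrix(
  c(5.0, 7.0, 6.0, 8.0,
    5.2, 7.4, 6.2, 8.4,
    9.0, 9.5, 3.0, 3.5),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ps1", "ps2", "ps3"), c("s1", "s2", "s3", "s4"))
)

# Hypothetical probeset-to-gene lookup (normally taken from array annotation).
probe2gene <- c(ps1 = "GENE_A", ps2 = "GENE_A", ps3 = "GENE_B")

# Average all probesets that map to the same gene, sample by sample.
gene_expr <- aggregate(
  as.data.frame(expr),
  by = list(gene = unname(probe2gene[rownames(expr)])),
  FUN = mean
)

# Convert back to a numeric matrix with genes as row names.
gene_mat <- as.matrix(gene_expr[, -1])
rownames(gene_mat) <- gene_expr$gene
```

The resulting `gene_mat` has one row per gene and can be passed on to the differential expression step in place of the probeset-level matrix.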

Kevin


I used GEO2R, offered by the GEO database, and it generates the p-value, logFC, etc.

Thanks!

Reply written 2.7 years ago by moxu
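When only a probeset-level results table is available (as with a GEO2R export), one of the options raised in the question, keeping the probeset with the smallest p-value per gene, could be sketched as below. This is an illustration on made-up data, not a recommendation: the column names `ID`, `Gene.symbol`, `logFC` and `P.Value` are assumed to match the exported table, and picking the minimum p-value per gene tends to make the retained p-values optimistic.

```r
# Toy probeset-level results table; all values are hypothetical.
tt <- data.frame(
  ID          = c("ps1", "ps2", "ps3", "ps4"),
  Gene.symbol = c("GENE_A", "GENE_A", "GENE_B", ""),
  logFC       = c(1.2, 0.8, -2.0, 0.5),
  P.Value     = c(0.001, 0.04, 0.0005, 0.2),
  stringsAsFactors = FALSE
)

# Drop probesets without a gene annotation.
tt <- tt[tt$Gene.symbol != "", ]

# Sort by p-value, then keep the first (smallest p-value) row per gene.
tt <- tt[order(tt$P.Value), ]
gene_tt <- tt[!duplicated(tt$Gene.symbol), ]

# Recompute the multiple-testing adjustment on the reduced table.
gene_tt$adj.P.Val <- p.adjust(gene_tt$P.Value, method = "BH")
```

If the raw data can be re-processed, summarising expression to gene level before testing, as described in the answer above, avoids this selection step altogether.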