Question: Microarray data gene name annotation and multiple probes mapped to one gene.
10 months ago by MatthewP (620), China
MatthewP wrote:

Hello, I have googled a lot but am still confused. First, what is the major difference between running the differential expression analysis (limma::lmFit) before gene name annotation, and annotating gene names in the expression matrix first and then calling differential expression?
Second, how should I choose a strategy for handling multiple probes mapped to the same gene? Some use the average value, some use the mean, and some use the largest value. If I care about the whole gene, maybe the average or mean value is better. But if I call differential expression at the probe level and then merge (combine) the data from multiple probes into one gene, I don't think it is appropriate to average logFC or P.val. Can you share your experience here? Thanks.

limma microarray • 232 views
10 months ago by Kevin Blighe (59k)
Kevin Blighe wrote:

For your first question, assuming that you have a data matrix of n x m dimensions, it makes no difference what the rows are called. limma will fit a linear regression model to each row (gene or probe) independently, irrespective of the gene names. Relabelling the rows alone therefore does not change the statistics.
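
As a rough illustration of that point, here is a minimal sketch that fits the same simulated matrix twice, once with probe IDs as row names and once with made-up gene symbols. The data, sample layout, and names (`probe_1`, `GENE1`, `ctrl`, `trt`) are all assumptions for the example, not from the original post.

```r
library(limma)

set.seed(1)
# Hypothetical data: 100 probes x 6 samples (3 control, 3 treated)
expr <- matrix(rnorm(600), nrow = 100,
               dimnames = list(paste0("probe_", 1:100), paste0("s", 1:6)))
group <- factor(rep(c("ctrl", "trt"), each = 3))
design <- model.matrix(~ group)

# Fit on the matrix labelled with probe IDs
fit_probe <- eBayes(lmFit(expr, design))

# Same values, rows relabelled with made-up gene symbols
expr_named <- expr
rownames(expr_named) <- paste0("GENE", 1:100)
fit_gene <- eBayes(lmFit(expr_named, design))

# Compare the statistics after dropping the row labels
same_t <- all.equal(unname(fit_probe$t), unname(fit_gene$t))
same_p <- all.equal(unname(fit_probe$p.value), unname(fit_gene$p.value))
```

Both comparisons should come back `TRUE`, since only the labels differ. Note that this only covers renaming: collapsing several probes into one row before fitting changes the input values, so the results would differ in that case.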

For your second question, there is no clear answer without knowing the array type and version that you are using. Each array is designed differently.

For example, some Affymetrix arrays are designed with probe-sets (a probe-set consists of multiple probes that are related to each other) that target exons, while others are designed with probe-sets that target entire genes. For Affymetrix arrays, probe-level summarisation during RMA normalisation can be controlled to some extent, as described in this thread: How to map probeset associated statistics to gene statistics in microarray differential expression analysis?

If, after that, you still find that you have duplicate genes in your expression matrix, then you can summarise these by mean or median expression.
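
A minimal base R sketch of that mean or median summarisation, assuming a log2 expression matrix with one row per probe and a probe-to-gene vector in the same row order. The helper `collapse_by_gene`, the matrix values, and the gene names are hypothetical. Collapsing the expression values before fitting the model sidesteps the question of averaging logFC or P-values afterwards.

```r
# Hypothetical log2 expression matrix: 5 probes x 3 samples
expr <- matrix(c( 1,  2,  3,
                  3,  4,  5,
                 10, 10, 10,
                  0,  6,  3,
                  2,  2,  9),
               nrow = 5, byrow = TRUE,
               dimnames = list(paste0("probe_", 1:5), c("s1", "s2", "s3")))

# Hypothetical probe-to-gene mapping, one entry per row of expr
gene <- c("GeneA", "GeneA", "GeneB", "GeneC", "GeneC")

# Collapse rows that share a gene symbol using a chosen summary function
collapse_by_gene <- function(x, ids, fun = mean) {
  groups <- split(seq_len(nrow(x)), ids)
  out <- t(vapply(groups,
                  function(i) apply(x[i, , drop = FALSE], 2, fun),
                  numeric(ncol(x))))
  colnames(out) <- colnames(x)
  out
}

expr_mean   <- collapse_by_gene(expr, gene, mean)
expr_median <- collapse_by_gene(expr, gene, median)
```

The resulting gene-level matrix (one row per gene) can then be passed to lmFit in place of the probe-level one.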

For Agilent arrays, well, these are again designed differently, and RMA normalisation cannot be used for these. For summarisation, however, you can use the avereps() function in limma.
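
For reference, a small sketch of how avereps() might be called on a matrix with repeated row IDs. The toy matrix and the IDs `GeneA` and `GeneB` are made up for illustration; on real Agilent data the IDs would come from the array annotation.

```r
library(limma)

# Hypothetical matrix in which two rows share the same ID
expr <- matrix(c(1, 2,
                 3, 4,
                 5, 6),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("GeneA", "GeneA", "GeneB"), c("s1", "s2")))

# avereps() averages rows that share an ID; ID defaults to rownames(x)
expr_avg <- avereps(expr, ID = rownames(expr))
```

Here the two `GeneA` rows are averaged into one, leaving a two-row matrix.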

I will not comment on Illumina cDNA arrays.

Kevin
