How to combine expression values of multiple probes for one gene?
2.4 years ago
bhargavdave1 ▴ 10

I am using stress-response microarray data from the Human Brain Atlas (http://human.brain-map.org/microarray/search/show?search_term=58&search_type=gene_classification), which contains 1149 probes in total. I want to combine the expression values of multiple probes into a single expression value per gene. Is there an established method for combining probe-level expression values into gene-level values? If so, please point me to a book or research paper that describes it, and to an R package that implements it, if one exists.

Note: I need this for downstream analysis of the gene expression data.

probe R gene expression combine gene • 2.2k views
2.4 years ago

This should be done on a 'case by case' basis, and summarising the expression should be justified.

limma provides easy functionality for this, as follows:

data_summarised <- limma::avereps(
  data,
  ID = gene)


Here, gene is a vector of gene identifiers with one entry per row of data, in the same order as the rows. avereps() then takes the mean, within each sample (column), over all rows that share the same value in gene.
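As a rough sketch of how such a gene vector might be built, the snippet below looks up each probe in an annotation table with match(). The table annot, its column names (probe_id, gene_symbol), and the toy values are assumptions for illustration only and are not taken from the Human Brain Atlas download. The avereps() call is only run if limma is installed.

```r
# Hypothetical probe annotation: one row per probe (names are made up)
annot <- data.frame(
  probe_id    = c("p3", "p1", "p2", "p4"),
  gene_symbol = c("GENE_B", "GENE_A", "GENE_A", "GENE_B"),
  stringsAsFactors = FALSE
)

# Toy expression matrix: probes in rows, samples in columns
data <- matrix(
  c(1, 2, 3, 4,
    5, 6, 7, 8),
  nrow = 4,
  dimnames = list(c("p1", "p2", "p3", "p4"), c("S1", "S2"))
)

# Look up the gene for each row of data, in row order
gene <- annot$gene_symbol[match(rownames(data), annot$probe_id)]

# Probes without an annotation would show up as NA and need handling first
stopifnot(!anyNA(gene))

# Average probes per gene, as in the call above (needs limma installed)
if (requireNamespace("limma", quietly = TRUE)) {
  data_summarised <- limma::avereps(data, ID = gene)
}
```

Using match() rather than assuming the annotation is already sorted keeps gene aligned with the rows of data even when the two tables are in different orders.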

This function was initially developed to summarise across replicate probes.

Reproducible example:

a <- matrix(rexp(200, rate=.1), ncol=20)
rownames(a) <- c(rep("a", 5), rep("g", 5))
limma::avereps(a, ID = rownames(a))
      [,1]     [,2]     [,3]     [,4]     [,5]      [,6]     [,7]     [,8]
a 4.086404 2.436660 8.220130 10.36580 11.46436 17.689969 14.42429 13.53203
g 8.271113 7.593843 9.395702 14.56003 13.07174  9.928446 18.92534 14.84183
      [,9]     [,10]    [,11]    [,12]     [,13]     [,14]     [,15]     [,16]
a 10.93435 11.245397 14.44341 11.09513 15.632908  6.982594  5.212455 11.748156
g 19.31578  5.836391 11.37889 10.89469  4.175368 14.668283 10.516478  7.597563
      [,17]    [,18]     [,19]     [,20]
a  5.840073 2.898039 11.821731 12.896772
g 15.031233 6.441335  3.950631  3.877679


## -------------------------------------------------

Base R also has aggregate(), which can summarise with any function (for example median or max) rather than only the mean.
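A minimal sketch of the aggregate() route, using the median instead of the mean. The matrix expr, the vector gene, and all values are toy data made up for illustration; the median is just one possible choice of summary function.

```r
# Toy data: 6 probes (rows) x 4 samples (columns)
expr <- matrix(
  as.numeric(1:24),
  nrow = 6,
  dimnames = list(paste0("probe", 1:6), paste0("S", 1:4))
)

# Gene for each probe, in row order
gene <- c("A", "A", "A", "B", "B", "C")

# Summarise each sample column by gene; FUN can be any summary function
agg <- aggregate(expr, by = list(gene = gene), FUN = median)

# aggregate() returns a data frame; convert back to a genes x samples matrix
gene_expr <- as.matrix(agg[, -1])
rownames(gene_expr) <- agg$gene
```

Unlike avereps(), aggregate() returns a data frame with the grouping variable as its first column, hence the conversion back to a matrix at the end.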

Kevin


Thanks for the answer. Could you point me to a research paper that uses this method, i.e. combining probe expression values into gene-level values using the mean or aggregate()? I want to use it in my academic work, so I need a citable reference, and I would also like to know whether the method is reliable.


Your reference is Gordon Smyth of limma. That supersedes everything else in bioinformatics :)

No, seriously, it is a standard procedure in microarray and gene expression analysis. Look in the limma manual, at least for the justification of the procedure in microarrays. In published work, you may or may not see it mentioned in the methods, depending on whether the analyst who wrote them chose to report it.

Of course, there are other ways of summarising data when transcript isoforms come into question. For that, I will refer you to the tximport, DESeq2, and edgeR manuals.