What's the meaning of "random walk" in the GSEA paper published by the Broad Institute?
8.0 years ago
rolyata47 ▴ 40

The Gene Set Enrichment Analysis (GSEA) algorithm, outlined in this paper, http://www.broadinstitute.org/gsea/doc/subramanian_tamayo_gsea_pnas.pdf, often refers to a "random walk" used to traverse the ranked list L of gene-to-phenotype correlations.

However, what they actually do in the paper does not look like a random walk at all. It seems to me that they traverse the ranked list L sequentially, from rank 1 (highest correlation) onwards.

I was wondering if anyone could clear up what they mean by "random walk", and why they use the term when it looks like they are doing a sequential walk, which is quite the opposite.


Also, as a follow-up question, how is it that they do not bias the top of the ranked list L over the bottom? If we assume for the moment that they are doing a sequential walk, which seems to be the case, then gene sets found at the bottom extreme will have a larger value for P_miss by the time their genes are reached, since P_miss grows with the position i. As a consequence, they will have smaller enrichment scores.
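For reference, the quantities in question can be written out as follows. This is a transcription of the running-sum definitions as I read them in the paper, with N the number of genes in L, N_H the number of genes in the set S, r_j the correlation of gene g_j, and p the weighting exponent; treat it as a sketch and check it against the paper.

```latex
P_{\mathrm{hit}}(S, i) = \sum_{\substack{g_j \in S \\ j \le i}} \frac{|r_j|^{p}}{N_R},
\qquad
N_R = \sum_{g_j \in S} |r_j|^{p}

P_{\mathrm{miss}}(S, i) = \sum_{\substack{g_j \notin S \\ j \le i}} \frac{1}{N - N_H}

ES(S) = P_{\mathrm{hit}}(S, i^{*}) - P_{\mathrm{miss}}(S, i^{*}),
\qquad
i^{*} = \arg\max_{i} \left| P_{\mathrm{hit}}(S, i) - P_{\mathrm{miss}}(S, i) \right|
```

Note that the ES is the signed maximum deviation from zero, so a large P_miss ahead of the hits drives the score strongly negative rather than toward zero.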

Perhaps this is related to the question above, since a sequential walk does not seem to work here...

I appreciate any help... I suspect I am not understanding something correctly...

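Hey, thanks! I think this article made it clear. They are comparing the supremum of the running sum (the ES) with what it would be for a random walk. Gene sets concentrated at the top or the bottom of the list will have an ES of large magnitude (positive at the top, negative at the bottom), while gene sets that are randomly distributed through the list will produce a running sum that resembles a random walk. Thanks!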
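To make that concrete, here is a minimal sketch, assuming the unweighted (p = 0) form of the running sum and a toy ranked list of 1000 made-up gene IDs. The function name `enrichment_score` and the three example sets are hypothetical and only for illustration; this is not the GSEA software's implementation.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Signed maximum deviation from zero of the unweighted running sum.

    Walks the ranked list sequentially: step up by 1/N_H at each gene in
    the set (a hit), step down by 1/(N - N_H) at each gene outside it.
    Assumes the set is non-empty and does not cover the whole list.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the sign of the extreme deviation
    return best


# Toy ranked list: index 0 is the top rank (hypothetical gene IDs).
genes = list(range(1000))
top_set = set(genes[:50])        # all members at the top of the list
bottom_set = set(genes[-50:])    # all members at the bottom of the list
rng = random.Random(0)
random_set = set(rng.sample(genes, 50))  # members scattered at random

es_top = enrichment_score(genes, top_set)
es_bottom = enrichment_score(genes, bottom_set)
es_random = enrichment_score(genes, random_set)
print(es_top, es_bottom, es_random)
```

The traversal is sequential, but for the randomly scattered set the hits arrive at unpredictable positions, so the running sum wanders like a random walk and stays close to zero. The top and bottom sets push it to roughly +1 and -1 respectively, so neither end of the list is favoured in magnitude.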
