How to prepare a ranked gene list for the statistical enrichment test in PANTHER
3.9 years ago
kousi31 ▴ 100

Hi all,
According to the PANTHER manual, the statistical enrichment test uses the Mann-Whitney U test (Wilcoxon rank-sum test) to find enriched pathways. It requires an input file with two columns (gene ID and expression value). The manual does not clearly specify what the expression values should be.

I have prepared a ranked gene list by calculating expression scores (sign of logFC × −log10 of the p-value), as described in the GSEA manual. However, GSEA uses a modified Kolmogorov-Smirnov test.

My question is whether the expression score used for GSEA can also be used for the PANTHER enrichment test. Expert opinions are welcome. Thank you!
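
For reference, the signed score described in the question can be sketched as follows. This is only a minimal illustration, not taken from the PANTHER or GSEA documentation: the function names, the `min_p` floor, the tab-separated layout, and the example gene IDs and values are all assumptions made up for the sketch.

```python
import math


def signed_score(log_fc, p_value, min_p=1e-300):
    """Return sign(logFC) * -log10(p-value).

    min_p is an assumed floor that avoids log10(0) for p-values reported as 0.
    """
    p = max(p_value, min_p)
    sign = (log_fc > 0) - (log_fc < 0)  # +1, 0 or -1
    return sign * -math.log10(p)


def ranked_gene_lines(rows):
    """Turn (gene_id, log_fc, p_value) tuples into two-column lines.

    Lines are tab-separated (an assumed layout) and sorted by score,
    highest first.
    """
    scored = [(gene, signed_score(fc, p)) for gene, fc, p in rows]
    scored.sort(key=lambda item: item[1], reverse=True)
    return [f"{gene}\t{score:.6g}" for gene, score in scored]


# Hypothetical differential-expression results (made-up values).
example = [
    ("GENE_A", 2.1, 0.001),
    ("GENE_B", -1.4, 0.01),
    ("GENE_C", 0.3, 0.5),
]

for line in ranked_gene_lines(example):
    print(line)
```

Strongly up-regulated genes end up at the top of the list and strongly down-regulated genes at the bottom, with weakly supported changes near zero.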

panther ranked-gene-list pathway-enrichment • 1.9k views

Can you link the tool you use? Doesn't PANTHER simply require a list of gene names?

I used this PANTHER link. Input requirements are mentioned here.

This was mentioned in the manual, under Statistical enrichment test.

For each molecular function, biological process, cellular component, PANTHER protein class, or pathway term in PANTHER, the genes associated with that term are evaluated according to the likelihood that their numerical values were drawn randomly from the overall distribution of values. The Mann-Whitney U Test (Wilcoxon Rank-Sum Test) is used to determine the P-value that, say, the chromatin packaging and remodeling genes have random values relative to overall list of values that were input. To use this test, make sure that the numerical values are included in the input file. See File format for more details. This approach has been used by our group (Clark et al., Science 302: 1960, 2003) [8] and is similar to a method from Eric Lander’s group (Mootha et al., Nature Genetics 34: 267, 2003) [9], to find weakly coordinated shifts that elude methods based on defining strict cutoffs in the data, e.g. only focusing on genes whose expression has changed by over 1.5- or 2-fold. For the rank-sum test, it is important to provide values for as many genes as possible (subject to noise level and reliability) so that randomness can be properly assessed across the experiment.
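
The rank-sum comparison in the quoted passage can be sketched in a few lines. This is a simplified illustration and not PANTHER's actual implementation: it uses a normal approximation without tie or continuity correction, and the gene IDs, scores, and term membership are hypothetical. In practice a library routine such as `scipy.stats.mannwhitneyu` would normally be used instead.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(set_values, other_values):
    """Two-sided Mann-Whitney U test using a plain normal approximation."""
    n1, n2 = len(set_values), len(other_values)
    ranks = average_ranks(list(set_values) + list(other_values))
    rank_sum = sum(ranks[:n1])  # rank sum of the genes in the term
    u1 = rank_sum - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u1, z, p


# Hypothetical per-gene scores and a hypothetical term annotation.
scores = {"G1": 3.0, "G2": 2.5, "G3": 2.0, "G4": 0.3,
          "G5": -0.1, "G6": -1.2, "G7": -2.0, "G8": -2.7}
term_genes = {"G1", "G2", "G3"}

in_term = [v for g, v in scores.items() if g in term_genes]
rest = [v for g, v in scores.items() if g not in term_genes]

u, z, p = mann_whitney_u(in_term, rest)
print(f"U={u}, z={z:.3f}, p={p:.4f}")
```

Because the test works on ranks only, any score that orders the genes the same way yields the same U statistic; what matters is that the ordering reflects the biological question being asked.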

The File format link did not open, so I calculated the expression score following the GSEA manual, as mentioned in the question.

Please use ADD COMMENT/ADD REPLY when responding to existing comments to keep threads logically organized. SUBMIT ANSWER is for new answers to the original question.
