I will try to give you my take on it, since I struggled for a while to understand what that means. Hopefully I have got it now.
So, to run GSEA you have your list of genes (L) and two conditions (or more), e.g. a microarray with normal and tumor samples. The first thing GSEA does is rank the genes in L by how well they separate the conditions, using a ranking metric (for example the signal-to-noise ratio) computed from the probe intensity values. At this point you have a list L ranked from 1 to n.
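The ranking step can be sketched in a few lines of Python. This is only a toy illustration: the gene names and expression values are made up, and the metric is a simplified signal-to-noise ratio (difference of class means divided by the sum of class standard deviations), without the adjustments the real GSEA software applies.

```python
from statistics import mean, stdev

def signal_to_noise(tumor, normal):
    """Simplified signal-to-noise for one gene: (mean_T - mean_N) / (sd_T + sd_N)."""
    return (mean(tumor) - mean(normal)) / (stdev(tumor) + stdev(normal))

# Hypothetical expression values: gene -> (tumor samples, normal samples)
expression = {
    "GENE_B": ([7.0, 7.2, 6.8], [7.1, 6.9, 7.0]),   # no real difference
    "GENE_C": ([4.0, 4.5, 5.0], [8.0, 8.5, 9.0]),   # lower in tumor
    "GENE_A": ([9.0, 9.5, 10.0], [5.0, 5.5, 6.0]),  # higher in tumor
}

# One score per gene, then sort from most up in tumor to most down in tumor
scores = {gene: signal_to_noise(t, n) for gene, (t, n) in expression.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for position, gene in enumerate(ranked, start=1):
    print(position, gene, round(scores[gene], 2))
```

Genes that are higher in tumor end up at the top of the ranked list, genes that are lower in tumor at the bottom, and genes that do not separate the conditions sit in the middle.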
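Now you want to see whether the genes in a gene set (S) are at the top or at the bottom of your list, or whether they are just spread around randomly. To do that GSEA walks down L and calculates the famous enrichment score (ES). The ES becomes the normalized enrichment score (NES) once it is scaled to account for the size of the gene set, which makes the scores comparable across gene sets. The correction for multiple testing (FDR) is then computed from the NES values.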
A positive NES indicates that the genes in set S are mostly at the top of your list L. A negative NES indicates that the genes in set S are mostly at the bottom of your list L.
Let's say that S1 has a positive NES and S2 has a negative NES. Let's also say that your list of 1000 genes is ordered from the most upregulated (top: 1, 2, 3, ...) to the most downregulated (bottom: ..., n-3, n-2, n-1, n). A positive NES for S1 means that the genes in that gene set are mostly upregulated in your dataset. A negative NES for S2 indicates the opposite: its genes are mostly downregulated.
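To make the sign of the score concrete, here is a rough Python sketch of the running-sum enrichment score and a normalized version of it. Everything in it is an assumption for illustration: the 20 genes and their scores are invented, the function names are mine, and the null distribution is built by drawing random gene sets of the same size (the GSEA software normally permutes the sample labels instead). The normalization divides the ES by the mean of the null scores that have the same sign.

```python
import random

def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Walk down the ranked list: step up at genes in the set (weighted by
    |score|**p), step down otherwise. Return the largest deviation from zero."""
    in_set = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(in_set)
    hit_total = sum(abs(scores[g]) ** p for g, h in zip(ranked_genes, in_set) if h)
    running, best = 0.0, 0.0
    for gene, hit in zip(ranked_genes, in_set):
        if hit:
            running += abs(scores[gene]) ** p / hit_total
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

def normalized_es(ranked_genes, scores, gene_set, n_perm=1000, seed=0):
    """ES divided by the mean null ES of the same sign (random same-size sets)."""
    rng = random.Random(seed)
    es = enrichment_score(ranked_genes, scores, gene_set)
    null = [
        enrichment_score(ranked_genes, scores,
                         set(rng.sample(ranked_genes, len(gene_set))))
        for _ in range(n_perm)
    ]
    same_sign = [x for x in null if (x >= 0) == (es >= 0)]
    if not same_sign:
        return float("nan")
    return es / abs(sum(same_sign) / len(same_sign))

# Hypothetical ranked list: g0 is the most upregulated, g19 the most downregulated
ranked_genes = [f"g{i}" for i in range(20)]
scores = {g: 9.5 - i for i, g in enumerate(ranked_genes)}

s1 = {"g0", "g1", "g2"}      # genes at the top of the list
s2 = {"g17", "g18", "g19"}   # genes at the bottom of the list

es_s1 = enrichment_score(ranked_genes, scores, s1)
es_s2 = enrichment_score(ranked_genes, scores, s2)
nes_s1 = normalized_es(ranked_genes, scores, s1)
nes_s2 = normalized_es(ranked_genes, scores, s2)
print(es_s1, es_s2, nes_s1, nes_s2)
```

With these toy inputs S1 gets a positive score and S2 a negative one, which matches the interpretation above: the sign tells you at which end of the ranked list the gene set is concentrated.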
In the results you will also find a heatmap of the subset of your data that belongs to the signature analyzed. Generally, what I saw is that the more significantly enriched the gene set is, the better the separation between the two conditions in the heatmap.
Hopefully I understood it right and this helps. Otherwise, please correct me :)
Subramanian et al. 2005
5.6 years ago by
TriS • 4.2k