Question: General questions about running GSEAPreranked
3.1 years ago by
mmccarthy78110 wrote:

Hey all,

I'm currently learning about GSEA in the hope of using it in my analysis of differentially expressed genes, and I have a few questions about the program, specifically about GSEAPreranked, that I'd like cleared up.

1) Should the ranked list used as GSEA input include all genes, or only those that pass a certain significance threshold (e.g. fold change higher than 2, p-value less than 0.05)? Ideally I'd like to sort the genes by fold change alone, as I don't trust my p-values as much, so should I only include genes with high fold changes?

2) I am comparing multiple disease conditions with different treatments. Am I correct that GSEA only compares two conditions? If so, should I run GSEA for each control/treatment comparison? Would this be conventional?

Thanks for the help!

3.1 years ago by
Samuel Brady320 wrote:

Regarding question 1: You should include all genes in GSEA Preranked mode, not just the differentially expressed ones.
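
As a minimal sketch of what "include all genes" looks like in practice, the snippet below builds a two-column, tab-delimited ranked list (the .rnk layout: gene identifier, then ranking score) without applying any fold-change or p-value cutoff. The gene names, the log2 fold change values, and the helper name `write_rnk` are made up for illustration, and ranking by log2 fold change simply follows the preference stated in the question.

```python
import csv
import io

# Hypothetical differential expression results: gene -> log2 fold change.
# Every tested gene is kept, including those with tiny fold changes.
log2_fold_changes = {
    "GENE_A": 3.2,
    "GENE_B": 0.05,
    "GENE_C": -0.4,
    "GENE_D": -2.7,
    "GENE_E": 1.1,
}


def write_rnk(scores, handle):
    """Write gene/score pairs as tab-delimited lines, highest score first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for gene, score in ranked:
        writer.writerow([gene, score])
    return ranked


# An in-memory buffer keeps the sketch self-contained; to write a real file,
# use open("ranked_genes.rnk", "w", newline="") instead (file name is arbitrary).
buffer = io.StringIO()
ranked_genes = write_rnk(log2_fold_changes, buffer)
rnk_text = buffer.getvalue()
```

Note that no gene is filtered out before sorting: the weakly changed genes sit in the middle of the list rather than being dropped.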

Regarding question 2: Yes, GSEA and GSEA Preranked only compare two samples or conditions at a time. You have two options.

(1) Using GSEA, divide your samples into two groups and rank the genes by a metric such as a p-value or the average fold change between the two groups. I would not run GSEA for each treatment/control comparison; just run it once using a ranking that incorporates all replicates in each group.

(2) Run ssGSEA within the GSVA package to get signature scores for all of your signatures of interest in each sample. You will then have a signature score x sample matrix instead of a gene x sample matrix, which is a beautiful thing. ssGSEA would be my preference. Example code to run it is here.


Thanks for the response, it's really informative and is just what I was looking for.

I just have one point I would like clarified. When you say that you would not run GSEA for each treatment/control comparison, do you mean that you wouldn't run GSEA on individual replicates? If I were looking at different drugs, for instance the effect of drug X vs. drug Y on a disease, each compared to an untreated control, I think I would need to do two comparisons: X vs. the control and Y vs. the control. Does this make sense?

written 3.1 years ago by mmccarthy78110

Yes, I am saying I would not run GSEA for each replicate pair. So if you have control1, control2, control3 (replicates), and treated1, treated2, treated3, I would not do control1 vs. treated1, control2 vs. treated2, etc. Rather I would rank the genes in a way that all replicates are incorporated into my gene ranks and run GSEA a single time.

Yes, you would do drug X vs. control, drug Y vs. control, etc. if you are trying to figure out what each drug does.
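
The two points above (fold all replicates into one score per gene, then repeat that once per drug) can be sketched roughly as follows. This is only an illustration: the expression values, the group names, and the function names `signal_to_noise` and `rank_genes` are invented here, and the signal-to-noise ratio (difference in group means divided by the sum of the group standard deviations) is just one possible ranking metric that uses every replicate, not necessarily the one intended in the answer.

```python
from statistics import mean, stdev

# Hypothetical log-scale expression values: gene -> group -> replicate values.
expression = {
    "GENE_A": {"control": [5.0, 5.2, 4.8], "drugX": [8.0, 8.3, 7.7], "drugY": [5.1, 4.9, 5.0]},
    "GENE_B": {"control": [7.0, 7.1, 6.9], "drugX": [7.0, 6.8, 7.2], "drugY": [4.0, 4.2, 3.8]},
    "GENE_C": {"control": [3.0, 3.3, 2.7], "drugX": [2.0, 2.2, 1.8], "drugY": [3.1, 3.0, 2.9]},
}


def signal_to_noise(treated, control):
    """(difference in means) / (sum of standard deviations), using all replicates."""
    return (mean(treated) - mean(control)) / (stdev(treated) + stdev(control))


def rank_genes(data, treatment, control="control"):
    """Return (gene, score) pairs for one treatment vs. control, highest score first."""
    scores = {
        gene: signal_to_noise(groups[treatment], groups[control])
        for gene, groups in data.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# One ranked list per drug; each list would feed a separate GSEAPreranked run.
ranked_lists = {drug: rank_genes(expression, drug) for drug in ("drugX", "drugY")}
```

Each replicate contributes to the group mean and standard deviation, so there is no control1 vs. treated1 pairing, yet drug X and drug Y still get their own comparison against the shared control.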

written 3.1 years ago by Samuel Brady320

Unfortunately, the example code link is no longer valid. Do you know where it has moved, or do you have a copy of your own?

written 2.4 years ago by Michi950