Question: General questions about running GSEAPreranked
mmccarthy781 wrote, 13 months ago:

Hey all,

I'm currently learning about GSEA in the hope of using it to analyze differentially expressed genes, and I have a few questions about the program, specifically about GSEAPreranked.

1) Should the ranked list used as GSEA input include all genes, or only those that pass a significance threshold (e.g. fold change higher than 2, p-value less than 0.05)? Ideally I'd like to sort the genes by fold change alone, as I don't trust my p-values as much. In that case, should I include only genes with high fold changes?

2) I am comparing multiple disease conditions with different treatments. Am I correct that GSEA compares only two conditions? If so, should I run GSEA separately for each control/treatment comparison? Would this be conventional?

Thanks for the help!

gsea gseapreranked • 752 views
Samuel Brady wrote, 13 months ago:

Regarding question 1: You should include all genes in GSEA Preranked mode, not just the differentially expressed ones.
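As a rough illustration of the point above, here is a minimal sketch that writes every gene to a ranked list, sorted by log2 fold change, with no significance filter applied. The gene names and values are made up, and `write_rnk` is a hypothetical helper, not part of GSEA; the output follows the two-column, tab-delimited layout (gene, score) of a `.rnk` file.

```python
import csv
import io


def write_rnk(rows, out):
    """Write ALL genes as gene<TAB>score lines, sorted by descending score.

    rows: iterable of (gene, log2_fold_change) pairs. No gene is filtered out.
    out:  any writable text file object.
    Returns the gene order that was written.
    """
    ranked = sorted(rows, key=lambda r: r[1], reverse=True)
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for gene, lfc in ranked:
        writer.writerow([gene, f"{lfc:.4f}"])
    return [gene for gene, _ in ranked]


# Hypothetical differential expression results: (gene, log2 fold change).
# Genes with small fold changes (GENE_B, GENE_D) are kept on purpose.
de_results = [
    ("GENE_A", 2.31),
    ("GENE_B", -0.04),
    ("GENE_C", -1.87),
    ("GENE_D", 0.12),
]

buf = io.StringIO()  # stand-in for open("ranked_genes.rnk", "w")
order = write_rnk(de_results, buf)
rnk_text = buf.getvalue()
```

In practice you would pass a real file handle instead of the `io.StringIO` buffer and load the resulting file into GSEAPreranked.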

Regarding question 2: Yes, GSEA and GSEA Preranked compare only two groups of samples (two conditions). You have two options.

(1) Using GSEA, divide your samples into two groups and rank the genes by a metric such as a p-value or the average fold change between the two groups. I would not run GSEA for each treatment/control comparison; just run it once using a ranking that incorporates all replicates in each group.

(2) Run ssGSEA within the GSVA package to get signature scores for all of your signatures of interest in each sample. You will then have a signature score x sample matrix instead of a gene x sample matrix, which is a beautiful thing. ssGSEA would be my preference. Example code to run it is here.
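To make the "signature score x sample matrix" idea concrete, here is a simplified sketch of a single-sample, rank-based enrichment score in the spirit of ssGSEA. It is not the GSVA implementation (GSVA is an R/Bioconductor package and should be used for real analyses). The function names, the rank weighting exponent `alpha=0.25`, and all data are assumptions made for illustration.

```python
def ssgsea_like_score(expr, gene_set, alpha=0.25):
    """Simplified single-sample enrichment score (illustrative only).

    expr:     dict mapping gene -> expression value for ONE sample.
    gene_set: set of genes making up the signature.
    Genes are ordered from highest to lowest expression. Walking down the
    list, signature genes add a rank-weighted step, other genes add an equal
    step, and the score is the summed difference of the two running totals.
    """
    ordered = sorted(expr, key=expr.get, reverse=True)
    n = len(ordered)
    in_set = [gene in gene_set for gene in ordered]
    n_hits = sum(in_set)
    if n_hits == 0 or n_hits == n:
        raise ValueError("signature must overlap some, but not all, genes")

    # The top gene gets rank n, the bottom gene rank 1; ranks are raised to alpha.
    weights = [(n - i) ** alpha if hit else 0.0 for i, hit in enumerate(in_set)]
    total_weight = sum(weights)

    cum_hit = cum_miss = score = 0.0
    for weight, hit in zip(weights, in_set):
        if hit:
            cum_hit += weight / total_weight
        else:
            cum_miss += 1.0 / (n - n_hits)
        score += cum_hit - cum_miss
    return score


def score_matrix(samples, signatures):
    """Return {sample: {signature: score}} for every sample and signature."""
    return {
        name: {sig: ssgsea_like_score(expr, genes) for sig, genes in signatures.items()}
        for name, expr in samples.items()
    }


# Hypothetical expression values (gene -> value) for two samples.
samples = {
    "s1": {"G1": 9.0, "G2": 8.0, "G3": 2.0, "G4": 1.0},
    "s2": {"G1": 1.0, "G2": 2.0, "G3": 8.0, "G4": 9.0},
}
signatures = {"SIG_UP": {"G1", "G2"}}  # hypothetical signature

scores = score_matrix(samples, signatures)
```

Each sample gets its own score per signature, so any number of conditions can be compared afterwards with ordinary statistics on the score matrix.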


Thanks for the response, it's really informative and is just what I was looking for.

I just have one question about it that I would like clarified. When you say that you would not run GSEA for each treatment/control comparison, do you mean for individual replicates you wouldn't run GSEA? I feel that if I were looking at different drugs, for instance the effect drug X has on a disease vs drug Y when compared to a control with no treatment, I would need to do two comparisons, X vs the control and Y vs the control. Does this make sense?

Reply written 13 months ago by mmccarthy781

Yes, I am saying I would not run GSEA for each replicate pair. So if you have control1, control2, control3 (replicates), and treated1, treated2, treated3, I would not do control1 vs. treated1, control2 vs. treated2, etc. Rather I would rank the genes in a way that all replicates are incorporated into my gene ranks and run GSEA a single time.

Yes, you would do drug X vs. control, drug Y vs. control, etc. if you are trying to figure out what each drug does.
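The two replies above can be sketched as follows: pool the replicates within each group, compute one score per gene, and build one ranked list per drug-vs-control comparison. The metric used here (log2 ratio of group means with a pseudocount of 1) is only an assumption for illustration; other ranking metrics are also common. All sample names and values are hypothetical.

```python
import math


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of group means; the pseudocount avoids division by zero."""
    mean_t = sum(treated) / len(treated)
    mean_c = sum(control) / len(control)
    return math.log2((mean_t + pseudocount) / (mean_c + pseudocount))


def rank_comparison(expr, treated_samples, control_samples):
    """Rank ALL genes for one treated-vs-control comparison.

    expr: dict gene -> {sample_name: expression value}.
    Every replicate in each group contributes to the gene's score.
    """
    scores = {
        gene: log2_fold_change(
            [values[s] for s in treated_samples],
            [values[s] for s in control_samples],
        )
        for gene, values in expr.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical expression table with three replicates per group.
expr = {
    "GENE_A": {"control1": 10, "control2": 12, "control3": 11,
               "drugX1": 40, "drugX2": 44, "drugX3": 48,
               "drugY1": 10, "drugY2": 11, "drugY3": 12},
    "GENE_B": {"control1": 20, "control2": 22, "control3": 21,
               "drugX1": 20, "drugX2": 21, "drugX3": 22,
               "drugY1": 5, "drugY2": 6, "drugY3": 4},
    "GENE_C": {"control1": 8, "control2": 8, "control3": 8,
               "drugX1": 4, "drugX2": 4, "drugX3": 4,
               "drugY1": 16, "drugY2": 16, "drugY3": 16},
}
control = ["control1", "control2", "control3"]
groups = {
    "drugX": ["drugX1", "drugX2", "drugX3"],
    "drugY": ["drugY1", "drugY2", "drugY3"],
}

# One ranked list per drug, each using all replicates of both groups.
ranked_lists = {
    drug: rank_comparison(expr, samples, control)
    for drug, samples in groups.items()
}
```

Each entry in `ranked_lists` could then be written out and run through GSEAPreranked once, giving one run per drug rather than one run per replicate pair.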

Reply written 13 months ago by Samuel Brady

Unfortunately, the example code link is no longer valid. Do you know where it has moved, or do you have a copy of your own?

Reply written 4 months ago by Michi