Question: Doing GSEA analysis on a set of genes
17 months ago by
United States
chahat_u110 wrote:

Forgive me if this question has been answered before, but my searching on this site did not provide the precise answer that I need.

I have RNA-seq data for 2 conditions, A and B. I have the unique gene IDs, and their corresponding reads, which I converted to TPMs. Then, I calculated the p-value for these 2 conditions for all the genes. Now since I have the expression levels also, I created a volcano plot which basically gave me the list of genes that have a significant p-value for these conditions (<0.05) AND for which the fold-change is 2 times or higher for condition B as compared to condition A. So now I have a list of ~200 genes with their corresponding p-values.

Now I need to perform GSEA on this list of genes, to see whether it is enriched in any biological pathway, etc. How can I do that? I have installed the GSEA software on my Ubuntu system, and I think (based on what I understood from the GSEA user guide) I need to choose the GSEA mode where I provide my own gene set, but I am not sure.

Any help would be great!

gsea rna-seq enrichment • 1.7k views
17 months ago by
Kevin Blighe39k
Republic of Ireland
Kevin Blighe39k wrote:

As mentioned by the others, there are limitations to gene-set enrichment analysis about which you need to be aware. They are in silico analyses and the results need to be taken with a grain of salt until proven by another gold standard method.

One that I recently tried was GSVA (gene-set variation analysis). Using this, you can compare your genes against the Molecular Signatures Database (MSigDB) curated gene-sets that are held at the Broad Institute. For pathways, you'd want to use the 'C2' gene set, but you can actually load it directly into R as the c2BroadSets object from the GSVAdata package, I believe (see the GSVA tutorial).
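For anyone who prefers to work from a downloaded gene-set file rather than the R data object: MSigDB collections are distributed in the tab-separated GMT format (set name, description, then the member genes). The following is only a minimal Python sketch of reading such text and intersecting it with a gene list. The function name `parse_gmt`, the set names and the gene symbols are made up for illustration and are not real MSigDB content.

```python
from typing import Dict, Set


def parse_gmt(text: str) -> Dict[str, Set[str]]:
    """Parse GMT-formatted text into {gene_set_name: set_of_genes}."""
    gene_sets: Dict[str, Set[str]] = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        gene_sets[name] = {g for g in genes if g}
    return gene_sets


# Toy input: hypothetical set names and gene symbols, for illustration only.
toy_gmt = (
    "PATHWAY_X\thttp://example.org/x\tGENE1\tGENE2\tGENE3\n"
    "PATHWAY_Y\tna\tGENE3\tGENE4\n"
)
gene_sets = parse_gmt(toy_gmt)

# Hypothetical list of significant genes from the volcano-plot filter.
my_genes = {"GENE2", "GENE3", "GENE9"}

# Which of my genes fall into each gene set?
overlaps = {name: sorted(genes & my_genes) for name, genes in gene_sets.items()}
print(overlaps)
```

The overlap alone says nothing about significance; it is only the input that an enrichment test would then work on.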

Another one that many people use is DAVID.

17 months ago by
Prakash730 wrote:

You can follow the steps provided in this link.

17 months ago by
United States
andrew460 wrote:

Please be aware of the severe limitations of GSEA or any enrichment-based analysis. GSEA will only identify those pathways with a greater number of significant genes than expected by random chance; it will not help establish whether those significant genes are actually contributing to or driving a change in the pathway's regulation. You should try something like iPathwayGuide, which does enrichment and perturbation analysis for each pathway, among other things. Just be sure to submit the entire list of genes that were measured in your RNA-seq experiment and then choose the thresholds (p-value 0.05 and fold change 2.0) to select your significant genes. You can analyze 3 datasets for free. Have fun!
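To make the "more significant genes than expected by random chance" idea concrete, here is a rough Python sketch of a plain over-representation test using the hypergeometric distribution, with all measured genes as the background. This is an assumption about what a basic enrichment step looks like; it is not the GSEA algorithm and not iPathwayGuide's method. The function names and the toy data are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n genes without replacement from N genes,
    K of which belong to the pathway."""
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)


def select_significant(results, p_cutoff=0.05, fc_cutoff=2.0):
    """results maps gene -> (p_value, fold_change of B over A)."""
    return {g for g, (p, fc) in results.items() if p < p_cutoff and fc >= fc_cutoff}


def over_representation(results, pathway, p_cutoff=0.05, fc_cutoff=2.0):
    """Return (number of significant genes in the pathway, hypergeometric p-value)."""
    background = set(results)            # every gene that was measured
    significant = select_significant(results, p_cutoff, fc_cutoff)
    in_pathway = pathway & background    # ignore pathway genes that were never measured
    hits = significant & in_pathway
    p = hypergeom_sf(len(hits), len(background), len(in_pathway), len(significant))
    return len(hits), p


# Toy data: 20 measured genes, 4 of which pass both thresholds.
results = {f"g{i}": (0.5, 1.1) for i in range(20)}
for g in ("g0", "g1", "g2", "g3"):
    results[g] = (0.01, 3.0)

# Hypothetical pathway; "gX" was not measured and is dropped from the test.
pathway = {"g0", "g1", "g2", "g10", "gX"}

n_hits, p_value = over_representation(results, pathway)
print(n_hits, p_value)
```

The background matters here: restricting it to the ~200 significant genes would leave nothing to compare against. When many pathways are tested, the resulting p-values would also need a multiple-testing correction.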


Oh, ok. But since I would need to stick to a particular method for doing this multiple times, I think I would try to find something that is free. Thanks for the info!

Reply written 16 months ago by chahat_u
17 months ago by
EagleEye6.2k wrote:

Gene Set Clustering based on Functional annotation (GeneSCF)
