Question: Doing GSEA analysis on a set of genes
3.2 years ago by
United States
c_u290 wrote:

Forgive me if this question has been answered before, but my searching on this site did not provide the precise answer that I need.

I have RNA-seq data for two conditions, A and B. I have the unique gene IDs and their corresponding read counts, which I converted to TPMs. I then calculated a p-value for each gene, comparing the two conditions. Since I also have the expression levels, I created a volcano plot, which gave me the list of genes that have a significant p-value (<0.05) AND a fold-change of 2 or higher in condition B compared to condition A. So now I have a list of ~200 genes with their corresponding p-values.
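For readers who want to reproduce the selection step described above, here is a minimal sketch in plain Python. The gene IDs, TPM values, and p-values are made-up illustration data, and the pseudocount of 1.0 added before taking the ratio is an assumption of this sketch (the post does not say how zero-expression genes were handled).

```python
import math

# Hypothetical pseudocount so genes with TPM 0 in condition A do not divide by zero.
PSEUDOCOUNT = 1.0

def select_significant(genes, p_cutoff=0.05, min_fold_change=2.0):
    """Keep genes with p < p_cutoff and a B/A fold-change >= min_fold_change.

    `genes` is an iterable of (gene_id, tpm_a, tpm_b, p_value) tuples.
    Returns a list of (gene_id, log2_fold_change, p_value) tuples.
    """
    selected = []
    for gene_id, tpm_a, tpm_b, p_value in genes:
        fold_change = (tpm_b + PSEUDOCOUNT) / (tpm_a + PSEUDOCOUNT)
        if p_value < p_cutoff and fold_change >= min_fold_change:
            selected.append((gene_id, math.log2(fold_change), p_value))
    return selected

# Made-up example rows: (gene_id, TPM in A, TPM in B, p-value)
example_genes = [
    ("GENE_A", 10.0, 50.0, 0.001),  # significant and up in B
    ("GENE_B", 20.0, 25.0, 0.010),  # significant but fold-change too small
    ("GENE_C", 5.0, 40.0, 0.200),   # large fold-change but not significant
    ("GENE_D", 0.0, 8.0, 0.030),    # unexpressed in A, rescued by the pseudocount
]

if __name__ == "__main__":
    for gene_id, log2_fc, p_value in select_significant(example_genes):
        print(gene_id, round(log2_fc, 2), p_value)
```

The function keeps the thresholds as parameters so the same cutoffs (0.05 and 2.0) can be applied consistently across repeated analyses.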

Now I need to perform GSEA on this list to see whether it is enriched in any biological pathways. How can I do that? I have installed the GSEA software on my Ubuntu system, and I think (based on what I understood from the GSEA user guide) I need to choose the GSEA mode where I provide my own gene set, but I am not sure.

Any help would be great!

gsea rna-seq enrichment • 3.3k views
3.2 years ago by
Kevin Blighe69k
Republic of Ireland
Kevin Blighe69k wrote:

As mentioned by the others, there are limitations to gene-set enrichment analysis of which you need to be aware. These are in silico analyses, and the results need to be taken with a grain of salt until confirmed by a gold-standard method.

One that I recently tried was GSVA (gene-set variation analysis). Using this, you can compare your genes against the Molecular Signatures Database (MSigDB) curated gene-sets that are held at The Broad Institute. For pathways, you'd want to use the 'C2' gene set, but you can actually load it directly into R and GSVA as the c2BroadSets object (from the GSVAdata package), I believe (see the GSVA tutorial).

Another one that many people use is DAVID.

3.2 years ago by
Prakash2.1k wrote:

You can follow the steps provided in this link.

3.2 years ago by
United States
andrew510 wrote:

Please be aware of the severe limitations of GSEA or any enrichment-based analysis. GSEA will only identify those pathways with a greater number of significant genes than expected by random chance, and will not help establish whether those significant genes are actually contributing to or driving a change in the pathway's regulation. You should try something like iPathwayGuide, which does enrichment and perturbation analysis for each pathway, among other things. Just be sure to submit the entire list of genes that were measured in your RNA-seq experiment and then choose the thresholds (p-value 0.05 and fold-change 2.0) to select your significant genes. You can analyze 3 datasets for free. Have fun!
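To make "a greater number of significant genes than expected by random chance" concrete, here is a minimal sketch of a one-sided hypergeometric over-representation test in plain Python. This is an illustration, not the rank-based running-sum statistic that the GSEA software itself computes, and it is not taken from any tool named in this thread. The function name and the example counts are hypothetical. The background is the full set of measured genes, which is why the complete gene list matters.

```python
from math import comb

def hypergeom_enrichment_p(n_background, n_pathway, n_selected, n_overlap):
    """One-sided over-representation p-value, P(X >= n_overlap).

    n_background: all genes measured in the experiment
    n_pathway:    measured genes annotated to the pathway
    n_selected:   genes passing the significance thresholds
    n_overlap:    selected genes that are also in the pathway
    """
    total = comb(n_background, n_selected)
    max_overlap = min(n_pathway, n_selected)
    # Sum the hypergeometric probability mass over every overlap at least
    # as large as the one observed.
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total

if __name__ == "__main__":
    # Made-up numbers: 20000 measured genes, a 100-gene pathway,
    # 200 selected genes, 10 of which fall in the pathway.
    print(hypergeom_enrichment_p(20000, 100, 200, 10))
```

A small p-value here only says the overlap is larger than chance would give. As noted above, it says nothing about whether those genes drive a change in the pathway. Testing many pathways also requires multiple-testing correction, which this sketch omits.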


Oh, ok. But since I would need to stick to a particular method for doing this multiple times, I think I would try to find something that is free. Thanks for the info!

Reply written 3.2 years ago by c_u290
3.2 years ago by
EagleEye6.7k wrote:

Gene Set Clustering based on Functional annotation (GeneSCF)
