Question: *.rnk file preparation for GSEA analysis
17 months ago, o.mikh90 wrote:


I've performed RNA-seq comparing the expression profiles of a wild-type vs. a mutant mouse line, and obtained a list of differentially expressed genes with DESeq2. Now I'm looking into performing Gene Set Enrichment Analysis using either the tool from the Broad Institute or the online tool WebGestalt. Both require an input file in (*.rnk) format that contains gene names in one column and a corresponding rank value in the other. Here's the guideline from Broad: "Prior to conducting gene set enrichment analysis, conduct your differential expression analysis using any of the tools developed by the bioinformatics community (e.g., cuffdiff, edgeR, DESeq, etc). Based on your differential expression analysis, rank your features and capture your ranking in an RNK-formatted file. The ranking metric can be whatever measure of differential expression you choose from the output of your selected DE tool. For example, cuffdiff provides the (base 2) log of the fold change."

Let's say I filtered my gene list by FDR < 0.01 and |FC| > 1.5, and I now want to perform GSEA to find out which gene sets are over- or underrepresented in my mutant line. How exactly should I get to the ranked gene list (.rnk)? Is it as trivial as using the R rank() function?

Thanks a lot in advance


Filtering by FDR looks good to me; however, filtering by an arbitrary FC cutoff may be a source of bias. I usually filter by FDR and use the complete matrix, with the log2 fold change as the rank. The ranked list for WebGestalt is a two-column list: gene and rank (log fold change).
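As a rough illustration of that two-column list, here is a minimal Python sketch that turns a differential expression table into tab-separated gene/score lines. It rests on several assumptions: the DESeq2 results were exported to CSV, the gene identifiers sit in a column named `gene` (hypothetical; DESeq2 itself keeps them as row names), and the score is DESeq2's `log2FoldChange` column. The helper name `write_rnk` and the example values are made up for demonstration. Sorting from highest to lowest score is only for readability here.

```python
import csv
import io
import math


def write_rnk(de_rows, out_handle, gene_col="gene", score_col="log2FoldChange"):
    """Write (gene, score) pairs as tab-separated lines, highest score first.

    de_rows: iterable of dicts, e.g. from csv.DictReader.
    gene_col / score_col: column names; both defaults are assumptions
    about how the DE table was exported.
    """
    ranked = []
    for row in de_rows:
        try:
            score = float(row.get(score_col, ""))
        except (TypeError, ValueError):
            continue  # skip entries such as "NA" or empty cells
        if math.isnan(score):
            continue  # skip explicit NaN values
        ranked.append((row[gene_col], score))

    # Order genes by the ranking metric, largest value first.
    ranked.sort(key=lambda pair: pair[1], reverse=True)

    writer = csv.writer(out_handle, delimiter="\t", lineterminator="\n")
    writer.writerows(ranked)
    return ranked


# Made-up example table standing in for an exported DESeq2 result.
example_csv = """gene,log2FoldChange,padj
Actb,0.10,0.80
Myc,2.50,0.001
Trp53,-1.75,0.004
Gapdh,NA,NA
"""

rows = csv.DictReader(io.StringIO(example_csv))
out = io.StringIO()
ranked = write_rnk(rows, out)
print(out.getvalue(), end="")
```

To write an actual file, an opened handle such as `open("genes.rnk", "w", newline="")` could be passed in place of the `io.StringIO` buffer (the file name is a placeholder). The sketch deliberately applies no fold-change cutoff, in line with the suggestion to rank the complete matrix.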

Reply written 17 months ago by Buffo