*.rnk file preparation for GSEA analysis
4.6 years ago
o.mikh90 ▴ 30

Hi,

I've performed RNA-seq comparing the expression profiles of a wild-type vs. a mutant mouse line, and obtained a list of differentially expressed genes with DESeq2. Now I'm looking into performing Gene Set Enrichment Analysis using either the tool from the Broad Institute http://software.broadinstitute.org/gsea/index.jsp or the online tool WebGestalt http://www.webgestalt.org/ Both require an input file in *.rnk format that contains gene names in one column and the corresponding rank value in the other. Here's the guideline from Broad:

"Prior to conducting gene set enrichment analysis, conduct your differential expression analysis using any of the tools developed by the bioinformatics community (e.g., cuffdiff, edgeR, DESeq, etc). Based on your differential expression analysis, rank your features and capture your ranking in an RNK-formatted file. The ranking metric can be whatever measure of differential expression you choose from the output of your selected DE tool. For example, cuffdiff provides the (base 2) log of the fold change."

Let's say I filtered my gene list by FDR < 0.01 and |FC| > 1.5, and I now want to perform GSEA to find out which gene sets are over- or underrepresented in my mutant line. How exactly do I get from there to the ranked gene list (.rnk)? Is it as trivial as using the R rank() function?

Thanks a lot in advance

RNA-Seq GSEA Enrichment analysis R webgestalt • 8.1k views

Filtering by FDR looks good to me, but filtering by an arbitrary fold-change cutoff may be a source of bias. I usually filter by FDR and use the complete matrix, with the log2 fold change as the rank. The ranked list for WebGestalt is a two-column list: gene and rank (log2 fold change).
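To make the two-column layout concrete, here is a minimal sketch in Python (standard library only) rather than a definitive recipe. It assumes the DESeq2 results were exported to a CSV with the usual `log2FoldChange` and `padj` columns, and that the gene identifiers sit in a column named `gene`; that column name, the helper names `parse_float`, `build_rnk` and `write_rnk`, and the example values are all made up for illustration. Note that no `rank()` call is involved: the second column holds the metric itself, and the rows are only sorted by it.

```python
import csv
import io
import math


def parse_float(text):
    """Return a float, or None for blanks and DESeq2-style 'NA' values."""
    try:
        value = float(text)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(value) else value


def build_rnk(rows, gene_col="gene", metric_col="log2FoldChange",
              padj_col="padj", padj_cutoff=None):
    """Build (gene, metric) pairs sorted from most up- to most down-regulated.

    rows: dicts such as those produced by csv.DictReader.
    padj_cutoff: optional FDR filter; None keeps every gene with a metric.
    The column names are assumptions about how the table was exported.
    """
    ranked = []
    for row in rows:
        metric = parse_float(row.get(metric_col))
        if metric is None:
            continue  # genes without a fold change cannot be ranked
        if padj_cutoff is not None:
            padj = parse_float(row.get(padj_col))
            if padj is None or padj >= padj_cutoff:
                continue
        ranked.append((row[gene_col], metric))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


def write_rnk(ranked, handle):
    """Write tab-separated gene/metric lines without a header row."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerows(ranked)


# Hypothetical input, standing in for an exported DESeq2 results table.
example_csv = """gene,log2FoldChange,padj
Actb,0.10,0.90
Myc,2.50,0.001
Trp53,-1.80,0.004
Gapdh,NA,NA
"""

rows = list(csv.DictReader(io.StringIO(example_csv)))
ranked = build_rnk(rows)  # pass padj_cutoff=0.01 to filter by FDR first
out = io.StringIO()
write_rnk(ranked, out)
print(out.getvalue(), end="")
```

To write an actual file, open one with `open("genes.rnk", "w", newline="")` (the file name is just an example) and pass it to `write_rnk` in place of the `StringIO` buffer.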
