Hi, I want to run GSEA in R on significantly differentially expressed genes from non-model species (five in total).
My research is cross-species comparative transcriptomics, and this is what I have done so far:
- For each species I already have: a de novo assembly, annotations (across 7 different databases), quantification (read counts), CDS predictions...
- Next, I ran TransDecoder, selected the longest ORFs, and used those peptide sequences to detect single-copy orthologues across species with OrthoFinder.
- I combined gene lengths and gene counts to create a gene expression matrix.
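To make the matrix step concrete, here is a minimal sketch in plain Python. It uses TPM-style length scaling as one possible way to combine counts and lengths. That choice is an assumption for illustration, not necessarily what was done here, and the orthogroup IDs, species labels, and numbers are all hypothetical. Each orthologue is scaled by its own species-specific length, and the per-sample total is taken over the shared orthologue set only.

```python
# Hypothetical inputs: one entry per single-copy orthogroup, holding each
# species' own transcript length (bp) and raw read count for that orthologue.
lengths = {
    "OG0001": {"sp1": 1200, "sp2": 900},
    "OG0002": {"sp1": 600, "sp2": 1500},
}
counts = {
    "OG0001": {"sp1": 240, "sp2": 90},
    "OG0002": {"sp1": 60, "sp2": 300},
}


def tpm_matrix(counts, lengths):
    """Return {orthogroup: {sample: TPM}} using species-specific lengths."""
    samples = sorted({s for row in counts.values() for s in row})
    # Reads per kilobase: divide each count by that species' length in kb.
    rpk = {
        og: {s: counts[og][s] / (lengths[og][s] / 1000) for s in samples}
        for og in counts
    }
    # Per-sample scaling factor, computed over the shared orthologues only.
    totals = {s: sum(rpk[og][s] for og in rpk) for s in samples}
    return {
        og: {s: rpk[og][s] / totals[s] * 1e6 for s in samples}
        for og in rpk
    }


tpm = tpm_matrix(counts, lengths)
for og, row in tpm.items():
    print(og, {s: round(v, 1) for s, v in row.items()})
```

Length-scaled values like these are handy for comparing expression across species, while count-based differential expression tools generally expect the raw counts as input, so it may be worth keeping both tables.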
Now I am planning to run differential expression analysis on my orthologues. I am still working out the best approach, since I need some kind of normalization to account for the different species/transcriptomes, but I guess that is another topic...
Let's say I have my DEG list and I want to do GSEA. I learned how to do that in R for human RNA-seq data, where one step is loading the human gene set database (MSigDB, https://www.gsea-msigdb.org/gsea/msigdb). My question is: what should I do when I have non-model species? How do I make my own database?
Can I make it from my annotation list? If yes, how do I do that?
P.S. I am working on isopods with no reference genomes. :)
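To illustrate what a custom database built from an annotation list could look like, here is a minimal sketch in plain Python. It assumes a hypothetical tab-separated annotation table with a `gene_id` column and a comma-separated `go_terms` column, and the real column names and separators will differ between annotation tools. It flattens the table into two-column (term, gene) pairs, which is the general shape that custom gene set inputs take (for example, the `TERM2GENE` argument of clusterProfiler's `enricher()` and `GSEA()`).

```python
import csv
import io
from collections import defaultdict

# Hypothetical annotation table: one row per gene/orthogroup, with a
# comma-separated list of GO terms (empty when nothing was assigned).
annotation_tsv = """gene_id\tgo_terms
OG0001\tGO:0006412,GO:0005840
OG0002\tGO:0006412
OG0003\t
"""


def term_to_gene(handle, sep=","):
    """Flatten an annotation table into sorted, unique (term, gene) pairs."""
    reader = csv.DictReader(handle, delimiter="\t")
    pairs = set()
    for row in reader:
        for term in (row["go_terms"] or "").split(sep):
            term = term.strip()
            if term:  # skip genes without any annotation
                pairs.add((term, row["gene_id"]))
    return sorted(pairs)


pairs = term_to_gene(io.StringIO(annotation_tsv))

# Group the pairs into gene sets: one list of genes per term.
gene_sets = defaultdict(list)
for term, gene in pairs:
    gene_sets[term].append(gene)

# Write the two-column table that can be read into R afterwards.
out = io.StringIO()
writer = csv.writer(out, delimiter="\t", lineterminator="\n")
writer.writerow(["term", "gene"])
writer.writerows(pairs)
print(out.getvalue())
```

The gene IDs in this table have to match the IDs used in the expression matrix (here, the orthogroup IDs), otherwise no genes will map to any set.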
OK, great! Thank you so much for your quick response! I was thinking of clusterProfiler, since I have already tried it, but it's good to know there are other options too.