Hello. Using DESeq2 on my data (control vs. treatment), I got a list of genes with their adjusted p-values and log2 fold changes. I observed that the core genes of a specific category were significantly downregulated. After I ran GSEA, the trend reversed, showing upregulation instead. It turns out the GO annotation it was using contained many other affiliated genes besides the core genes, and those were upregulated. My question is this: I want to use the GSEA heatmap because it gives a good overall picture of the main categories, but for some categories it shows the opposite trend. In such a case, what should I do?
So, are you saying that I should use something like ORA (over-representation analysis)?
No, walk me through your process and let's check how you ended up with different regulation. For visualization, you can get some ideas from this: https://yulab-smu.top/biomedical-knowledge-mining-book/index.html
I used featureCounts to get gene-level counts from RNA-seq BAMs, then ran DESeq2 to get log2FoldChange values and adjusted p-values. Then I took the full list ranked by log2FoldChange and ran GSEA with gseapy, using GO_Biological_Process_2023 as the gene set library.
In the DESeq2 list, I could see many core transcriptional genes being significantly downregulated. But in the GSEA list, the genes in that process, or related to it, appeared upregulated. I extracted the specific annotation from GO_Biological_Process_2023, and it turns out that while it does contain many downregulated genes (the ones I saw earlier), it also contains other genes in the process (not necessarily core, but important), many of which are upregulated, and I think that skews the GSEA ranking. Even when another category related or specific to transcription (since GO categories can overlap) does show as downregulated, it is not significant.
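A check like the one described above (how many members of a GO term go up versus down) could be sketched roughly as follows. This is only an illustration: the gene names, values, the `tally_directions` helper, and the 0.05 cutoff are all made up, not taken from the actual data.

```python
# Hypothetical DESeq2 results: gene -> (log2FoldChange, padj).
# All names and numbers here are illustrative placeholders.
results = {
    "GENE_A": (-2.1, 0.001),
    "GENE_B": (-1.4, 0.02),
    "GENE_C": (1.8, 0.004),
    "GENE_D": (0.9, 0.03),
    "GENE_E": (1.2, 0.2),
    "GENE_F": (-0.3, 0.6),
}

# Hypothetical members of one GO term (GENE_X is not in the results).
gene_set = {"GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E", "GENE_X"}


def tally_directions(results, gene_set, padj_cutoff=0.05):
    """Count gene set members by direction of change (assumed cutoff: padj < 0.05)."""
    counts = {"up": 0, "down": 0, "not_significant": 0, "not_measured": 0}
    for gene in gene_set:
        if gene not in results:
            counts["not_measured"] += 1
            continue
        lfc, padj = results[gene]
        if padj < padj_cutoff and lfc > 0:
            counts["up"] += 1
        elif padj < padj_cutoff and lfc < 0:
            counts["down"] += 1
        else:
            counts["not_significant"] += 1
    return counts


counts = tally_directions(results, gene_set)
print(counts)
```

If the "up" and "down" counts are both substantial, the term is mixed, and a single enrichment score for it will hide that.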
Try ranking by log2FoldChange * -log10(p-value) and, if possible, cross-check the result with the library in the link I sent earlier. How can you tell that they are upregulated in the GSEA list? Are there more core transcriptional genes that are significantly downregulated than upregulated?
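To make the ranking suggestion concrete, here is a minimal sketch of that score in plain Python. The gene names and values are invented, `rank_score` is a hypothetical helper, and the p-value floor is an assumption added only to avoid taking the log of zero.

```python
import math


def rank_score(log2fc, pvalue, floor=1e-300):
    """log2FoldChange * -log10(p-value), with p clamped to an assumed floor."""
    # Clamping avoids an infinite score when a p-value is reported as exactly 0.
    return log2fc * -math.log10(max(pvalue, floor))


# Hypothetical DESeq2 output: gene -> (log2FoldChange, pvalue).
deseq = {
    "GENE_A": (-2.0, 1e-6),
    "GENE_B": (3.0, 0.5),
    "GENE_C": (1.0, 1e-4),
}

# Ranking by fold change alone puts the noisy GENE_B first.
by_lfc = sorted(deseq, key=lambda g: deseq[g][0], reverse=True)

# Ranking by the combined score rewards genes that are both changed and significant.
ranked = sorted(
    ((gene, rank_score(lfc, p)) for gene, (lfc, p) in deseq.items()),
    key=lambda item: item[1],
    reverse=True,
)

print(by_lfc)
print(ranked)
```

In this toy example GENE_B has the largest fold change but a weak p-value, so it drops below GENE_C once significance is factored in. The sign of the score still follows the sign of log2FoldChange, so the direction of change is preserved in the ranked list.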
P.S. You can actually split the genes into two sets, upregulated and downregulated (using fixed p-value and log2FoldChange thresholds), and then test each set separately by over-representation analysis. That way you can see whether the downregulated or the upregulated genes are significantly over-represented in your pathway of interest.
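As a rough sketch of that split-then-test idea, the snippet below uses a one-sided hypergeometric test written with the standard library only. The thresholds (padj < 0.05, |log2FoldChange| >= 1), the toy genes, and the helper names are assumptions for illustration; a real analysis would test many gene sets and correct for multiple testing.

```python
from math import comb


def hypergeom_pvalue(k, population, set_size, drawn):
    """P(X >= k): chance of at least k set members among `drawn` genes."""
    total = comb(population, drawn)
    upper = min(set_size, drawn)
    hits = sum(
        comb(set_size, i) * comb(population - set_size, drawn - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def split_by_direction(results, padj_cutoff=0.05, lfc_cutoff=1.0):
    """Split significant genes into up and down sets (assumed thresholds)."""
    up = {g for g, (lfc, padj) in results.items()
          if padj < padj_cutoff and lfc >= lfc_cutoff}
    down = {g for g, (lfc, padj) in results.items()
            if padj < padj_cutoff and lfc <= -lfc_cutoff}
    return up, down


def ora(selected, gene_set, background):
    """Over-representation of `gene_set` among `selected`, given a background."""
    members = gene_set & background
    overlap = selected & members
    p = hypergeom_pvalue(len(overlap), len(background), len(members), len(selected))
    return len(overlap), p


# Toy data: 20 measured genes, most of them unchanged.
background = {f"G{i}" for i in range(20)}
results = {g: (0.1, 0.9) for g in background}
for g in ("G0", "G1", "G2", "G3", "G6"):
    results[g] = (-2.0, 0.01)   # significantly down
for g in ("G4", "G5", "G7"):
    results[g] = (1.5, 0.01)    # significantly up

gene_set = {f"G{i}" for i in range(5)}  # hypothetical pathway of interest

up, down = split_by_direction(results)
down_k, down_p = ora(down, gene_set, background)
up_k, up_p = ora(up, gene_set, background)
print("down:", down_k, down_p)
print("up:", up_k, up_p)
```

Here the pathway is over-represented among the downregulated genes but not among the upregulated ones, which is the kind of direction-specific answer a single enrichment score does not give.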
I believe you are right. My mistake! I think I was confusing NES with upregulation/downregulation. It is a little confusing.