Question: GSEA produces too few enriched sets
5 months ago by
crimsontabaq40
Russia, Kazan
crimsontabaq40 wrote:

We have a transcriptome of a non-model organism. Comparing two conditions, our kallisto-based analysis yielded ~6000 differentially expressed genes (DEGs). KEGG metabolic pathways of a related organism were used to classify the DEGs and test for pathway enrichment. These gene sets are relatively small (3-100 genes per set). Although the number of DEGs is so high, the number of significant sets is quite small: only 10-15 sets are enriched at a padj cutoff of 0.05. We see the same situation when we apply Fisher's exact test.

We're newbies in the field and it feels like we've missed something. What are we doing wrong? Sorry if I've missed any details.

transcriptome gsea kallisto • 268 views

Why is that wrong? What reasons do you have to expect more enriched gene sets in your experiment? What I don't understand is how you used KEGG metabolic pathways of a related organism "to classify DEGs and check pathways enrichment": did you first classify the DEGs based on your knowledge, and then run a GSEA for each group of DEGs? Usually one tests the enrichment of each gene set against all genes, regardless of whether they are DEGs or not.
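To make the "against all genes" point concrete, here is a minimal sketch of a one-sided over-representation test (the hypergeometric form of Fisher's exact test) that uses every detected gene as the background. The function names and the toy gene IDs are made up for illustration; they are not from the original analysis.

```python
from math import comb


def hypergeom_sf(k, n_background, set_size, n_degs):
    """P(X >= k) when n_degs genes are drawn from n_background genes,
    of which set_size belong to the gene set."""
    upper = min(set_size, n_degs)
    total = comb(n_background, n_degs)
    hits = sum(
        comb(set_size, i) * comb(n_background - set_size, n_degs - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def enrichment_pvalue(gene_set, degs, background):
    """Over-representation p-value for one gene set.

    The background is ALL detected genes, not only the DEGs; the gene set
    and the DEG list are both restricted to that background first.
    """
    background = set(background)
    gene_set = set(gene_set) & background
    degs = set(degs) & background
    overlap = len(gene_set & degs)
    return hypergeom_sf(overlap, len(background), len(gene_set), len(degs))


# Toy data (hypothetical IDs): 20 detected genes, 5 DEGs, one 5-gene pathway.
background = [f"g{i}" for i in range(20)]
pathway = ["g0", "g1", "g2", "g3", "g4"]
degs = ["g0", "g1", "g2", "g3", "g4"]
print(enrichment_pvalue(pathway, degs, background))
```

The resulting p-values would still need multiple-testing correction across all tested sets, which is presumably where the padj values in the question come from.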

written 5 months ago by Lluís R.

We ran blastx with our transcripts against the proteins of a related organism whose genes are classified into KEGG pathways, so we can now assign matching transcripts to these pathways. Based on previous experiments, there is strong evidence that some groups should be enriched with DEGs, but they are not; moreover, the two states being compared are radically different at the physiological level.

modified 3 months ago • written 5 months ago by crimsontabaq
5 months ago by
European Union
kristoffer.vittingseerup350 wrote:

One possible explanation is that you have "too many" differentially expressed genes, in the sense that if 30-50% of your detected genes are differentially expressed, it is very hard to obtain large enrichments. I would try a stricter DE cutoff, for example by filtering on the log2FCs.
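A rough sketch of why this happens, under assumed numbers: the thread does not give the total number of detected genes, so 15000 below is a placeholder chosen so that ~6000 DEGs is 40% of the background. If a fraction f of all genes is DE, a set cannot be enriched by more than 1/f, and the best possible p-value for a small set (every member is a DEG) stays large. The function names and the (gene_id, log2fc, padj) tuple layout are hypothetical.

```python
from math import comb


def best_case_pvalue(n_background, n_degs, set_size):
    """Smallest achievable one-sided hypergeometric p-value for a set:
    the case where every gene in the set is a DEG."""
    return comb(n_degs, set_size) / comb(n_background, set_size)


def max_fold_enrichment(n_background, n_degs):
    """Upper bound on (observed DEG fraction in set) / (overall DEG fraction)."""
    return n_background / n_degs


def filter_degs(results, padj_cutoff=0.05, min_abs_log2fc=1.0):
    """Keep genes passing both the padj and the |log2FC| cutoffs.

    results: iterable of (gene_id, log2fc, padj) tuples (assumed layout).
    """
    return [
        gene
        for gene, log2fc, padj in results
        if padj < padj_cutoff and abs(log2fc) >= min_abs_log2fc
    ]


N_BACKGROUND = 15000  # assumption: total detected genes, not stated in the thread

for n_degs in (6000, 1500):  # current DEG count vs. a stricter, hypothetical one
    for set_size in (3, 10):
        p = best_case_pvalue(N_BACKGROUND, n_degs, set_size)
        fold = max_fold_enrichment(N_BACKGROUND, n_degs)
        print(f"DEGs={n_degs} set_size={set_size} "
              f"max_fold={fold:.1f} best_p={p:.2e}")
```

With these assumed numbers, a 3-gene set in which all three genes are DEGs has a raw p-value of roughly 0.064 when 40% of genes are DE, so it cannot pass a 0.05 cutoff even before multiple-testing correction. A stricter DE cutoff shrinks the DEG list and makes much smaller p-values reachable.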
