Doublechecking enrichment of promoters of DE genes
3 months ago
Aspire ▴ 300

I have performed an enrichment analysis over the promoters of genes that are differentially expressed between two conditions, and there were highly significant results. However, I have seen that random promoters also give significant enrichment results.

Hence, I want to doublecheck that the results I get are different from the enrichment signature of random promoters.

To get a random list of promoters:

I have downloaded the list of all human genes (GRCh38.p14) from BioMart. From this list, I have taken a subset of 2000 genes (simply the first 2000 genes in alphabetical order).
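Note that the first 2000 genes in alphabetical order are not a uniform random draw, since gene families with shared name prefixes cluster together. A minimal sketch of a reproducible random draw instead, assuming the BioMart export has been read into a plain list of gene IDs (the IDs below are hypothetical stand-ins):

```python
import random

def sample_genes(gene_ids, k, seed=0):
    """Return k distinct gene IDs drawn uniformly at random (reproducible via seed)."""
    # De-duplicate, then sort so that the seed alone determines the draw.
    unique = sorted(set(gene_ids))
    if k > len(unique):
        raise ValueError("k exceeds the number of unique gene IDs")
    return random.Random(seed).sample(unique, k)

# Toy stand-in for the real gene list (hypothetical IDs).
all_genes = [f"GENE{i}" for i in range(10000)]
subset = sample_genes(all_genes, 2000)
print(len(subset))
```

Fixing the seed makes the "random" list reproducible, so the same control set can be regenerated later.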

I have selected the promoters for these 2000 genes via EPD, with default settings (no options selected).

https://epd.expasy.org/epd/EPDnew_select.php

3264 promoters were selected, and I have exported a FASTA file covering -1000 to +100.

This was uploaded to SEA (Simple Enrichment Analysis) from the MEME Suite.

The results are here https://meme-suite.org/meme//opal-jobs/appSEA_5.5.517041990574241307765695/sea.html

For example Motif

Upon visual inspection, there is high concordance between the motifs found in my list and those found in the random list.

** If you know the tools, are the default parameter selections reasonable?

** How do I reasonably decide which motifs enriched in my own data are valid, and which are no better than the enrichment results for random genes?


As others have commented, this is an impressive example of why backgrounds are critical in enrichment analysis. You're currently testing promoters vs. genome, which of course primarily returns bona fide promoter motifs. Here, as background, you could use promoters of genes with good evidence of not being differential.

3 months ago

Generally you want to use a tool like MEME with your background model being all plausible promoters (potentially leaving out the experimental set). This will help reduce the occurrence of motifs common to most promoters. You can also try the differential motif option comparing your experimental set against the set of all plausible promoters (optionally minus your experimental set).
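A minimal sketch of building the "all plausible promoters minus the experimental set" background from two FASTA files. It assumes the first whitespace-delimited token of each header is a promoter ID shared between both files; the records below are toy data, not real EPD output.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def record_id(header):
    """Assumed convention: the ID is the first token of the header line."""
    tokens = header.split()
    return tokens[0] if tokens else ""

def subtract_set(all_records, experimental_records):
    """Keep records whose ID does not occur in the experimental set."""
    exp_ids = {record_id(h) for h, _ in experimental_records}
    return [(h, s) for h, s in all_records if record_id(h) not in exp_ids]

def to_fasta(records):
    """Serialize records back to FASTA, one sequence line per record."""
    return "".join(f">{h}\n{s}\n" for h, s in records)

# Toy input (hypothetical IDs and sequences).
all_fa = ">P1 geneA\nACGT\nAC\n>P2 geneB\nGGCC\n>P3 geneC\nTTAA\n"
exp_fa = ">P2 geneB\nGGCC\n"
background = subtract_set(read_fasta(all_fa), read_fasta(exp_fa))
print(to_fasta(background), end="")
```

The resulting FASTA can then be supplied as the control sequence set.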


Specifically in this case, you want to select "User Provided Sequences" under "Select the type of control sequences to use", and then upload a file of background promoter sequences under "Input the control sequences". As a starting point, you could use the list of random promoters you downloaded if you can't think of anything better.
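For a local MEME Suite install, the same choice can be sketched on the command line. This assumes the `sea` options `--p` (primary sequences), `--n` (control sequences), `--m` (motif file) and `--o` (output directory); please check them against the SEA documentation for your installed version. All file names are hypothetical.

```python
import shlex

def build_sea_command(primary_fasta, control_fasta, motif_file, out_dir):
    """Assemble a SEA call that uses user-provided control sequences."""
    return [
        "sea",
        "--p", primary_fasta,   # promoters of the DE genes
        "--n", control_fasta,   # background promoters (the control set)
        "--m", motif_file,      # motif database in MEME format
        "--o", out_dir,         # output directory
    ]

# Hypothetical file names, for illustration only.
cmd = build_sea_command(
    "de_promoters.fa", "background_promoters.fa", "motifs.meme", "sea_out"
)
print(shlex.join(cmd))

# To actually run it (requires MEME Suite on the PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```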


Personally, I would use the promoters from the genes that were not differentially expressed but ARE expressed in the model of interest. Depending on the treatment, this should give ~3000-7000 genes.
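A rough sketch of that selection, assuming the differential expression results are available as rows with a mean expression value, a log2 fold change and an adjusted p-value. The field names and all three cutoffs are hypothetical and should be tuned to the experiment.

```python
def background_genes(results, min_mean=10.0, min_padj=0.5, max_abs_lfc=0.25):
    """Pick genes that are expressed but show no sign of differential expression."""
    keep = []
    for row in results:
        padj = row["padj"]
        if padj is None:
            # No adjusted p-value (e.g. filtered for low counts): skip.
            continue
        expressed = row["mean_expr"] >= min_mean
        unchanged = padj >= min_padj and abs(row["log2fc"]) <= max_abs_lfc
        if expressed and unchanged:
            keep.append(row["gene"])
    return keep

# Toy results table (hypothetical genes and values).
results = [
    {"gene": "A", "mean_expr": 250.0, "log2fc": 0.05, "padj": 0.91},    # expressed, unchanged
    {"gene": "B", "mean_expr": 300.0, "log2fc": 2.10, "padj": 0.0001},  # differentially expressed
    {"gene": "C", "mean_expr": 0.5, "log2fc": 0.00, "padj": None},      # not expressed
    {"gene": "D", "mean_expr": 80.0, "log2fc": -0.10, "padj": 0.60},    # expressed, unchanged
    {"gene": "E", "mean_expr": 120.0, "log2fc": 0.90, "padj": 0.20},    # borderline, excluded
]
bg = background_genes(results)
print(bg)
```

Excluding borderline genes (such as "E" above) keeps likely-but-undetected DE genes out of the background. The promoters of the selected genes would then serve as the control sequences.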
