Hello,
I have association study results between proteins from the SomaScan assay and cognitive ability. I want to perform an enrichment analysis and I am a bit uncertain which background to use. I have about 800 hits after Bonferroni correction out of the 7200 total proteins in my data.

When I set the background to all proteins used in the assay (including the significant hits) after quality-control filtering (e.g. removing mouse proteins and spuriomers), nothing remains enriched after Bonferroni adjustment, although plenty of terms have p < 0.05 without correction for multiple testing.

When I do not set a background gene set, I get a lot of enrichment after correction for multiple testing. The enriched biological processes are similar to the ones I get with the assay background, which do not survive multiple-testing correction. I would appreciate any advice.
Example code:
library(clusterProfiler)
library(org.Hs.eg.db)

go_enrich <- enrichGO(
  gene          = candidate_genes_map$UNIPROT,   # significant hits
  universe      = background_genes_map$UNIPROT,  # all assay proteins passing QC
  OrgDb         = org.Hs.eg.db,                  # human gene annotations
  keyType       = "UNIPROT",                     # input IDs are UniProt accessions
  ont           = "BP",                          # Biological Process ontology
  pvalueCutoff  = 0.05,                          # cutoff for reported terms
  pAdjustMethod = "BH"                           # Benjamini-Hochberg adjustment
)
Many thanks!
I would include the background to protect against "self-enrichment", i.e. processes that are already over-represented in your input before any DE analysis. I personally ignore FDR in ORA enrichment analysis most of the time, for the simple reason that databases such as GO and Reactome can be so large that the multiple-testing burden kills everything. I look at raw p-values and coverage and then interpret the results based on those. ORA never proves anything; it is hypothesis generation to follow up on, so it is fine to do it that way.
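To make the role of the background concrete, here is a rough sketch of the hypergeometric test that ORA is generally based on, for a single term. Only N (7200 assay proteins) and n (800 hits) come from the question. K, k and n_terms are made-up numbers for illustration, and this is not meant to reproduce exactly what enrichGO does internally.

```r
# ORA for one term as a hypergeometric test against the assay background.
N <- 7200      # background: proteins on the assay after QC (from the question)
n <- 800       # significant hits (from the question)
K <- 90        # HYPOTHETICAL: background proteins annotated to the term
k <- 18        # HYPOTHETICAL: hits annotated to the term

# Raw over-representation p-value: P(X >= k) when drawing n proteins from N
p_raw <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)

# Coverage: fraction of the term's background proteins that are hits
coverage <- k / K

# Fold enrichment: observed hit fraction relative to the background fraction
fold <- (k / n) / (K / N)

# Multiple-testing burden: Bonferroni over a HYPOTHETICAL number of tested terms
n_terms <- 5000
p_bonf <- min(1, p_raw * n_terms)

print(c(p_raw = p_raw, coverage = coverage, fold = fold, p_bonf = p_bonf))
```

Without an explicit background, N and K are taken from a much larger annotated gene set instead of the assay, so K / N shrinks for any process the panel is already biased towards and the same k looks far more significant. That is the "self-enrichment" mentioned above.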