Can my input gene list for enrichGO analysis contain both up- and downregulated genes?
I think this depends on your biological question. Is it relevant for you to find enriched pathways in genes "altered" regardless of the direction of change in your experiment? If it is, go for it.
You can try both approaches and compare the results. (You'll probably get more results by merging the up- and downregulated genes, because the larger list gives more statistical power, and maybe that's why you're asking. But the decision should ideally be made for biological reasons.)
Thank you for the help! I wasn't sure whether it would be correct to give a merged list of up- and downregulated genes as input, because of the way the enrichGO function calculates its scores. I know this is a basic question, but I'm a total beginner in this field. Thanks a lot!
AFAIK enrichGO basically runs a hypergeometric-type enrichment test, so the only parameters that matter when testing whether your genes are enriched for a pathway are counts: the number of genes in your list, how many of those belong to the pathway, the size of your total universe of measured genes, and the number of genes in the pathway. The direction of change does not enter the calculation.
There are other gene set analyses that do take directionality into account, such as GSEA. For these, one generally inputs the genes together with some measure of the amount and direction of change, such as logFC, a test statistic, or a combination of the two. This review is a bit outdated, but it can help you understand the typical variations of gene set enrichment analysis. enrichGO would fall under the "ORA" (over-representation analysis) category.
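To make the point about counts concrete, here is a minimal sketch of the upper-tail hypergeometric test that ORA tools are generally based on. It is not enrichGO's actual code; the function name `ora_pvalue` and all the numbers in the example call are made up for illustration.

```python
from math import comb


def ora_pvalue(universe_size, pathway_size, list_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe_size: number of genes measured/tested in the experiment
    pathway_size:  number of universe genes annotated to the pathway
    list_size:     number of genes in the input list (e.g. DE genes)
    overlap:       number of input genes that belong to the pathway
    """
    max_overlap = min(pathway_size, list_size)
    total = comb(universe_size, list_size)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 15000 tested genes, a 200-gene pathway,
# 500 genes in the input list, 20 of which are in the pathway.
p = ora_pvalue(universe_size=15000, pathway_size=200, list_size=500, overlap=20)
print(f"p = {p:.3g}")
```

Note that nothing in the function knows whether a gene went up or down; only the four counts matter, which is why a merged list is a valid input.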
I will build on barix's question. I would like to know which is the correct universe of genes to use in case of:
1) enrichment of genes altered regardless of the direction of change: is it all genes used as input for statistical testing by the DESeq2 pipeline?
2) enrichment of genes altered in a particular direction: is it all genes used as input for statistical testing by the DESeq2 pipeline, or only those for which statistical significance was found?
Personally, I would use all genes used as input for statistical testing as the universe in both cases.
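As a rough illustration of why the choice of universe matters, the toy sketch below runs the same hypergeometric-style test with two different universes. Everything here is hypothetical: the gene names, the pathway, the up/down lists, and the helper `ora_pvalue_sets` are invented for the example and are not part of DESeq2 or enrichGO.

```python
from math import comb


def ora_pvalue_sets(gene_list, pathway, universe):
    """Upper-tail hypergeometric p-value computed from gene sets.

    Genes outside the universe are dropped from both the list and the pathway.
    """
    genes = gene_list & universe
    path = pathway & universe
    overlap = len(genes & path)
    n_univ, n_path, n_list = len(universe), len(path), len(genes)
    total = comb(n_univ, n_list)
    tail = sum(
        comb(n_path, k) * comb(n_univ - n_path, n_list - k)
        for k in range(overlap, min(n_path, n_list) + 1)
    )
    return tail / total


# Toy data: 20 tested genes, a 5-gene pathway, 3 up and 3 down genes.
tested = {f"gene{i}" for i in range(1, 21)}
pathway = {f"gene{i}" for i in range(1, 6)}
up = {"gene1", "gene2", "gene3"}
down = {"gene4", "gene10", "gene11"}

# Universe = all genes that went into statistical testing.
p_up_full = ora_pvalue_sets(up, pathway, tested)
p_merged_full = ora_pvalue_sets(up | down, pathway, tested)

# Universe = only the significant genes (up + down).
p_up_sig_only = ora_pvalue_sets(up, pathway, up | down)

print(f"up genes, all tested genes as universe:    {p_up_full:.4f}")
print(f"merged list, all tested genes as universe: {p_merged_full:.4f}")
print(f"up genes, significant genes as universe:   {p_up_sig_only:.4f}")
```

In this toy data, shrinking the universe to the significant genes makes the same overlap look much less surprising, because the background is then already dominated by pathway genes. That fits the suggestion above to use all tested genes as the universe in both cases.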