Question: Hypergeometric test - Defining the gene universe
RossCampbell (USA/Frederick) wrote, 19 months ago:

When doing a hypergeometric test for pathway enrichment, is there a generally accepted way of defining the total "gene universe"? I am debating two possible numbers: 1) the number of probes on the microarray that was used to generate the data in the first place, and 2) the total number of genes in the model organism used. Any thoughts on the most appropriate approach?


I would recommend using all genes that are detected in at least one of your samples (microarray or RNA-seq) as the universe.

Comment written 19 months ago by EagleEye
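To make the "detected in at least one sample" idea concrete, here is a minimal sketch. The gene names, expression values, and the detection cutoff are all made up for illustration; the right cutoff depends on your platform and normalization.

```python
# Hypothetical expression values: gene -> one value per sample.
expression = {
    "GENE_A": [0.0, 5.2, 3.1],
    "GENE_B": [0.0, 0.0, 0.0],
    "GENE_C": [1.4, 0.0, 0.0],
    "GENE_D": [0.2, 0.1, 0.0],
}

# Assumed cutoff, not a recommendation; pick one suited to your data.
DETECTION_THRESHOLD = 1.0


def detected_universe(expr, threshold):
    """Return the genes whose value exceeds the threshold in at least one sample."""
    return {
        gene
        for gene, values in expr.items()
        if any(value > threshold for value in values)
    }


universe = detected_universe(expression, DETECTION_THRESHOLD)
print(sorted(universe))
```

The size of this set would then serve as the universe size in the hypergeometric test, and pathway gene lists would be intersected with it before counting.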
Renesh (United States) wrote, 19 months ago:

In enrichment analysis, choosing the right background gene set is critical. Differences in the background directly affect the statistical significance (P-values) and, ultimately, the biological inference.

If you use all genes in the genome as the background, you will get highly significant P-values. If you instead use only the genes that make up your pathway categories, the results will be more robust and reliable.

So if you are using a microarray for your analysis, use only the genes represented on the microarray chip as your background. Using all genes from the whole genome as the reference background is not recommended: genes that are absent from the chip could never have been selected, so counting them enlarges the universe and makes the P-values look more significant than they should.
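A rough sketch of how much the universe size can matter, using only the Python standard library. All counts below (universe sizes, pathway size, list size, overlap) are invented for illustration, and the sketch assumes every pathway gene is present on the array; in practice you would also restrict the pathway to genes in the universe.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, pathway_size, selected_size, overlap):
    """Upper-tail hypergeometric P-value, P(X >= overlap).

    X is the number of pathway genes among selected_size genes drawn without
    replacement from a universe that contains pathway_size pathway genes.
    """
    max_overlap = min(pathway_size, selected_size)
    total = comb(universe_size, selected_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, selected_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers: 200 selected genes, 5 of them in a 100-gene pathway.
p_genome = hypergeom_enrichment_p(20000, 100, 200, 5)  # whole-genome universe
p_array = hypergeom_enrichment_p(8000, 100, 200, 5)    # array-only universe

print(f"whole-genome universe: P = {p_genome:.3g}")
print(f"array-only universe:   P = {p_array:.3g}")
```

With the same overlap, the larger universe gives the smaller P-value, because the expected overlap by chance drops from 2.5 genes (array) to 1 gene (genome).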
