Question

Enrichment in bacterial core genes

8.7 years ago

James Chan ▴ 20

Hi,

I have two bacterial genomes, A and B, with 4,000 and 4,100 genes, respectively. After homology clustering, I found 1,200 core genes present in both genomes. I am interested in which GO terms are enriched among the core genes of genomes A and B. If I were to calculate this manually, using Fisher's exact test on a 2x2 contingency table, how would I do it? Let's say I want to test "Iron transport". As far as I understand Fisher's exact test, the contingency table should look like this (for genome A):

                      Genes annotated with "Iron transport"      Genes not annotated with "Iron transport"
Core genes            50                                         1,150
Non-core genes        200                                        2,600
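
For concreteness, here is a minimal sketch of how the manual calculation could be done on this table in plain Python. The function name `fisher_exact_greater` is hypothetical, and the one-sided "greater" alternative is an assumption on my part (it tests for enrichment only); the p-value is the upper tail of the hypergeometric distribution implied by the table margins.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    a: core genes with the term        b: core genes without the term
    c: non-core genes with the term    d: non-core genes without the term
    (Hypothetical helper; "greater" means testing for enrichment in core genes.)
    """
    total = a + b + c + d      # all genes in the genome
    annotated = a + c          # genes carrying the GO term
    core = a + b               # core genes
    # Hypergeometric upper tail: P(X >= a) when `core` genes are drawn
    # without replacement from `total` genes, `annotated` of which carry the term.
    upper = min(annotated, core)
    tail = sum(
        comb(annotated, k) * comb(total - annotated, core - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(total, core)


# Counts from the genome A table above
p_value = fisher_exact_greater(50, 1150, 200, 2600)
odds_ratio = (50 * 2600) / (1150 * 200)
print(f"odds ratio = {odds_ratio:.3f}, one-sided p = {p_value:.4g}")
```

An odds ratio below 1 would mean the term is less frequent among core genes than among non-core genes, in which case a one-sided enrichment test is not expected to give a small p-value; a two-sided test would also count depletion.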

I don't know whether the table above is a correct 2x2 contingency table for testing the GO term (splitting the gene set into core versus non-core genes), because other posts suggest using the whole gene set as the background (that is, core genes versus all genes in the genome). I don't know which approach is the correct one. I am confused, since I am new to statistics and gene enrichment analysis.
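
To make the two framings concrete, here is a small sketch. It assumes that "whole gene set as background" means using the genome-wide totals as the margins of the table; that reading is my assumption, and the variable names are only illustrative. Under that reading, the background counts determine the same four cells as the core versus non-core split:

```python
# Background-style counts for genome A (taken from the table in this post)
genome_total = 4000      # all genes in genome A
core_total = 1200        # core genes
annotated_total = 250    # genes annotated with "Iron transport" (50 + 200)
core_annotated = 50      # core genes annotated with "Iron transport"

# Derive the four cells of the core vs non-core table from those totals
core_not_annotated = core_total - core_annotated
noncore_annotated = annotated_total - core_annotated
noncore_not_annotated = (genome_total - core_total) - noncore_annotated

table = [
    [core_annotated, core_not_annotated],
    [noncore_annotated, noncore_not_annotated],
]
print(table)  # [[50, 1150], [200, 2600]]
```

If the whole genome were instead entered as the second row, core genes would be counted in both rows; as far as I understand, Fisher's exact test assumes each gene falls into exactly one cell.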

I would appreciate it if anyone could enlighten me on this question. Thanks in advance.

bacterial-core gene-enrichment • 1.7k views

updated 19 months ago by Ram 43k • written 8.7 years ago by James Chan ▴ 20