Question: GO Analysis Clarification using GOstats using hyperGTest()
8 months ago by
brismiller10
Bellingham, WA, USA
brismiller10 wrote:

Hey everybody,

I have a question about how hyperGTest() handles the GO terms it tests. In one of my results tables, some of the returned GO terms are not in my GO universe (the GO term is not in my organism's obo file).

For example, this GO term was reported as significantly enriched, but if I look in the universe it is not there:

"GO:0008483" %in% GO_Tet_universe$frame.go_id

[1] FALSE

Finally, you should know that I am using the GO annotation file for Tetrahymena thermophila SB210, which presumably does not have every GO term annotated.

My question is: where are these GO terms coming from, and how are they being called enriched when no genes are known for that term? From my understanding, all parent GO terms have all the genes of their children. Is that why these terms are enriched, because their child GO terms are enriched?

First time doing a GO analysis, any type of help would be great
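The propagation described in the question can be illustrated with a minimal base R sketch. It assumes the analysis followed the GOstats workflow for unsupported organisms, where GOAllFrame() is documented to expand a GOFrame so that each term also carries the genes of its descendants. The term IDs, gene IDs, and universe size below are invented for illustration and do not come from the Tetrahymena annotation.

```r
# Toy GO DAG: each term maps to its direct parents (hypothetical IDs).
parents <- list(
  "GO:C" = "GO:B",          # GO:C is_a GO:B
  "GO:B" = "GO:A",          # GO:B is_a GO:A
  "GO:A" = character(0)     # root term
)

# Direct annotations only, as they would appear in an annotation file.
direct <- data.frame(
  go_id   = c("GO:C", "GO:C", "GO:C", "GO:A"),
  gene_id = c("g1", "g2", "g3", "g9"),
  stringsAsFactors = FALSE
)

# Collect all ancestors of a term by walking up the DAG recursively.
ancestors <- function(term) {
  p <- parents[[term]]
  if (length(p) == 0) return(character(0))
  unique(c(p, unlist(lapply(p, ancestors))))
}

# Copy every annotation to the term itself and to all of its ancestors.
expanded <- unique(do.call(rbind, lapply(seq_len(nrow(direct)), function(i) {
  data.frame(go_id   = c(direct$go_id[i], ancestors(direct$go_id[i])),
             gene_id = direct$gene_id[i],
             stringsAsFactors = FALSE)
})))

"GO:B" %in% direct$go_id     # FALSE: never annotated directly
"GO:B" %in% expanded$go_id   # TRUE: inherits g1, g2, g3 from GO:C

# One-sided hypergeometric test for GO:B on a made-up universe of 20 genes.
universe   <- paste0("g", 1:20)
selected   <- c("g1", "g2", "g3", "g4")
term_genes <- expanded$gene_id[expanded$go_id == "GO:B"]
overlap    <- length(intersect(selected, term_genes))
p_value <- phyper(overlap - 1,
                  length(term_genes),
                  length(universe) - length(term_genes),
                  length(selected),
                  lower.tail = FALSE)
```

In this toy case GO:B is absent from the direct annotation table yet can still be tested, because its gene set is built from its child GO:C. If the real analysis behaves the same way, checking `%in%` against the direct annotation frame would return FALSE for such inherited terms.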

go rna-seq gene ontology • 377 views
8 months ago by
EagleEye6.2k
Sweden
EagleEye6.2k wrote:

It looks like geneOntology.org no longer supports “Tetrahymena thermophila” specifically, but it uses “jcvi” (a multispecies microbial annotation). Try GeneSCF, which can use the current annotation from geneOntology.org.

Step 1: Preparing the database for your organism

./prepare_database -org=jcvi -db=GO

Step 2: Performing enrichment analysis for your list of genes

./geneSCF -m=normal -i=INPUTgene.list -t=gid -db=GO_BP -o=/ExistingOUTPUTfolder/ -org=jcvi --plot=yes --background=15000

Yes, there is no Tetrahymena-specific file, but the file used to generate the universe for my analysis was downloaded from geneontology.org's annotation download page with the filter "+ taxon_subset_closure_label: Tetrahymena thermophila SB210", which gave all 34679 Tetrahymena annotations. Would this be any different from what you proposed above?

written 8 months ago by brismiller10
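For reference, here is a hedged sketch of how a downloaded annotation file could be turned into the three-column table (GO ID, evidence code, gene ID) that the GOstats unsupported-organism workflow passes to GOFrame(). It assumes the download is in GAF 2.x format (tab-delimited, comment lines starting with "!", DB Object ID in column 2, GO ID in column 5, evidence code in column 7). The actual download may use a different layout, and the two rows below are invented.

```r
# Two made-up GAF-style rows (17 tab-separated fields each) plus a header comment.
gaf_text <- c(
  "!gaf-version: 2.1",
  paste("TGD", "TTHERM_A", "GENE_A", "", "GO:0003824", "PMID:0", "IEA", "", "F",
        "", "", "protein", "taxon:0", "20200101", "TGD", "", "", sep = "\t"),
  paste("TGD", "TTHERM_B", "GENE_B", "", "GO:0004069", "PMID:0", "IEA", "", "F",
        "", "", "protein", "taxon:0", "20200101", "TGD", "", "", sep = "\t")
)

# For a real file, replace `text = gaf_text` with the file path.
gaf <- read.delim(text = gaf_text, header = FALSE, comment.char = "!",
                  quote = "", stringsAsFactors = FALSE)

# Keep only the columns needed for the GO-to-gene table.
goframeData <- data.frame(go_id    = gaf$V5,
                          Evidence = gaf$V7,
                          gene_id  = gaf$V2,
                          stringsAsFactors = FALSE)

# A term with no direct annotation is absent from this table.
"GO:0008483" %in% goframeData$go_id
```

A table built this way holds direct annotations only, so it is a useful baseline for comparing against whatever term list a tool reports.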

I suggested checking with a different tool because it is better to verify whether you get similar issues with other tools too. That way you will know whether GOstats has a problem processing this annotation (a GO term with no genes should not appear in the enriched list). Personally, I always like to verify my results with more than one tool.

written 8 months ago by EagleEye6.2k
8 months ago by
EagleEye6.2k
Sweden
EagleEye6.2k wrote:

Hi, I quickly checked with GeneSCF, and the term you specified is present in 'jcvi' (the result for the term is below). My guess (I am not completely sure) is that the problem lies in how GOstats processes the annotation: it looks like GOstats tries to convert your gene names to Entrez IDs (or some other numeric representation) and then maps them to an annotation that has no Entrez ID support. That would be why there are no genes in the annotation.

GO:0008483 result from GO molecular function using GeneSCF prepare_database:

GO:0008483~transaminase activity    BA_1341,BA_2294,BA_2737,BA_2899,BA_3062,BA_3312,BA_3886,BA_4225,BA_4254,BA_4626,BA_4663,BA_4900,BA_5133,BA_5138,CHY_0011,CHY_1173,CHY_1436,CJE_0146,CJE_0882,CJE_1486,CJE_1514,CPF_0060,CPF_0325,CPF_0356,CPF_0707,CPF_0845,CPF_0911,CPF_1258,CPF_1623,CPF_1667,CPF_1720,CPF_2163,CPF_2212,CPS_0838,CPS_2054,CPS_2190,CPS_3232,CPS_3390,CPS_4612,CPS_4663,CPS_4878,DET_0576,DET_0739,GSU_0018,GSU_0084,GSU_0117,GSU_0162,GSU_1868,HNE_0095,HNE_0652,HNE_0889,HNE_1171,HNE_2243,HNE_2311,HNE_2357,HNE_2367,HNE_2507,HNE_2588,HNE_2594,LMOf2365_0306,LMOf2365_1615,LMOf2365_2132,LMOf2365_2341,MCA_0399,MCA_0598,MCA_1021,MCA_1491,MCA_2053,MCA_2125,MCA_2288,MCA_2997,PFL_0306,PFL_0754,PFL_1309,PFL_1609,PFL_1655,PFL_1824,PFL_1867,PFL_2045,PFL_2138,PFL_2406,PFL_2461,PFL_2868,PFL_3043,PFL_3219,PFL_3222,PFL_3354,PFL_3470,PFL_3521,PFL_4112,PFL_4152,PFL_4247,PFL_4362,PFL_4578,PFL_4657,PFL_4884,PFL_4949,PFL_5269,PFL_5681,PFL_5927,PFL_5960,PFL_6043,PSPTO_0096,PSPTO_1072,PSPTO_1440,PSPTO_1531,PSPTO_1779,PSPTO_1920,PSPTO_2136,PSPTO_5395,PSPPH_0218,PSPPH_0459,PSPPH_0862,PSPPH_1325,PSPPH_1931,PSPPH_2983,PSPPH_3631,PSPPH_4896,PSPPH_5053,SO_2483,SO_3497,SO_3789,SO_4343,SPO_0388,SPO_1136,SPO_1166,SPO_1295,SPO_1370,SPO_1401,SPO_1468,SPO_1567,SPO_1697,SPO_1916,SPO_2005,SPO_2024,SPO_2144,SPO_2589,SPO_2795,SPO_3027,SPO_3220,SPO_3230,SPO_3417,SPO_3471,SPO_A0113,SPO_A0352,SPO_A0354,VC_0392,VC_0748,VC_1184,VC_1625,VC_2309,VC_A0513,VC_A0523,VC_A0605,VC_A0824
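One quick way to test the ID-mismatch guess is to check how many of the input gene IDs actually occur in the annotation. The sketch below uses base R only; the vectors are invented placeholders and would be replaced by the real gene list and the gene ID column of the annotation table.

```r
# Hypothetical gene IDs present in the annotation.
annotation_ids <- c("TTHERM_A", "TTHERM_B", "TTHERM_C")

# Hypothetical input list; the last entry mimics a numeric (Entrez-like) ID.
selected_ids <- c("TTHERM_A", "TTHERM_C", "12345")

in_annotation <- selected_ids %in% annotation_ids
n_matched     <- sum(in_annotation)            # IDs found in the annotation
unmatched     <- selected_ids[!in_annotation]  # IDs that would be dropped silently
```

If most of the real IDs end up in `unmatched`, the gene list and the annotation use different identifier types, and that needs to be fixed before any enrichment result is interpreted.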