Dear all,
I am struggling with the statistical tests in topGO.
I have searched the net and asked my colleagues, but this kind of error doesn't seem to occur often, so it's not really documented:
This test runs fine:
resultFisher_FAvsBHI_MF=runTest(GOdata_FAvsBHI_MF, algorithm = "classic", statistic = "fisher")
whereas I get the same error message for the other two:
resultWeigt01_FAvsBHI_MF=runTest(GOdata_FAvsBHI_MF, algorithm = "weight01", statistic = "t")
resultKS_elim_FAvsBHI_MF=runTest(GOdata_FAvsBHI_MF, algorithm = "elim", statistic = "ks")
" -- Weight01 Algorithm --
the algorithm is scoring 986 nontrivial nodes
parameters:
test statistic: t
score order: increasing
Level 14: 3 nodes to be scored (0 eliminated genes)
Level 13: 8 nodes to be scored (0 eliminated genes)
Level 12: 10 nodes to be scored (10 eliminated genes)
Level 11: 8 nodes to be scored (14 eliminated genes)
Level 10: 17 nodes to be scored (20 eliminated genes)
Level 9: 45 nodes to be scored (34 eliminated genes)
Level 8: 78 nodes to be scored (52 eliminated genes)
Level 7: 149 nodes to be scored (276 eliminated genes)
Level 6: 338 nodes to be scored (345 eliminated genes)
Level 5: 175 nodes to be scored (524 eliminated genes)
Level 4: 112 nodes to be scored (868 eliminated genes)
Level 3: 34 nodes to be scored (1124 eliminated genes)
Level 2: 8 nodes to be scored (1197 eliminated genes)
Level 1: 1 nodes to be scored (1394 eliminated genes)
Error in t.test.default(x = x.G, y = x.NotG, var.equal = var.equal, alternative = ifelse(aa, : not enough 'y' observations"
I also noticed that some people perform all these tests with Fisher (maybe to avoid this kind of problem?) but didn't explain their choice.
It looks like it could be a p-value format error, since the Fisher test runs but the other two do not. So I've checked that there are no NA values (they are set to 1) and also tested with and without scientific notation.
Please let me know if you need more details,
Thanks for your help,
Best regards,
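A note on the error message itself, offered as a guess rather than a diagnosis: t.test() raises "not enough 'y' observations" when its second sample is empty (or has a single value under the Welch setting). In the log the failure comes right after Level 1, which holds one node; if that node is the ontology root, every gene is annotated to it and the "not in the term" group (x.NotG in the call shown) would be empty. The base-R sketch below reproduces the message with made-up scores. The variable names and data are illustrative assumptions, and it does not call topGO.

```r
# Hypothetical scores for illustration only (not the poster's data).
set.seed(1)
scores <- runif(20)

# Assumption: at the root of the GO graph every gene is annotated to the term,
# so the "not in the term" group has no members.
in_term <- rep(TRUE, length(scores))
x_in  <- scores[in_term]
x_out <- scores[!in_term]   # numeric(0)

# Capture the error message instead of stopping the script.
res <- tryCatch(
  t.test(x = x_in, y = x_out, var.equal = FALSE),
  error = function(e) conditionMessage(e)
)
print(res)
# [1] "not enough 'y' observations"
```

If this is what happens, the message would be about group sizes at one node rather than about how the p-values are formatted.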
...but how many input genes are in GOdata_FAvsBHI_MF?
2808 genes, it should be fine, shouldn't it?
Should be, yes. Can you show the other commands that you ran before runTest?
Here is how I typically run GO enrichment via topGO (I had conveniently written this for a beginner, recently):
In fact we do roughly the same thing, just with another database that didn't exist per se, which is why I had to merge 2 tables in order to have a database to work with.
I think the enrichment itself is not bad in the end; it looks more like a problem with the statistical test?
There are many different combinations of algorithm and statistic, and I admit that I have not fully explored these. Is the problem that your enrichment p-values are not reaching statistical significance (p<0.05)?
Sorry I couldn't answer yesterday because I am new here and thus not allowed to post more than 6 messages per day...
Right, it is not clear to me which statistic should be used. The vignette says that KS and Fisher can be performed in that case, and it works with my data if I change the algorithm (weight01 or elim), but only if I choose the Fisher statistic, not t or KS. Depending on the algorithm I use, I get from 6 to 20 significant p-values (< 0.05).
Each will have advantages in different contexts, just like p-value adjustment methods. If you are unsure, then just use the default values.
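To make the difference between the statistics concrete: as far as I understand the topGO vignette, Fisher works on gene counts (genes are first split into significant and not significant), while KS and t work on the per-gene scores themselves. The base-R sketch below shows the two kinds of input on simulated data. The gene names, the 0.05 cutoff, and the term membership are hypothetical, and this is not how topGO implements its tests internally.

```r
# Hypothetical per-gene p-values, e.g. from a differential expression analysis.
set.seed(42)
n_genes <- 200
pvals <- runif(n_genes)
names(pvals) <- paste0("gene", seq_len(n_genes))

# Hypothetical GO term annotating the first 30 genes; shrink their p-values
# so that the term is enriched for small scores.
in_term <- seq_len(n_genes) <= 30
pvals[in_term] <- pvals[in_term] / 20

# Count-based test: threshold the scores, then test the 2x2 table.
significant <- pvals < 0.05
tab <- table(in_term = in_term, significant = significant)
fisher_p <- fisher.test(tab, alternative = "greater")$p.value

# Score-based test: compare score distributions inside vs outside the term.
# alternative = "greater" asks whether in-term scores tend to be smaller.
ks_p <- ks.test(pvals[in_term], pvals[!in_term],
                alternative = "greater")$p.value

print(tab)
print(c(fisher = fisher_p, ks = ks_p))
```

The count-based test only needs to know which genes pass the cutoff, whereas the score-based test needs a numeric score for every gene, which may be why the two can behave differently on the same object.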
Thanks for your script Kevin, I'll study it now to try to find where the problem can be with mine ...
Here is the script I've used:
Sorry but the font is really erratic...
I have tidied it via the 101 010 button.
Thanks for the trick!