Publications To Understand Statistics That Could Be Used For Finding Over- And Under-Represented Terms From An Annotation
13.4 years ago
Pradeep ▴ 70

I have a curated list of proteins. Each of these proteins is annotated with several keywords.

Could anyone in the community suggest papers or books that describe, in an accessible way, statistical methods for finding over-represented and under-represented keywords associated with a list of proteins? Any material (pseudocode, APIs, or libraries) that would help implement this in Java would also be welcome.

Thank you, --Pradeep

protein gene java • 2.5k views
13.4 years ago
User 59 13k

You could also check this Bioinformatics paper:

Motivation: A number of available program packages determine the significant enrichments and/or depletions of GO categories among a class of genes of interest. Whereas a correct formulation of the problem leads to a single exact null distribution, these GO tools use a large variety of statistical tests whose denominations often do not clarify the underlying P-value computations.

Summary: We review the different formulations of the problem and the tests they lead to: the binomial, χ2, equality of two probabilities, Fisher's exact and hypergeometric tests. We clarify the relationships existing between these tests, in particular the equivalence between the hypergeometric test and Fisher's exact test. We recall that the other tests are valid only for large samples, the test of equality of two probabilities and the χ2-test being equivalent. We discuss the appropriateness of one- and two-sided P-values, as well as some discreteness and conservatism issues.
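Since the question asks about Java, here is a minimal sketch of the one-sided hypergeometric test that the abstract describes as equivalent to Fisher's exact test. It is not taken from the paper. The class and method names are made up for illustration, and it assumes the background population is the full set of annotated proteins from which your list was drawn. With N proteins in the background, K of them carrying the keyword, n proteins in your list and k of those carrying the keyword, the upper tail P(X >= k) measures over-representation and the lower tail P(X <= k) measures under-representation.

```java
// Hypothetical helper, not from the cited paper: one-sided hypergeometric
// P-values for a single keyword, using only the Java standard library.
final class KeywordEnrichment {

    // log(n!) by direct summation; adequate for backgrounds of a few
    // thousand proteins (cache the values if you test many keywords).
    static double logFactorial(int n) {
        double sum = 0.0;
        for (int i = 2; i <= n; i++) {
            sum += Math.log(i);
        }
        return sum;
    }

    // log of the binomial coefficient C(n, k); caller guarantees 0 <= k <= n.
    static double logChoose(int n, int k) {
        return logFactorial(n) - logFactorial(k) - logFactorial(n - k);
    }

    // P(X = k): probability that exactly k of the n listed proteins carry
    // the keyword, given K keyword-carrying proteins among N in total.
    static double pmf(int k, int N, int K, int n) {
        int lowest = Math.max(0, n - (N - K));
        int highest = Math.min(n, K);
        if (k < lowest || k > highest) {
            return 0.0; // outside the support of the distribution
        }
        return Math.exp(logChoose(K, k) + logChoose(N - K, n - k) - logChoose(N, n));
    }

    // Upper tail P(X >= k): a small value suggests over-representation.
    static double overRepresentedP(int k, int N, int K, int n) {
        double p = 0.0;
        for (int i = k; i <= Math.min(n, K); i++) {
            p += pmf(i, N, K, n);
        }
        return Math.min(1.0, p); // guard against rounding slightly above 1
    }

    // Lower tail P(X <= k): a small value suggests under-representation.
    static double underRepresentedP(int k, int N, int K, int n) {
        double p = 0.0;
        for (int i = Math.max(0, n - (N - K)); i <= k; i++) {
            p += pmf(i, N, K, n);
        }
        return Math.min(1.0, p);
    }

    public static void main(String[] args) {
        // Made-up counts: 1000 background proteins, 50 with the keyword,
        // 40 proteins in the curated list, 8 of them with the keyword.
        int N = 1000, K = 50, n = 40, k = 8;
        System.out.println("over-represented P  = " + overRepresentedP(k, N, K, n));
        System.out.println("under-represented P = " + underRepresentedP(k, N, K, n));
    }
}
```

You would run this once per keyword. If you prefer a library, Apache Commons Math has a `HypergeometricDistribution` class that should give the same tail probabilities, though check its documentation for the exact parameter order.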

Interesting review! Thanks, Daniel.

13.4 years ago

FatiGO is a tool for detecting Gene Ontology enrichment in a list of gene IDs. You should read its publication and the references it cites.
