I have data that contains the occurrences of genes in different lineages:
    Lineage  Gene  Gene_function
    1        x     regulatory proteins
    1        p     cell wall
    1        y     conserved hypotheticals
    1        x     respiration
    1        z     respiration
    2        w     cell wall
    2        a     cell wall
    2        y     regulatory proteins
    3        b     respiration
    3        x     conserved hypotheticals
    3        a     regulatory proteins
    3        b     regulatory proteins
    3        z     conserved hypotheticals
    3        a     respiration
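In case it helps, here is a minimal sketch that recreates the example data in R and tabulates the counts I want to compare. The object names `dat` and `counts` are just placeholders I chose for illustration, and I am assuming the lineage labels can be treated as plain numbers.

```r
# Recreate the example data shown above (object names are placeholders)
dat <- data.frame(
  Lineage = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3),
  Gene = c("x", "p", "y", "x", "z", "w", "a", "y",
           "b", "x", "a", "b", "z", "a"),
  Gene_function = c(
    "regulatory proteins", "cell wall", "conserved hypotheticals",
    "respiration", "respiration",
    "cell wall", "cell wall", "regulatory proteins",
    "respiration", "conserved hypotheticals", "regulatory proteins",
    "regulatory proteins", "conserved hypotheticals", "respiration"
  ),
  stringsAsFactors = FALSE
)

# Number of genes of each function in each lineage
# (rows are lineages, columns are gene functions)
counts <- table(dat$Lineage, dat$Gene_function)
print(counts)
```

The "cell wall" column of `counts` holds the per-lineage numbers I would like to compare, bearing in mind that the lineages have 5, 3 and 6 rows respectively.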
How do I test whether the number of, say, "cell wall" genes differs significantly between the lineages? I'm thinking of something equivalent to a classic ANOVA, followed by a Tukey test to identify which specific lineages differ. N.B. each lineage has a different number of rows.
I would then like to repeat this for each of the gene functions.
Is there a simple and quick way to do this in R?