I have TRANSFAC data giving the number of transcription factor binding sites (TFBS) for each gene.
I have ~1100 transcription factors (TFs) and 2 sets of genes: 18 pigmentation genes and 5 housekeeping (HK) genes.
          Pigmentation genes           Housekeeping genes
          G1  G2  G3  G4  ...  G18     G1  G2  G3  G4  G5
TF1
TF2
...
TF1100
So I have the number of binding sites of each TF on each gene, and I want to find which of the 1100 TFs have more binding sites on pigmentation genes than on HK genes.
What statistical analysis should I use for such data?
The 18 pigmentation genes and the 5 HK genes are different genes, not replicates of one another, so I assume I cannot use a t-test. Is that right?
I also checked the distribution of binding-site counts for each TF within the pigmentation and HK groups, and only some are normally distributed, so I think I cannot use parametric tests.
Should I use Fisher's exact test (m x 2)? Which other tests could be used for such data?
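To make the setup concrete, here is a minimal sketch of one nonparametric option I could imagine: a one-sided exact permutation test per TF (feasible because there are only C(23, 5) = 33649 ways to split 23 genes into groups of 18 and 5), followed by a Benjamini-Hochberg adjustment across TFs. This is only an illustration under my own assumptions, not a claim that it is the right test; the function names and the counts in `example_counts` are hypothetical placeholders, not my real TRANSFAC data.

```python
from itertools import combinations


def exact_permutation_pvalue(pigment_counts, hk_counts):
    """One-sided exact permutation p-value for a single TF.

    Tests whether the mean binding-site count is higher in the
    pigmentation group than in the HK group by enumerating every
    possible assignment of genes to the HK group.
    """
    pooled = list(pigment_counts) + list(hk_counts)
    n_hk = len(hk_counts)
    n_pig = len(pooled) - n_hk
    total = sum(pooled)

    def mean_diff(hk_sum):
        # mean(pigmentation) - mean(HK) for a given HK-group sum
        return (total - hk_sum) / n_pig - hk_sum / n_hk

    observed = mean_diff(sum(hk_counts))
    extreme = 0
    n_splits = 0
    for hk_idx in combinations(range(len(pooled)), n_hk):
        hk_sum = sum(pooled[i] for i in hk_idx)
        n_splits += 1
        # small tolerance guards against floating-point ties
        if mean_diff(hk_sum) >= observed - 1e-12:
            extreme += 1
    return extreme / n_splits


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical placeholder counts: 18 pigmentation genes, 5 HK genes per TF.
example_counts = {
    "TF1": ([4, 6, 5, 7, 3, 8, 6, 5, 4, 7, 6, 5, 9, 4, 6, 5, 7, 6],
            [2, 3, 1, 4, 2]),
    "TF2": ([2, 3, 2, 1, 4, 3, 2, 2, 3, 1, 2, 4, 3, 2, 1, 3, 2, 2],
            [3, 2, 2, 4, 1]),
}

names = list(example_counts)
raw_p = [exact_permutation_pvalue(*example_counts[name]) for name in names]
adj_p = benjamini_hochberg(raw_p)
for name, p, q in zip(names, raw_p, adj_p):
    print(f"{name}: p = {p:.4g}, BH-adjusted p = {q:.4g}")
```

With 1100 TFs the pure-Python enumeration would be slow, so in practice a rank-based test from a statistics library might be more convenient; the sketch is only meant to show the shape of the comparison.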