I have a question regarding testing the independence of two categorical variables in biology. I’ll first explain it in biological terms and then more generally.
I have a list of downregulated loci and a list of binding events for a transcription factor (TF). How can I test whether my TF affects the regulation of these loci?
More generally, the genome is divided into windows, each of which is classified as yes or no for two categories. I want to test whether one category affects the other. However, in the contingency table, most of the windows will be no for both categories.
At first I thought I would treat this as a chi-squared test, but I always find that the null hypothesis of independence of the categories is rejected. I think this is because the no/no category is always very large. Here's a sample of what I see:
loci       no    yes
no    2510067   1070
yes      2736      4
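For reference, here is a minimal sketch of the kind of test I mean, written in Python with only the standard library. It computes the expected counts under independence, the Pearson chi-squared statistic (without continuity correction) and its p-value for 1 degree of freedom, the odds ratio, and, as a possible alternative, a one-sided Fisher exact p-value for the yes/yes cell. The variable names are my own, and which category sits on the rows versus the columns is an assumption that does not change any of these quantities.

```python
from fractions import Fraction
from math import comb, erfc, sqrt

# Observed counts copied from the table above.
# Assumption: rows are one category, columns the other; the tests
# below are symmetric, so the orientation does not matter.
table = [[2510067, 1070],
         [2736, 4]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Expected counts if the two categories were independent.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-squared statistic, no continuity correction.
chi2 = sum(
    (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2)
    for j in range(2)
)
# Survival function of the chi-squared distribution with 1 d.o.f.
p_chi2 = erfc(sqrt(chi2 / 2))

# Sample odds ratio of the 2x2 table.
odds_ratio = (table[0][0] * table[1][1]) / (table[0][1] * table[1][0])

# One-sided Fisher exact test: probability of seeing at least the
# observed yes/yes count under the hypergeometric null distribution.
k_obs = table[1][1]
yes_rows = row_totals[1]
yes_cols = col_totals[1]
denom = comb(n, yes_cols)
p_below = sum(
    Fraction(comb(yes_rows, k) * comb(n - yes_rows, yes_cols - k), denom)
    for k in range(k_obs)
)
p_fisher = float(1 - p_below)

print("expected yes/yes count:", expected[1][1])
print("chi-squared:", chi2, "p-value:", p_chi2)
print("odds ratio:", odds_ratio)
print("one-sided Fisher exact p-value:", p_fisher)
```

The expected yes/yes count is printed so that it can be compared with the usual rule of thumb that the chi-squared approximation needs expected counts of about 5 or more in every cell; when that fails, the exact test may be the safer choice.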
Any suggestions on how to deal with this?