If you're certain that
... you can run Fisher's exact test in R like so:
First, determine how many of the samples display each combination of mutations: (A, B), (A-mut, B), (A, B-mut), (A-mut, B-mut).
# rows are A (unmutated, then mutated), columns are B (unmutated, then mutated)
counts <- matrix(
  c(
    4500 - (90 + 550 + 810), # (A, B): 3050 with no mutation
    550,                     # (A-mut, B)
    810,                     # (A, B-mut)
    90                       # (A-mut, B-mut)
  ),
  nrow = 2
)
results <- fisher.test(counts)
Note that this is a two-tailed test: it asks whether there are either more or fewer comutated samples than you would expect if the two genes were mutated independently.
If you have prior reason to expect exclusivity between the two genes, you can do a one-tailed test:
fisher.test(counts, alternative = "less")
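To read the numbers off either test: `fisher.test` returns an object of class `"htest"`, so the p-value, odds ratio estimate and confidence interval can be pulled out by name. A minimal sketch, with the table rebuilt so the snippet runs on its own; the `wt`/`mut` labels and the names `two_sided`/`one_sided` are just my choices for illustration:

```r
# Same 2x2 table as above; rows are A, columns are B.
# The "wt"/"mut" dimnames are illustrative labels, not required by fisher.test.
counts <- matrix(
  c(3050, 550, 810, 90),
  nrow = 2,
  dimnames = list(A = c("wt", "mut"), B = c("wt", "mut"))
)

two_sided <- fisher.test(counts)
one_sided <- fisher.test(counts, alternative = "less")

two_sided$p.value   # two-tailed p-value
one_sided$p.value   # one-tailed p-value (odds ratio < 1, i.e. fewer comutations)
two_sided$estimate  # conditional MLE of the odds ratio; below 1 points towards exclusivity
two_sided$conf.int  # 95% confidence interval for the odds ratio
```

The odds ratio and its interval are worth reporting alongside the p-value, since they say how strong the depletion of comutated samples is, not just whether it is detectable.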
As a back-of-the-envelope check, you could reason as follows:
4500 samples, ~15% are A-mut, ~20% are B-mut, so if the two genes were mutated independently you'd expect around 3% of the samples to be comutated, but you've only observed 2%. To me, that gap doesn't sound large on its face, but with this many samples it is still pretty strong evidence.
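The same arithmetic can be done with the exact counts rather than rounded percentages. This is only a sketch of the independence calculation (the variable names are mine), not a replacement for the test:

```r
n     <- 4500
a_mut <- 550 + 90   # samples with A mutated, with or without B
b_mut <- 810 + 90   # samples with B mutated, with or without A

# Expected number of comutated samples if A and B were mutated independently:
# n * P(A-mut) * P(B-mut) = a_mut * b_mut / n
expected_both <- a_mut * b_mut / n
observed_both <- 90

expected_both       # 128
expected_both / n   # about 2.8% of samples
observed_both / n   # 2% of samples
```

So the comparison is roughly 128 expected versus 90 observed comutated samples.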
Are the samples all from the same dataset?
Thanks a lot for the reply and the clear explanation. So, if the p-value is less than 0.05, the two genes are considered mutually exclusive. Am I right?
No. A small p-value means there is evidence that they are mutually exclusive, under the many, many assumptions of the test.