I have the following data from a set of samples (healthy and diseased tissue from approximately 100 individuals). For each sample, I have:
- gene expression in healthy and diseased tissue (coded as 1 = healthy only, 2 = diseased only, 3 = healthy and diseased)
- for a set of mutations (M1, M2, M3, ...), whether the sample has each mutation or not (coded as 1 or 0).
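To make the layout concrete, here is a minimal sketch of how one record per sample could be represented. The names (`Sample`, `sample_id`, `expression`, `mutations`) and all values are made up for illustration and are not my real data.

```python
from dataclasses import dataclass
from typing import Dict

# Expression codes as described above.
EXPRESSION_LABELS = {
    1: "healthy only",
    2: "diseased only",
    3: "healthy and diseased",
}


@dataclass
class Sample:
    sample_id: str             # hypothetical identifier
    expression: int            # 1, 2 or 3 (see EXPRESSION_LABELS)
    mutations: Dict[str, int]  # mutation name -> 1 (present) or 0 (absent)

    def __post_init__(self):
        # Reject values outside the coding scheme.
        if self.expression not in EXPRESSION_LABELS:
            raise ValueError(f"unknown expression code: {self.expression}")
        for name, flag in self.mutations.items():
            if flag not in (0, 1):
                raise ValueError(f"{name} must be 0 or 1, got {flag}")


# Made-up example rows, not real measurements.
samples = [
    Sample("S001", 2, {"M1": 1, "M2": 0, "M3": 1}),
    Sample("S002", 3, {"M1": 0, "M2": 0, "M3": 1}),
    Sample("S003", 1, {"M1": 0, "M2": 1, "M3": 0}),
]

for s in samples:
    print(s.sample_id, EXPRESSION_LABELS[s.expression], s.mutations)
```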
I want to analyze the link between the mutation and the expression of the gene:
- for each mutation, is there a relation (positive or negative) between that particular mutation and gene expression in healthy/diseased tissue? For instance, does the mutation lead to increased (or decreased) expression in the diseased tissue?
- does a group of mutations together have an impact on gene expression?
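For the first question, one way to summarize the data for a single mutation is a 2 x 3 table of counts: mutation status (0/1) against expression code (1/2/3). The sketch below only builds that table and does not run any test. The helper name `cross_tab` and the records are hypothetical, made up for illustration.

```python
from collections import Counter

# Hypothetical records: (expression code, {mutation: 0/1}); values are made up.
records = [
    (1, {"M1": 0, "M2": 1}),
    (2, {"M1": 1, "M2": 1}),
    (2, {"M1": 1, "M2": 0}),
    (3, {"M1": 0, "M2": 0}),
    (3, {"M1": 1, "M2": 0}),
    (1, {"M1": 0, "M2": 0}),
]


def cross_tab(records, mutation):
    """Count samples for each (mutation status, expression code) pair."""
    counts = Counter((muts[mutation], expr) for expr, muts in records)
    # Fill in zero cells so the table always has all 2 x 3 entries.
    return {(m, e): counts.get((m, e), 0) for m in (0, 1) for e in (1, 2, 3)}


table = cross_tab(records, "M1")
for status in (0, 1):
    # One row per mutation status; columns are expression codes 1, 2, 3.
    row = [table[(status, e)] for e in (1, 2, 3)]
    print(f"M1={status}: {row}")
```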
What are the appropriate statistical tests for these two questions? For the first question, I was considering the Wilcoxon test or a paired t-test. Is that the right approach?
For the second question, I was considering logistic regression. Would that work?
Any advice or pointers would be greatly appreciated.
Thanks in advance.