9 months ago by
Republic of Ireland
can I consider those as a signature for the Cases_2 group?
Yes and no. It is 'tentative' evidence but, for now, it only shows that you have identified a group of genes whose expression, relative to controls, differs between Cases1 and Cases2. You will have to build more evidence to convince people that you have a 'signature', and you will need to say what it is a signature of: changes in expression due to the mutation? What thresholds did you use to call a gene statistically significantly differentially expressed?
It would be good to see Cases1 vs Cases2, just purely out of interest.
You should also construct a binomial logistic regression model in order to build further evidence. This would be of the form:
glm(mutation ~ gene, data = mydata, family = binomial(link = 'logit'))
mutation would be encoded as a binary (two-level) categorical variable,
gene would be a continuous variable of gene expression, and
mydata contains Cases1 and Cases2 combined. This would provide more convincing evidence that a gene's expression was altered based on mutation status, but still not direct evidence that the mutation is the cause.
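As a minimal sketch of how the logistic model could be fitted and read, assuming a data frame with a two-level factor and a numeric expression column: the data below are simulated purely for illustration, and the level names 'WT' and 'MUT' are hypothetical, not taken from your dataset.

```r
# Simulated, purely illustrative data. The contents of 'mydata' and the
# level names "WT"/"MUT" are assumptions, not values from the original post.
set.seed(1)
n <- 40
mydata <- data.frame(
  mutation = factor(rep(c("WT", "MUT"), each = n / 2), levels = c("WT", "MUT")),
  gene = c(rnorm(n / 2, mean = 5, sd = 1), rnorm(n / 2, mean = 6, sd = 1))
)

# Binomial logistic regression: mutation status modelled on expression.
# With a factor response, the first level ("WT") is the reference class.
fit <- glm(mutation ~ gene, data = mydata, family = binomial(link = "logit"))

# Coefficient table: estimate, standard error, z value and p-value
coefs <- summary(fit)$coefficients
print(coefs)

# Odds ratio for "MUT" per one-unit increase in (transformed) expression
odds_ratio <- exp(coef(fit)["gene"])
print(odds_ratio)
```

In a real analysis you would fit one such model per candidate gene and correct the resulting p-values for multiple testing.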
You could also perform a linear regression, but the interpretation changes slightly, because expression is now the outcome and mutation status the predictor:
lm(gene ~ mutation, data = mydata)
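A minimal sketch of the linear model, again on simulated data with hypothetical level names 'WT' and 'MUT'. With a single two-level predictor, the coefficient for the non-reference level is simply the difference in mean expression between the two groups.

```r
# Simulated, purely illustrative data. The contents of 'mydata' and the
# level names "WT"/"MUT" are assumptions, not values from the original post.
set.seed(1)
n <- 40
mydata <- data.frame(
  mutation = factor(rep(c("WT", "MUT"), each = n / 2), levels = c("WT", "MUT")),
  gene = c(rnorm(n / 2, mean = 5, sd = 1), rnorm(n / 2, mean = 6, sd = 1))
)

# Linear regression: expression modelled on mutation status
fit_lm <- lm(gene ~ mutation, data = mydata)

# Row "mutationMUT" holds the estimated difference in mean expression
# (MUT minus WT) with its standard error, t value and p-value
coefs_lm <- summary(fit_lm)$coefficients
print(coefs_lm)

# The same difference computed directly from the group means
group_means <- tapply(mydata$gene, mydata$mutation, mean)
mean_diff <- unname(group_means["MUT"] - group_means["WT"])
print(mean_diff)
```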
To use either regression model, the assumption is that your input RNA-seq data have been normalised to adjust for biasing factors, and also transformed via log2(CPM + 1), a variance-stabilising transformation, a regularised log transformation, or something similar.
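For illustration only, a log2(CPM + 1) transformation can be sketched in base R as below. The count matrix is made up, and this applies plain library-size scaling with no further normalisation; in practice you would more likely use established functions such as cpm() in edgeR, or vst() and rlog() in DESeq2.

```r
# Hypothetical raw counts: genes in rows, samples in columns.
# matrix() fills column-wise, so s1 = (10, 0, 90) and s2 = (200, 50, 750).
counts <- matrix(
  c(10, 0, 90, 200, 50, 750),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2"))
)

# Library size = total counts per sample
lib_size <- colSums(counts)

# Counts per million: divide each column by its library size, scale by 1e6
cpm <- sweep(counts, 2, lib_size, "/") * 1e6

# log2(CPM + 1): the +1 keeps zero counts finite (they map to 0)
log_cpm <- log2(cpm + 1)
print(log_cpm)
```

A row of log_cpm (one gene across samples) would then supply the gene column of mydata in the regression models.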