I'm trying to do a differential expression analysis on RNA-seq data. I have data from controls, patients with a mutation in gene A, and patients with a mutation in gene B. Within each patient group, the patients carry slightly different mutations in the respective gene, and this is known to cause slightly different phenotypes.
My overall aim is to see how genes are differentially expressed between the groups and within the mutation groups, given that each point mutation is different. Since this is real patient data, I am unable to obtain biological replicates for each specific mutation, so I've grouped the samples into controls, patients with mutations in gene A, and patients with mutations in gene B. I've accounted for condition and gender in my design.
However, MDS plots and PCA both show that the samples within each mutation group don't cluster together (which is to be expected, given that the mutations in each gene are slightly different). I wanted to run glmLRT() from edgeR to perform a DE analysis for Control vs Gene A, Control vs Gene B, and Gene B vs Gene A, but I'm not sure if this is the best way to find what I'm looking for.
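To make the setup concrete, here is a minimal sketch of the kind of design and contrasts I mean. It assumes a no-intercept `~ 0 + group + sex` model; the group sizes match my data, but the sex labels and the names `design` and `contrast_matrix` are placeholders for illustration only, not my real metadata. It uses base R only, and the edgeR calls are shown as comments.

```r
# Hypothetical sample sheet: 7 controls, 3 gene A, 4 gene B.
# The sex labels are placeholders, not real metadata.
group <- factor(c(rep("Control", 7), rep("GeneA", 3), rep("GeneB", 4)),
                levels = c("Control", "GeneA", "GeneB"))
sex <- factor(c("F", "M", "F", "M", "F", "M", "F",
                "M", "F", "M",
                "F", "M", "F", "M"))

# No-intercept parameterisation: one coefficient per group, plus a sex effect.
design <- model.matrix(~ 0 + group + sex)
colnames(design) <- c("Control", "GeneA", "GeneB", "sexM")

# One contrast vector per comparison, over the columns of `design`.
contrast_matrix <- cbind(
  AvsControl = c(-1,  1, 0, 0),
  BvsControl = c(-1,  0, 1, 0),
  BvsA       = c( 0, -1, 1, 0)
)
rownames(contrast_matrix) <- colnames(design)

# With edgeR, these would feed into the usual GLM workflow
# (y is a DGEList of counts, assumed to exist):
#   y   <- estimateDisp(y, design)
#   fit <- glmFit(y, design)
#   lrt <- glmLRT(fit, contrast = contrast_matrix[, "AvsControl"])
```

In this layout the different mutations within a gene are treated as replicates of that gene's group, so the within-group variability they add ends up in the dispersion estimate.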
I would really like some advice on which differential expression pipeline would be best for what I'm trying to do, or whether glmLRT() from edgeR would suffice. I've been looking through previous posts on Biostars as well and haven't found anything. If I have missed anything, please do share the link!
TL;DR: I have 3 groups: controls (7 biological replicates), a group with differing mutations in gene A (3 samples, each with a different mutation), and a group with differing mutations in gene B (4 samples, each with a different mutation), with no biological replicates for any individual mutation. What is the best design and pipeline to perform DE analysis given that each mutation is different?