Hi,
I have data from an RNA-seq experiment in which different genotypes and treatments were tested. I am analyzing the data using edgeR (fit <- glmFit(dgeObj, design)).
The design is as follows, where a, b, c and d are the groups (design columns) being compared and 1-8 are the samples. Samples 1, 2 and 3 are biological replicates of group a, and samples 5, 6 and 7 are biological replicates of group c. Groups b and d have only one sample each, so they are unreplicated.
> design <- data.frame(a=c(1,1,1,0,0,0,0,0), b=c(0,0,0,1,0,0,0,0), c=c(0,0,0,0,1,1,1,0), d=c(0,0,0,0,0,0,0,1))
> design
  a b c d
1 1 0 0 0
2 1 0 0 0
3 1 0 0 0
4 0 1 0 0
5 0 0 1 0
6 0 0 1 0
7 0 0 1 0
8 0 0 0 1
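The same layout can also be generated from a group factor instead of being typed column by column. This is only a sketch in base R; the `group` labels are an assumption about how the eight samples map onto the four columns above.

```r
# Assumed group label for each of the eight samples (1-8).
group <- factor(c("a", "a", "a", "b", "c", "c", "c", "d"))

# One indicator column per group, no intercept; the result is a numeric matrix.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Number of samples per group.
reps <- colSums(design)
print(design)
print(reps)
```

The column sums make the imbalance explicit: three samples each for a and c, one each for b and d.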
This is the contrast I want to make:
> contrast <- makeContrasts((a - b) - (c - d), levels=colnames(design))
> contrast
Contrasts
Levels (a - b) - (c - d)
a 1
b -1
c -1
d 1
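Written out by hand, this contrast is just a numeric vector over the four coefficients. A minimal base-R sketch that checks the printed weights against the expanded form a - b - c + d (the names `contrast_vec`, `e` and `expanded` are introduced here for illustration only):

```r
lv <- c("a", "b", "c", "d")

# Weights as printed above; the order must match the design columns.
contrast_vec <- c(a = 1, b = -1, c = -1, d = 1)

# Unit vectors for each coefficient, used to expand (a - b) - (c - d).
e <- diag(4)
dimnames(e) <- list(lv, lv)
expanded <- (e["a", ] - e["b", ]) - (e["c", ] - e["d", ])

# Both forms agree, and the weights sum to zero.
stopifnot(all(expanded == contrast_vec))
print(sum(contrast_vec))
```

So the test is a difference of differences: (a vs b) compared with (c vs d), with each unreplicated group entering with full weight.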
So I want to compare samples that have 3 biological replicates to samples that only have 1. Is this possible?
I don't understand how edgeR gives statistics for the coefficient of interest despite the lack of replicates (how was variance estimated there?). What are your thoughts on this?
Thanks!
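One possible way to look at the variance question, offered as a sketch rather than a definitive answer: if a single dispersion per gene is assumed to be shared by all four groups, the information about it comes from the residual degrees of freedom of the whole design, not from each group separately. The base-R snippet below counts them for the design in the post; `resid_df` and `per_group_df` are illustrative names.

```r
# Design from the post, as a numeric matrix.
design <- as.matrix(data.frame(
  a = c(1, 1, 1, 0, 0, 0, 0, 0),
  b = c(0, 0, 0, 1, 0, 0, 0, 0),
  c = c(0, 0, 0, 0, 1, 1, 1, 0),
  d = c(0, 0, 0, 0, 0, 0, 0, 1)
))

# Residual degrees of freedom: samples minus estimable coefficients.
resid_df <- nrow(design) - qr(design)$rank

# Contribution of each group: (number of samples - 1).
per_group_df <- colSums(design) - 1

print(resid_df)
print(per_group_df)
```

Under that assumption there are 8 - 4 = 4 residual degrees of freedom, all of them from the replicated groups a and c. The single samples in b and d are fitted exactly and contribute nothing, so any statistic involving b or d relies on the dispersion seen in a and c also being appropriate for them.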
I don't have a helpful answer, but I am in a similar situation with RNA-seq data: out of 7 sample groups with 3 replicates each, 3 replicates in total have problematic data. I am not sure whether I need to go down to 2 replicates for every group or whether I can make some comparisons using different numbers of replicates.
Please use ADD REPLY or ADD COMMENT, and not the answer field. Please open a new question, providing the necessary details about the experimental setup and what you mean by "problematic data" so people can understand the issue you face. Thank you.