Advice on designing an edgeR analysis for a complex experimental setup?
3 months ago

Dear all, I have a tricky question about using edgeR in R to run a differential expression analysis of the following experiment. It is a multifactorial experiment with around 55 samples, designed as follows:

Conditions = 2 (treatment and control)
Breed = 4 breeds
Sex of animals = 2 (male and female)

Within each breed, I have samples from 5 individuals (A, B, C, D and E).

I have two main questions:

1. I know that I can adjust for additional variables in the DE analysis. However, when I run an nMDS to see how treated subjects separate from controls, I cannot adjust for differences in sex or breed within the nMDS approach. Is there a technique for adjusting for other covariates, such as sex or breed, when visualizing the clustering of treatment vs control samples? I need this because, if I see clustering, I am afraid it could be due to a sex-specific effect and not necessarily a treatment effect.

2. To calculate DE genes between treatment and control, I think I need to adjust for 1) baseline differences between individuals (this is a paired design, because blood from the same animals was used both treated and as control), 2) sex differences and 3) breed.

Could you advise which of the designs in the updated edgeR User's Guide (https://www.bioconductor.org/packages/devel/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf) would match my experiment? I thought about the one in section 3.4.1, but with more factors to adjust for.

Thanks a lot for any comments.


Thanks. Sorry for bothering you again, but why do you repeat the design for the MDS plot? And how does combining individual and breed adjust for sex? I am still wondering about this because, when I compare treatment vs control, there is an imbalance between males and females.

Could you also comment on why you did not perform TMM normalization at the beginning, before building the design matrix? Also, is the DE analysis done on TMM counts or CPM counts, or do I need to run both normalizations before building the design matrix and running the DE analysis?

3 months ago

I am just posting here again what you advised on the Bioconductor website.

First, you need to make unique labels for all the individual animals, because currently the IDs are being recycled within breeds:

 Animal <- factor(paste(Individual,Breed,sep="."))


The design matrix for the DE analysis will be:

design <- model.matrix(~Animal+Status)
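
As a minimal sketch of the two steps above, the following builds the unique animal labels and the design matrix for a hypothetical balanced layout: 4 breeds x 5 individuals, each sampled once as control and once as treated (40 samples, fewer than the roughly 55 in the real experiment). The sample sheet `targets`, its column names and the assignment of sex to animals are all assumptions made for illustration. The sketch also checks why sex and breed do not need their own terms: each animal has exactly one breed and one sex, so those columns are linear combinations of the animal columns and add nothing to the rank of the design.

```r
# Hypothetical sample sheet: 4 breeds x 5 individuals x 2 conditions
targets <- expand.grid(
  Individual = c("A", "B", "C", "D", "E"),
  Breed      = paste0("Breed", 1:4),
  Status     = c("control", "treated"),
  stringsAsFactors = FALSE
)
# Invented sex assignment (one sex per animal), for illustration only
targets$Sex <- ifelse(targets$Individual %in% c("A", "B"), "female", "male")

# Unique animal labels, because IDs A-E are recycled within breeds
Animal <- factor(paste(targets$Individual, targets$Breed, sep = "."))
Status <- factor(targets$Status, levels = c("control", "treated"))
Sex    <- factor(targets$Sex)
Breed  <- factor(targets$Breed)

# Paired design: animal as blocking factor, then treatment
design <- model.matrix(~ Animal + Status)

# Adding Sex and Breed gives extra columns but no extra information,
# because both are fully determined by Animal
design.full <- model.matrix(~ Animal + Sex + Breed + Status)

n.animals   <- nlevels(Animal)
rank.design <- qr(design)$rank
rank.full   <- qr(design.full)$rank
```

In this toy layout `design` has 21 columns and full rank, while `design.full` has 25 columns but the same rank of 21, which is the sense in which blocking on animal already accounts for breed and sex.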


To see the treatment vs control effect in an MDS plot you can remove the animal effects. Assuming that y is the DGEList object (after filtering and calcNormFactors),

logCPM <- cpm(y, log=TRUE)
design.Status <- model.matrix(~Status)
logCPM.corrected <- removeBatchEffect(logCPM, batch=Animal, design=design.Status)
plotMDS(logCPM.corrected, labels=Status)


Gordon Smyth, I have tried the code above. It worked nicely for plotMDS, and I can clearly see the difference. I still have some questions:

First: Am I right that the object 'design' (which you suggested for the DE analysis) is the same as 'design.Status', which was made for the MDS plot after removing the batch effect?

Second: For the DE analysis, which count data and which design should be used for fitting the model, given that I want to run this on counts adjusted for Animal (breed + sex)? For instance, I would use logCPM.corrected as the count matrix and 'design' as the design matrix:

design <- model.matrix(~Animal+Status)
fit <- glmQLFit(logCPM.corrected,design)
qlf <- glmQLFTest(fit)


Could you kindly advise whether I am right, Gordon Smyth?
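
For reference, here is a hedged sketch of how the two objects are usually kept apart, following the general paired-design workflow in the edgeR User's Guide and not a reply from the thread. The two design matrices differ: `~Animal+Status` contains the animal columns, while `~Status` is only passed to removeBatchEffect so that the treatment effect is protected when the animal effects are removed for plotting. The limma help page for removeBatchEffect states that the function is not intended to be used prior to linear modelling, and glmQLFit expects counts, so the sketch fits the model to the DGEList `y` and leaves the animal adjustment to the design matrix. The simulated counts, the gene and animal names, and the 20-animal layout are placeholders.

```r
library(edgeR)  # also loads limma

set.seed(1)
# Hypothetical layout: 20 animals, each with one control and one treated sample
Animal <- factor(rep(paste0("animal", 1:20), times = 2))
Status <- factor(rep(c("control", "treated"), each = 20),
                 levels = c("control", "treated"))

# Simulated raw counts standing in for the real count matrix
ngenes <- 1000
counts <- matrix(rnbinom(ngenes * 40, mu = 100, size = 10), nrow = ngenes)
rownames(counts) <- paste0("gene", seq_len(ngenes))

design <- model.matrix(~ Animal + Status)

y <- DGEList(counts = counts)
keep <- filterByExpr(y, design)        # drop lowly expressed genes
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)                # TMM normalization factors (default method)

# The model is fitted to the counts in y; TMM factors enter as offsets,
# and animal differences are handled by the Animal columns of the design
y   <- estimateDisp(y, design)
fit <- glmQLFit(y, design)
qlf <- glmQLFTest(fit, coef = "Statustreated")
topTags(qlf)
```

Under this sketch, the TMM factors are stored in `y$samples$norm.factors` and used internally, so no separately normalized count matrix has to be created before the DE analysis.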

3 months ago

Dear All, Just to follow up on this experiment:

Could you also advise me on the best method to determine whether the treatment has an effect compared to control in a particular breed, using the RNA-seq data? I thought about measuring the Euclidean distance between treated and control animals and looking at the average distance between them. Should I use the values used for the MDS analysis that you described in the post (filtered, TMM-normalized CPM), or the filtered TMM counts used for the DE analysis?
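
The distance idea described above can be sketched as follows. This only illustrates the calculation the question proposes and is not an endorsed test: a distance on its own does not say whether the effect is larger than expected by chance. It assumes a genes x samples matrix of log-CPM values such as the one returned by `cpm(y, log=TRUE)`; here a random matrix stands in for it, and the sample sheet `targets` and its column names are hypothetical. Because each animal is compared with itself, baseline differences between animals cancel in the difference.

```r
set.seed(1)
# Hypothetical sample sheet: 4 breeds x 5 individuals x 2 conditions
targets <- expand.grid(
  Individual = c("A", "B", "C", "D", "E"),
  Breed      = paste0("Breed", 1:4),
  Status     = c("control", "treated"),
  stringsAsFactors = FALSE
)
Animal <- paste(targets$Individual, targets$Breed, sep = ".")

# Placeholder for the real log-CPM matrix (genes in rows, samples in columns)
logCPM <- matrix(rnorm(500 * nrow(targets)), nrow = 500)

is.trt   <- targets$Status == "treated"
ctrl.idx <- which(!is.trt)
# For each control sample, locate the treated sample from the same animal
trt.idx  <- which(is.trt)[match(Animal[ctrl.idx], Animal[is.trt])]

# Euclidean distance between the treated and control sample of each animal
pair.dist <- sqrt(colSums((logCPM[, trt.idx] - logCPM[, ctrl.idx])^2))
names(pair.dist) <- Animal[ctrl.idx]

# Average within-animal distance for each breed
mean.dist <- tapply(pair.dist, targets$Breed[ctrl.idx], mean)
mean.dist
```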