Differential Gene Expression with Replicates for some of the samples
2.0 years ago
rezaeir75 ▴ 40

In some situations it is neither affordable nor feasible to have replicates in a large RNA-seq experiment, and the purpose of the analysis is only to generate hypotheses for designing further experiments. In that case, is it possible to have duplicate samples for one or two of the groups and a single sample for the other groups? Could we estimate a constant dispersion from these replicates and apply it to all of the samples for differential gene expression analysis?

I'm using edgeR/limma for my differential gene expression analysis. When there are no replicates, the eBayes function gives me an error. However, if I add a replicate for just one of the samples, it no longer gives an error. Does this mean that it uses the dispersion calculated from the duplicated condition for all of the single-sample conditions? Or is it a bug that it doesn't give an error?
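To make the arithmetic behind this question concrete, here is a minimal sketch of how a one-way group-means linear model pools its residual variance. It is not limma's actual code, and the function name and the log-expression values are made up for illustration. The residual degrees of freedom are the number of samples minus the number of groups, so with no replicates there is nothing to estimate a variance from, and with a single duplicated group the one available degree of freedom comes entirely from that pair.

```python
from collections import defaultdict
from statistics import mean


def pooled_residual_variance(values, groups):
    """Return (residual_df, pooled_variance) for a one-way group-means model.

    pooled_variance is None when there are no residual degrees of freedom.
    """
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)

    # One mean is fitted per group, so df = samples - groups
    residual_df = len(values) - len(by_group)
    if residual_df == 0:
        return 0, None

    # Single-sample groups sit exactly on their own mean and add zero here
    within_ss = sum(
        (v - mean(vs)) ** 2 for vs in by_group.values() for v in vs
    )
    return residual_df, within_ss / residual_df


# Hypothetical log-expression values for one gene across four conditions
no_reps = pooled_residual_variance(
    [5.0, 6.2, 4.8, 7.1], ["A", "B", "C", "D"]
)
one_dup = pooled_residual_variance(
    [5.0, 5.4, 6.2, 4.8, 7.1], ["A", "A", "B", "C", "D"]
)
print(no_reps)
print(one_dup)
```

In the second call the variance estimate is driven only by the two samples of condition A, yet it would be the variance used for every comparison in such a model, which is the assumption the question is really asking about.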

Should I consider using duplicates for some samples, rather than no replicates at all, in large experiments like this that are done for hypothesis generation?

Cross-posted here

RNA-Seq edger limma deseq2

What do you mean by duplicate? Do you mean making a pretend bio replicate when you really don't have one?


I mean having two biological replicates for one condition. It is not exactly pretending; that would be something like bootstrapping, which is good for generating technical replicates. My intention is to get the average variability from the few duplicated samples and then assume that my other samples would show the same variability if they had been done in duplicate. In that case it is like having biological replicates for them too, and all the samples can be used for the DEG analysis.

2.0 years ago
Ali T. A. ▴ 30

There are some recommendations on what to do when using edgeR without (enough) replicates. Please see pages 23-24 of the edgeR User's Guide: https://www.bioconductor.org/packages/release/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf


I also read that, but the explanation is not complete. For example, it doesn't say how to find these housekeeping genes. As for its first option of using a common fixed dispersion, I guess what I'm asking for here is a more reliable measure of that dispersion. Instead of using a theoretical dispersion, why not estimate the dispersion from the few duplicated samples and then use it for all the samples? I'd like to know whether someone has already done this, what their experience was, and how reliable this method could be.
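As a rough illustration of the idea in this comment, the sketch below computes a single common dispersion from one duplicated condition using a crude method-of-moments estimate. It relies on the negative binomial relation variance = mean + dispersion × mean², and it assumes the two libraries have equal size. This is not edgeR's estimator, and the function name and the counts are hypothetical.

```python
from math import sqrt


def common_dispersion_from_duplicates(pairs):
    """Crude moment estimate of a common NB dispersion from duplicate counts.

    pairs: list of (count_rep1, count_rep2), one tuple per gene, assumed to
    come from two libraries of equal size (an assumption of this sketch).
    """
    excess_variance = 0.0
    squared_means = 0.0
    for y1, y2 in pairs:
        m = (y1 + y2) / 2
        s2 = (y1 - y2) ** 2 / 2    # sample variance of two values
        excess_variance += s2 - m  # variance beyond Poisson noise
        squared_means += m ** 2
    if squared_means == 0:
        raise ValueError("all counts are zero")
    # A negative estimate means no detectable overdispersion; clip at zero
    return max(excess_variance / squared_means, 0.0)


# Hypothetical counts for five genes in the duplicated condition
duplicate_counts = [(100, 140), (50, 30), (900, 1100), (10, 14), (400, 520)]
dispersion = common_dispersion_from_duplicates(duplicate_counts)
bcv = sqrt(dispersion)  # biological coefficient of variation
print(round(dispersion, 4), round(bcv, 3))
```

A value obtained this way could be supplied wherever a fixed dispersion is expected, for example the `dispersion` argument of edgeR's `exactTest`, in place of a nominal guess. It still rests on the untested assumption that the unreplicated conditions are as variable as the duplicated one, and an estimate from a single pair is itself very noisy.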