Question: edgeR/csaw design matrix
4 weeks ago
dcheng10 wrote:

I am using the csaw package, which is based on edgeR's statistical methods, so I need a design matrix. The user guide gives an example of analysing 2 groups, as below:

bam.files <- c("es_1.bam", "es_2.bam", "tn_1.bam", "tn_2.bam")
design <- model.matrix(~factor(c('es', 'es', 'tn', 'tn')))
colnames(design) <- c("intercept", "cell.type")

However, I want to analyse 3 groups, so this is how I set up the design matrix:

bam.files <- c("heart.bam", "liver.bam", "muscle.bam")
design <- model.matrix(~factor(c('heart', 'liver', 'muscle')))
colnames(design) <- c("intercept1","intercept2", "cell.type")

Then I tried to find differentially bound regions:

y <- asDGEList(
y <- estimateDisp(y, design)

And I got this warning:

**Warning message:
In estimateDisp.default(y = y$counts, design = design, group = group,  :
  No residual df: setting dispersion to NA**

Any advice would be greatly appreciated!

rna-seq chip-seq R
4 weeks ago
h.mon24k wrote:

edgeR needs biological replicates to estimate dispersion, and you have none: with one sample per tissue, your design fits three coefficients to three samples, which leaves no residual degrees of freedom. That is exactly what the warning is telling you. This is a poor experimental design and is not recommended. Similar questions have been asked before, so please search the site. Here are just two examples:

If you read a good number of them, you will notice that answers in more recent threads are more adamant about not performing experiments without biological replicates. There are two reasons: 1) there is now a sizeable literature exploring sample sizes and power for RNAseq, and the bare minimum recommended is three biological replicates per treatment; 2) in the beginning, RNAseq costs were higher and properly powered experiments were very expensive to perform, but nowadays costs are significantly lower.
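
To see where the "No residual df" warning comes from, it may help to count degrees of freedom directly. The sketch below uses only base R (no csaw or edgeR calls) and compares the one-sample-per-tissue layout from the question with a hypothetical layout that has two replicates per tissue. The replicated layout and the column names are assumptions made for illustration, not something taken from your data.

```r
# One library per tissue, as in the question: 3 samples, 3 coefficients.
tissue_single <- factor(c("heart", "liver", "muscle"))
design_single <- model.matrix(~tissue_single)
# Illustrative names: "heart" is the first factor level, so it is the baseline.
colnames(design_single) <- c("intercept", "liver.vs.heart", "muscle.vs.heart")

# Hypothetical layout with two replicates per tissue: 6 samples, 3 coefficients.
tissue_rep <- factor(rep(c("heart", "liver", "muscle"), each = 2))
design_rep <- model.matrix(~tissue_rep)
colnames(design_rep) <- c("intercept", "liver.vs.heart", "muscle.vs.heart")

# Residual df = number of samples minus the rank of the design matrix.
resid_df_single <- nrow(design_single) - qr(design_single)$rank
resid_df_rep <- nrow(design_rep) - qr(design_rep)$rank

print(resid_df_single)  # 0: nothing left over to estimate dispersion
print(resid_df_rep)     # 3: dispersion can be estimated
```

Note that with a three-level factor the design has one intercept and two contrast columns (here liver and muscle, each relative to heart), so names like "intercept2" and "cell.type" do not describe what those columns contain.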

Thanks! This helps! I will search before asking questions next time!

