I have to analyze several RNA-Seq samples. The samples come from several runs, some unstranded and some stranded, and several samples were sequenced more than once (using different library kits). I used htseq-count to get the read counts and now want to use DESeq to test for differential expression. So I have biological replicates and technical replicates (the same sample sequenced several times using a different library kit; is that the correct term?).
So I built a design table. In my example, A.1 means sample A, sequencing 1; A.2 means sample A, sequencing 2; and so on. So A was sequenced twice (once unstranded, once stranded), B three times (once unstranded, twice stranded), C once (unstranded) and D once (stranded). ReplicateGroup is used to group technical replicates together.
designTable:

    Sample  Condition  Stranded  ReplicateGroup
    A.1     Ctrl       No        A
    B.1     Treated    No        B
    C.1     Treated    No        C
    A.2     Ctrl       Yes       A
    B.2     Treated    Yes       B
    B.3     Treated    Yes       B
    D.1     Treated    Yes       D
After that I use DESeq. countTable is the read count matrix.

    cdsFull = newCountDataSet( countTable, designTable )
    cdsFull = estimateSizeFactors( cdsFull )
    cdsFull = estimateDispersions( cdsFull )

But now I don't know how to fit a model on Condition, Stranded and ReplicateGroup.
Like this?

    fit1 = fitNbinomGLMs( cdsFull, count ~ Condition + Stranded + ReplicateGroup )
    fit0 = fitNbinomGLMs( cdsFull, count ~ Condition )
    pvalsGLM = nbinomGLMTest( fit1, fit0 )
    padjGLM = p.adjust( pvalsGLM, method="BH" )

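To make the question more concrete, here is a minimal base-R sketch (no DESeq needed) that rebuilds the design table above and checks whether the model matrix for the full formula has full column rank. The variable names mm, nCoef, mmRank and fullRank are mine, chosen only for illustration. As far as I can tell, ReplicateGroup is nested within Condition here (A is the only Ctrl sample), so I suspect the two cannot be estimated separately, but I may be misreading this.

```r
# Rebuild the design table from the post (base R only).
designTable <- data.frame(
  Condition      = factor(c("Ctrl", "Treated", "Treated", "Ctrl", "Treated", "Treated", "Treated")),
  Stranded       = factor(c("No", "No", "No", "Yes", "Yes", "Yes", "Yes")),
  ReplicateGroup = factor(c("A", "B", "C", "A", "B", "B", "D")),
  row.names      = c("A.1", "B.1", "C.1", "A.2", "B.2", "B.3", "D.1")
)

# Model matrix for the right-hand side of the full formula.
mm <- model.matrix(~ Condition + Stranded + ReplicateGroup, data = designTable)

nCoef  <- ncol(mm)      # coefficients the GLM would have to estimate
mmRank <- qr(mm)$rank   # linearly independent columns in the model matrix

# If mmRank < nCoef, some coefficients are confounded with each other.
fullRank <- mmRank == nCoef
print(c(coefficients = nCoef, rank = mmRank))
```

If the rank comes out lower than the number of coefficients, I assume the full model cannot be fitted as written, which may be part of why I am unsure about the formula.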
Is this the right way to analyze technical replicates? I read that I should merge them, but I don't think that's a good idea because I used different library kits. So I'm stuck...
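For reference, the merging I read about would, as I understand it, amount to summing the raw counts of the technical replicates within each ReplicateGroup before running DESeq. Here is a minimal base-R sketch of that idea. The counts and gene names are made up purely for illustration (they are not my data), and mergedCounts is just a name I picked.

```r
# Toy count matrix: 3 hypothetical genes x 7 samples. All numbers are made up.
countTable <- matrix(
  c( 10, 20, 15,  12, 22, 18, 30,
      0,  5,  3,   1,  4,  6,  2,
    100, 90, 95, 110, 85, 80, 70),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("gene1", "gene2", "gene3"),
                  c("A.1", "B.1", "C.1", "A.2", "B.2", "B.3", "D.1"))
)

# Technical-replicate grouping, in the same order as the columns of countTable.
replicateGroup <- c("A", "B", "C", "A", "B", "B", "D")

# rowsum() sums rows by group, so transpose to put samples in rows,
# sum within each ReplicateGroup, then transpose back to genes x groups.
mergedCounts <- t(rowsum(t(countTable), group = replicateGroup))
print(mergedCounts)
```

Note that this pools the stranded and unstranded libraries of the same sample into one column, which is exactly the part that worries me.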
Thanks a lot in advance