I am trying to figure out how DESeq2 calculates its log2FC measure. We see a strange pattern in the fold changes in our data, and I'd like to reproduce them by hand from scratch to make sure this pattern does not reflect an error in my pipeline.
Right now, I'm doing this:
    deseqOutput <- DESeq(data_collapsedTechnicalReps)
    # size factors, then counts divided by them (normalized counts)
    estSizeFactors <- estimateSizeFactors(deseqOutput)
    RLEnormedData <- data.frame(counts(estSizeFactors, normalized=TRUE))
    # here, condition 1 = cols 1 and 2, condition 2 = cols 3, 4, 5
    meanOfRLECounts <- data.frame(rowMeans(RLEnormedData[,1:2]), rowMeans(RLEnormedData[,3:5]))
    colnames(meanOfRLECounts) <- c('Condition1','Condition2')
    # log2 ratio of the per-condition means of the normalized counts
    meanOfRLECounts$log2FC <- log2(meanOfRLECounts$Condition1/meanOfRLECounts$Condition2)
Here, condition 2 is the wild type condition, i.e. the level that I set as the reference when I use relevel().
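To make explicit what I think this by-hand calculation amounts to, here is a minimal base-R sketch on toy data. The count matrix is made up purely for illustration, and the size-factor step is my understanding of the median-of-ratios method behind estimateSizeFactors(), so treat it as an assumption rather than DESeq2's exact code. It does no model fitting at all: it is only the log2 ratio of the group means of the normalized counts.

```r
# Toy counts (invented): 4 genes x 5 samples.
# Columns 1-2 = condition 1, columns 3-5 = condition 2 (wild type).
counts_toy <- matrix(
  c( 10,  20,  30,  40,  50,
    100, 200, 300, 400, 500,
      5,  10,  15,  20,  25,
     80, 160,  30,  40,  50),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("s", 1:5))
)

# Median-of-ratios size factors (assumed method): for each sample, take the
# median over genes of log(count) - log(geometric mean of that gene).
log_geo_means <- rowMeans(log(counts_toy))
size_factors <- apply(counts_toy, 2, function(cnts) {
  keep <- is.finite(log_geo_means) & cnts > 0  # skip genes with zero counts
  exp(median((log(cnts) - log_geo_means)[keep]))
})

# Normalized counts: divide each column by its sample's size factor.
normed <- sweep(counts_toy, 2, size_factors, "/")

# Naive fold change: log2 of the ratio of per-condition means.
cond1 <- rowMeans(normed[, 1:2])
cond2 <- rowMeans(normed[, 3:5])
log2fc_by_hand <- log2(cond1 / cond2)
print(round(log2fc_by_hand, 3))
```

In this toy set, genes 1 to 3 differ between samples only by sequencing depth, while gene 4 is 8-fold higher in condition 1 once depth is accounted for, so the sketch gives a quick sanity check of the normalization and ratio steps before comparing against the real DESeq2 output.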
Can anyone spot what I'm doing wrong? What data does DESeq2 use to generate its log2FC estimates?