I used Salmon to quantify a set of technical replicate fasta files against my reference transcriptome, with the sequence-bias and GC-bias corrections (--seqBias and --gcBias) enabled.
I know the tximport documentation says the effective lengths should be processed as follows for use in edgeR:
    cts <- txi$counts
    normMat <- txi$length
    normMat <- normMat/exp(rowMeans(log(normMat)))
    library(edgeR)
    o <- log(calcNormFactors(cts/normMat)) + log(colSums(cts/normMat))
    y <- DGEList(cts)
    y$offset <- t(t(log(normMat)) + o)
    # y is now ready for estimate dispersion functions
    # see edgeR User's Guide
What I am unsure of is...
If I collapse my technical replicates by the sum or the mean, how should I collapse the effective length vector returned from tximport? Should I take the mean of the effective lengths? The sum?
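To make the options concrete, here is a minimal sketch in base R of the collapsing schemes I have in mind, for a single sample with four technical replicates. The toy matrices and all object names are made up for illustration, and the count-weighted mean is only a candidate I am considering, not something taken from the tximport documentation.

```r
# Toy data: 3 transcripts x 4 technical replicates of one sample (made-up numbers).
cts <- matrix(c( 10, 12,   8,  10,
                  0,  5,   5,  10,
                100, 90, 110, 100),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("tx", 1:3), paste0("rep", 1:4)))
len <- matrix(c(1000, 1010,  990, 1000,
                 500,  520,  480,  500,
                2000, 2000, 2000, 2000),
              nrow = 3, byrow = TRUE, dimnames = dimnames(cts))

# Collapse the counts by summing across the technical replicates.
cts_sum <- rowSums(cts)

# Option A: plain mean of the effective lengths.
len_mean <- rowMeans(len)

# Option B (assumption, not from the tximport docs): count-weighted mean,
# so replicates with more reads contribute more to the collapsed length.
# Falls back to the plain mean when a transcript has zero counts everywhere.
len_wmean <- ifelse(cts_sum > 0, rowSums(cts * len) / cts_sum, len_mean)

# Option C: sum of the effective lengths.
len_sum <- rowSums(len)
```

Whichever collapsed length is appropriate would then take the place of the corresponding column of txi$length before the normMat step above.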