Question: How should effective lengths returned by Salmon be collapsed for Differential Expression?
2.0 years ago, arf1389 (United States) wrote:

Hi All,

I used Salmon to quantify a set of technical-replicate fasta files against my reference transcriptome, with the sequence-bias and GC-bias corrections (--seqBias, --gcBias) enabled.

I know the tximport documentation says the effective length matrix should be processed the following way for use in edgeR:

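library(edgeR)

cts <- txi$counts
normMat <- txi$length
# Scale each gene's effective lengths by their geometric mean across samples
normMat <- normMat/exp(rowMeans(log(normMat)))
# Library-size offsets computed from the length-normalized counts
o <- log(calcNormFactors(cts/normMat)) + log(colSums(cts/normMat))
y <- DGEList(cts)
# Combine the length and library-size offsets (one value per gene and sample)
y$offset <- t(t(log(normMat)) + o)
# y is now ready for the dispersion estimation functions; see the edgeR User's Guide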
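
The scaling step above divides each gene's effective lengths by their geometric mean across samples, so that on the log scale each gene's values average to zero. A minimal base-R sketch on a made-up matrix illustrates this (`toy_len` is hypothetical and only stands in for `txi$length`):

```r
# Hypothetical effective lengths: 2 genes (rows) x 3 samples (columns).
toy_len <- matrix(c(1000, 1200, 900,
                     500,  500, 500),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3")))

# Divide each row by its geometric mean, as in the snippet above.
toy_norm <- toy_len / exp(rowMeans(log(toy_len)))

# On the log scale every row now averages to (numerically) zero, so the
# offset only carries between-sample differences in effective length.
row_log_means <- rowMeans(log(toy_norm))
print(round(row_log_means, 10))
```

A gene whose effective length is the same in every sample (geneB here) ends up with a scaled value of 1 in each column, so it contributes no length offset.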

What I am unsure of is...

If I collapse my technical replicates by the sum or the mean, how should I collapse the effective length vector returned from tximport? Should I take the mean of the effective lengths? The sum?

Tags: rna-seq, salmon, alignment
2.0 years ago, arf1389 (United States) wrote:

The answer to this question is a feature that was added as of Salmon v0.9.0:

Added the quantmerge command. This allows producing a multi-sample TSV file with aggregated abundance metrics over samples from many different quantification runs.

This can be used to merge technical replicate count estimates and produce a new data set with the merged counts and effective lengths.
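
For anyone who would rather collapse the replicates directly in R, here is a minimal base-R sketch of one possible convention: sum the counts within each replicate group and take a count-weighted average of the effective lengths. The weighting is an assumption made for illustration, not something stated in this thread or prescribed by Salmon or tximport, and `collapse_replicates`, `counts`, `eff_len`, and `group` are hypothetical names.

```r
# Collapse technical replicates: sum counts and average effective lengths.
# ASSUMPTION: lengths are averaged with weights proportional to counts;
# this convention is illustrative, not taken from Salmon or tximport.
collapse_replicates <- function(counts, eff_len, group) {
  stopifnot(identical(dim(counts), dim(eff_len)),
            length(group) == ncol(counts))
  groups <- unique(group)
  # Summed counts per replicate group.
  out_counts <- sapply(groups, function(g) {
    rowSums(counts[, group == g, drop = FALSE])
  })
  # Count-weighted mean effective length per replicate group.
  out_len <- sapply(groups, function(g) {
    cols  <- group == g
    w_sum <- rowSums(counts[, cols, drop = FALSE] * eff_len[, cols, drop = FALSE])
    total <- rowSums(counts[, cols, drop = FALSE])
    plain <- rowMeans(eff_len[, cols, drop = FALSE])
    # Fall back to the plain mean where a transcript has zero counts.
    ifelse(total > 0, w_sum / total, plain)
  })
  colnames(out_counts) <- colnames(out_len) <- groups
  list(counts = out_counts, length = out_len)
}

# Made-up example: sample A has two technical replicates, sample B has one.
sample_ids <- c("a_rep1", "a_rep2", "b_rep1")
counts  <- matrix(c(10, 30, 5,
                     0,  0, 8), nrow = 2, byrow = TRUE,
                  dimnames = list(c("tx1", "tx2"), sample_ids))
eff_len <- matrix(c(100, 200, 150,
                    300, 320, 310), nrow = 2, byrow = TRUE,
                  dimnames = list(c("tx1", "tx2"), sample_ids))
res <- collapse_replicates(counts, eff_len, group = c("A", "A", "B"))
print(res$counts)
print(res$length)
```

Under that assumption, `res$counts` and `res$length` could stand in for `txi$counts` and `txi$length` in the edgeR offset code from the question.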
