Question: How should effective lengths returned by Salmon be collapsed for Differential Expression?
Asked 18 months ago by arf1389 (United States)

Hi All,

I used Salmon to quantify a set of technical-replicate FASTA files against my reference transcriptome, with the sequence-bias and GC-bias corrections (--seqBias and --gcBias) enabled.

I know the tximport documentation says the effective length matrix should be processed in the following way for use in edgeR:

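cts <- txi$counts
normMat <- txi$length
# Scale each row by its geometric mean, so only between-sample length differences remain
normMat <- normMat/exp(rowMeans(log(normMat)))
library(edgeR)
# Library-size term on the log scale: normalization factors plus length-scaled column totals
o <- log(calcNormFactors(cts/normMat)) + log(colSums(cts/normMat))
y <- DGEList(cts)
# Combine the length term and the library-size term into the offset matrix
y$offset <- t(t(log(normMat)) + o)
# y is now ready for the dispersion estimation functions; see the edgeR User's Guide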
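To make the scaling step in the snippet above concrete, here is a minimal base-R sketch on a made-up effective length matrix (the values and the names `len`, `geo_mean` and `row_gm` are purely illustrative). It only demonstrates that dividing by the row-wise geometric mean leaves each row with a geometric mean of 1; it does not reproduce the edgeR part.

```r
# Toy effective-length matrix: 3 transcripts x 2 samples (made-up values).
len <- matrix(c(1000, 1200,
                 500,  450,
                2000, 2000),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("tx1", "tx2", "tx3"), c("s1", "s2")))

# Same operation as in the snippet above: divide each row by its geometric mean.
# A vector of length nrow(len) is recycled down the columns, so row i is
# divided by geo_mean[i].
geo_mean <- exp(rowMeans(log(len)))
normMat <- len / geo_mean

# After scaling, every row has geometric mean 1, so log(normMat) keeps only
# the between-sample differences in effective length.
row_gm <- exp(rowMeans(log(normMat)))
print(round(normMat, 4))
print(row_gm)
```

A transcript whose effective length is identical in all samples (tx3 here) ends up with a row of ones and therefore contributes nothing to the offset.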

What I am unsure of is this:

If I collapse my technical replicates by taking the sum or the mean of the counts, how should I collapse the effective length matrix returned from tximport? Should I take the mean of the effective lengths, or the sum?

Tags: rna-seq, salmon, alignment
Answer, 18 months ago by arf1389:

The answer to this question is a feature that was added in Salmon v0.9.0:

"Added the quantmerge command. This allows producing a multi-sample TSV file with aggregated abundance metrics over samples from many different quantification runs."

This can be used to merge technical replicate count estimates and produce a new data set with the merged counts and effective lengths.
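If the merged tables hold one column per quantification run, the replicates still need to be combined afterwards. The sketch below shows one possible way to do that in base R. It assumes two tab-separated tables (read counts and effective lengths) with a `Name` column followed by one column per run. The table layout, the toy values, the helper `read_merged` and the weighting rule are all assumptions for illustration, not something Salmon or tximport prescribes. Counts are summed, and effective lengths are averaged with the read counts as weights.

```r
# Hypothetical stand-ins for two merged tables (one column per run):
# read counts and effective lengths for two technical replicates.
counts_txt <- "Name\trepA\trepB\ntx1\t100\t300\ntx2\t0\t0\ntx3\t50\t50\n"
elen_txt   <- "Name\trepA\trepB\ntx1\t900\t1000\ntx2\t400\t420\ntx3\t1800\t1800\n"

# Illustrative helper: parse a table into a numeric matrix keyed by transcript.
read_merged <- function(txt) {
  m <- as.matrix(read.delim(text = txt, row.names = 1, check.names = FALSE))
  storage.mode(m) <- "double"  # avoid integer overflow in later products
  m
}
cts  <- read_merged(counts_txt)
elen <- read_merged(elen_txt)

# Collapse the counts of the technical replicates by summing them.
merged_counts <- rowSums(cts)

# Collapse effective lengths as a read-weighted mean (an assumed rule);
# fall back to the plain mean for transcripts with no reads at all.
weighted <- rowSums(cts * elen) / merged_counts
merged_elen <- ifelse(merged_counts > 0, weighted, rowMeans(elen))

print(data.frame(counts = merged_counts, eff_length = merged_elen))
```

Weighting by reads lets the deeper replicate dominate the collapsed length, which matches its larger share of the summed counts. A plain mean is a simpler alternative when the replicates have similar depth.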
