Question: How to calculate effective length for RNA sequencing?
9 weeks ago by
xiaoguang10 wrote:
Are there any R packages or programs that could help me calculate the **effective length** of every isoform or gene?

We know that the effective length matters when calculating TPM for isoforms or genes, but we cannot get it from featureCounts. Salmon can provide it, but only for isoforms.
rna-seq R • 138 views
5 weeks ago by
Gordon Smyth980 wrote:

If you are doing a gene-level analysis, then I recommend you simply use the gene length values that come from featureCounts.

The concept of effective length is for transcripts rather than genes and is not so clearly relevant for an analysis of gene-level counts from featureCounts. However, if you wanted to modify the gene lengths in a way similar to how kallisto and Salmon modify the transcript lengths, it would simply be gene-length minus the average read length.
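The adjustment described above is simple arithmetic, and a minimal sketch may make it concrete. The gene names, lengths, and read length below are made up for illustration, and the `min_length` floor is an assumption of this sketch (not something featureCounts, kallisto, or Salmon is claimed to do) so that genes shorter than a read do not end up with a zero or negative length.

```python
def effective_lengths(gene_lengths, mean_read_length, min_length=1.0):
    """Return gene length minus the average read length for each gene.

    min_length is a hypothetical safeguard added for this sketch: it stops
    very short genes from getting a zero or negative effective length.
    """
    return {
        gene: max(length - mean_read_length, min_length)
        for gene, length in gene_lengths.items()
    }


# Made-up gene lengths in bp, e.g. taken from the Length column of a
# featureCounts output table.
gene_lengths = {"geneA": 2000, "geneB": 500, "geneC": 80}

eff = effective_lengths(gene_lengths, mean_read_length=100)
print(eff)
```

For long genes the correction is small relative to the gene length, which is one reason the plain featureCounts lengths are often adequate for gene-level work.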

9 weeks ago by
kristoffer.vittingseerup2.3k wrote:

You cannot do that. The effective length requires isoform quantification (afterwards summing to get gene expression) which featureCounts cannot do. For featureCounts you will have to settle on FPKM normalization.

Why not just quantify with Salmon in the first place?


If we use featureCounts, we get gene-level quantification because the reads are aligned to the genome, but if we use Salmon, we only get isoform-level quantification.

However, TPM is the percentage of FPKM. Are they different?

written 9 weeks ago by xiaoguang10

To get gene-level quantification from Salmon, you simply sum the counts/TPM for all isoforms annotated as belonging to the same gene. This can, for example, easily be done with the tool tximport, as described in its vignette. As an alternative, you can use IsoformSwitchAnalyzeR by first using importIsoformExpression and afterwards isoformToGeneExp, which supports providing the gene-isoform relationship as a GTF file. You can read more about the advantages and disadvantages of such approaches here.
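The summing step itself can be sketched in a few lines. This is plain Python on made-up values, not the tximport or IsoformSwitchAnalyzeR API; the transcript IDs, the `tx2gene` mapping, and the helper name `sum_to_gene` are all hypothetical and only illustrate the idea of adding up isoform values per gene.

```python
from collections import defaultdict


def sum_to_gene(isoform_values, tx2gene):
    """Sum isoform-level values (counts or TPM) over each gene's isoforms."""
    gene_totals = defaultdict(float)
    for transcript, value in isoform_values.items():
        gene_totals[tx2gene[transcript]] += value
    return dict(gene_totals)


# Hypothetical transcript-to-gene mapping, as could be derived from a GTF file.
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

# Made-up isoform-level TPM values, e.g. from a Salmon quantification.
isoform_tpm = {"tx1": 10.0, "tx2": 30.0, "tx3": 5.0}

gene_tpm = sum_to_gene(isoform_tpm, tx2gene)
print(gene_tpm)
```

In practice the dedicated tools are preferable, since they also handle reading the quantification files and the annotation.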

modified 4 weeks ago • written 4 weeks ago by kristoffer.vittingseerup2.3k

And no, TPM is not the percentage of FPKM (such percentages are typically called PSI or isoform fraction (IF)). TPM is an abundance measure just like FPKM, except that it is better suited to RNA-seq data since it provides more accurate abundance measures. You can read more about RNA-seq abundance units here.
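For reference, both units can be computed from the same counts and lengths using their standard definitions. The sketch below uses made-up counts and lengths and hypothetical helper names; it also shows that, within one sample, TPM equals FPKM rescaled so that the values sum to one million.

```python
import math


def fpkm(counts, lengths):
    """Fragments per kilobase of feature per million mapped fragments."""
    total = sum(counts.values())
    return {g: counts[g] * 1e9 / (lengths[g] * total) for g in counts}


def tpm(counts, lengths):
    """Transcripts per million: length-normalised rates scaled to sum to 1e6."""
    rate = {g: counts[g] / lengths[g] for g in counts}
    scale = sum(rate.values())
    return {g: rate[g] / scale * 1e6 for g in counts}


# Made-up counts and lengths (bp) for one sample.
counts = {"geneA": 100, "geneB": 300, "geneC": 600}
lengths = {"geneA": 1000, "geneB": 1500, "geneC": 3000}

fpkm_values = fpkm(counts, lengths)
tpm_values = tpm(counts, lengths)

# Within a single sample, rescaling FPKM to sum to one million gives TPM.
fpkm_total = sum(fpkm_values.values())
rescaled = {g: v / fpkm_total * 1e6 for g, v in fpkm_values.items()}
print(all(math.isclose(rescaled[g], tpm_values[g]) for g in counts))
```

Because TPM sums to the same total in every sample while the FPKM total varies, TPM values are easier to compare between samples.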

written 4 weeks ago by kristoffer.vittingseerup2.3k