Question: How to calculate effective length for RNA sequencing?
xiaoguang wrote, 13 months ago:
Are there any R packages or programs that could help me calculate the **effective length** of every isoform or gene?

We know effective length matters when calculating TPM for isoforms or genes, but we cannot get it from featureCounts. Salmon reports it, but only for isoforms.
Gordon Smyth (Australia) wrote, 12 months ago:

If you are doing a gene-level analysis, then I recommend you simply use the gene length values that come from featureCounts.

The concept of effective length is for transcripts rather than genes and is not so clearly relevant for an analysis of gene-level counts from featureCounts. However, if you wanted to modify the gene lengths in a way similar to how kallisto and Salmon modify the transcript lengths, it would simply be gene-length minus the average read length.
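As a rough illustration of that last point, here is a minimal Python sketch that subtracts the average read length from each gene length and then computes TPM from the adjusted lengths. The gene names, lengths, counts, and the read length of 100 are made-up values, and flooring the result at 1 is an assumption added here to keep very short genes from getting a zero or negative length; it is not something featureCounts, kallisto, or Salmon is claimed to do.

```python
def effective_lengths(gene_lengths, mean_read_length):
    """Gene length minus the average read length.

    The floor at 1.0 is an assumption of this sketch, so that genes
    shorter than a read do not end up with a non-positive length.
    """
    return {gene: max(length - mean_read_length, 1.0)
            for gene, length in gene_lengths.items()}


def tpm(counts, lengths):
    """Transcripts per million: length-normalised counts rescaled to sum to 1e6."""
    rates = {gene: counts[gene] / lengths[gene] for gene in counts}
    total = sum(rates.values())
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Hypothetical example data (not from any real experiment).
gene_lengths = {"geneA": 2100, "geneB": 1100, "geneC": 600}
gene_counts = {"geneA": 400, "geneB": 100, "geneC": 50}

eff = effective_lengths(gene_lengths, mean_read_length=100)
gene_tpm_from_counts = tpm(gene_counts, eff)

for gene in gene_lengths:
    print(gene, eff[gene], round(gene_tpm_from_counts[gene], 1))
```

With reads of about 100 bp the adjustment matters most for short genes: geneC loses a sixth of its length, while geneA loses less than 5%.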

kristoffer.vittingseerup (European Union) wrote, 13 months ago:

You cannot do that. The effective length requires isoform quantification (afterwards summing to get gene expression), which featureCounts cannot do. For featureCounts you will have to settle for FPKM normalization.

Why not just quantify with Salmon in the first place?


If we use featureCounts, we get gene-level quantification because the reads are aligned to the genome, but if we use Salmon, we only get isoform-level quantification.

However, TPM is the percentage of FPKM. Are they different?

Reply written 13 months ago by xiaoguang

To get gene-level quantification from Salmon, you simply sum the counts/TPM for all isoforms annotated as belonging to the same gene. This can, for example, easily be done with the tool tximport, as described in its vignette. Alternatively, you can use IsoformSwitchAnalyzeR by first calling importIsoformExpression and afterwards isoformToGeneExp, which supports providing the gene-isoform relationship as a GTF file. You can read more about the advantages and disadvantages of such approaches here.
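The summing step itself is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not how tximport or IsoformSwitchAnalyzeR are implemented; the function name `sum_to_gene`, the `tx2gene` mapping, and the TPM values are all hypothetical.

```python
from collections import defaultdict


def sum_to_gene(isoform_values, tx2gene):
    """Sum isoform-level values (counts or TPM) over the isoforms of each gene."""
    gene_values = defaultdict(float)
    for transcript, value in isoform_values.items():
        gene_values[tx2gene[transcript]] += value
    return dict(gene_values)


# Hypothetical transcript-to-gene annotation and isoform-level TPM.
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
isoform_tpm = {"tx1": 30.0, "tx2": 20.0, "tx3": 50.0}

gene_level_tpm = sum_to_gene(isoform_tpm, tx2gene)
print(gene_level_tpm)
```

In practice the transcript-to-gene mapping would come from the same annotation (for example a GTF file) that was used to build the Salmon index.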

Reply written 11 months ago (modified 11 months ago) by kristoffer.vittingseerup

And no, TPM values are not percentages of FPKM (those are typically called PSI or isoform fraction (IF)). TPM is an abundance measure just like FPKM, except that it is better suited to RNA-seq data since it provides more accurate abundance measures. You can read more about RNA-seq abundance units here.
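For reference, the arithmetic connecting the two units within one sample can be sketched as follows: FPKM divides each count by feature length (in kb) and by total reads (in millions), and TPM is what you get when those values are rescaled to sum to one million. This differs from an isoform fraction, which is computed relative to the parent gene only. The sketch below uses made-up counts and lengths and the standard textbook formulas, not the output of any particular tool.

```python
def fpkm(counts, lengths):
    """Fragments per kilobase of feature per million mapped fragments."""
    total_fragments = sum(counts.values())
    return {t: counts[t] * 1e9 / (lengths[t] * total_fragments) for t in counts}


def fpkm_to_tpm(fpkm_values):
    """Rescale FPKM values within one sample so that they sum to one million."""
    total = sum(fpkm_values.values())
    return {t: value / total * 1e6 for t, value in fpkm_values.items()}


# Hypothetical counts and lengths (in bases) for three features in one sample.
example_counts = {"tx1": 400, "tx2": 100, "tx3": 50}
example_lengths = {"tx1": 2000, "tx2": 1000, "tx3": 500}

example_fpkm = fpkm(example_counts, example_lengths)
example_tpm = fpkm_to_tpm(example_fpkm)

for t in example_counts:
    print(t, round(example_fpkm[t], 1), round(example_tpm[t], 1))
```

Because the FPKM total differs from sample to sample while TPM always sums to one million, TPM values are easier to compare as proportions across samples.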

Reply written 11 months ago by kristoffer.vittingseerup