Question: Error in transcript-to-gene-level estimation using the "GeneSum" package
4.8 years ago by
Pam30 wrote:

I am using the GeneSum package to estimate gene abundance from transcript abundance. As input, I give the Sailfish-generated expression file "quant.sf" and a gene annotation GTF file, but I get the following error:

**Parsing input expression file
terminate called after throwing an instance of 'std::invalid_argument'
  what():  stoi
Aborted (core dumped)**

Can someone help me solve this issue?


rna-seq • 1.9k views
ADD COMMENTlink modified 4.8 years ago by Rob4.6k • written 4.8 years ago by Pam30
4.8 years ago by
United States
Rob4.6k wrote:

Hi Pam,

The reason for this is that Sailfish (& Salmon) have since changed their default output format (actually, making them simpler to read with standard tsv parsing functions etc.), and I have not yet updated GeneSum to keep pace. However, I should say that I'd actually recommend tximport, which solves the same problem while offering some more options than GeneSum. Also, it's worth noting that Sailfish and Salmon now actually have built-in support for aggregating their expression estimates to the gene level (though I'd still probably recommend using tximport).
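To illustrate what reading the newer tab-separated output with a standard TSV parser can look like, here is a minimal Python sketch. The column names (Name, Length, EffectiveLength, TPM, NumReads) follow the Salmon-style layout and are an assumption; check the header of your own quant.sf. The example rows and the `read_quant` helper are hypothetical.

```python
import csv
import io

# Hypothetical two-transcript excerpt; the column names are assumed, not read
# from the file in the question.
EXAMPLE_QUANT = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx1\t1000\t800.0\t10.5\t42.0\n"
    "tx2\t2000\t1800.0\t3.25\t30.0\n"
)


def read_quant(handle):
    """Map transcript ID -> {"tpm", "num_reads"} from a header-led TSV table."""
    reader = csv.DictReader(handle, delimiter="\t")
    table = {}
    for row in reader:
        # Values are parsed as floats: estimated read counts need not be integers.
        table[row["Name"]] = {
            "tpm": float(row["TPM"]),
            "num_reads": float(row["NumReads"]),
        }
    return table


quant = read_quant(io.StringIO(EXAMPLE_QUANT))
```

For a real file, pass `open("quant.sf", newline="")` instead of the `io.StringIO` object.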

ADD COMMENTlink written 4.8 years ago by Rob4.6k

Hi Rob, thanks for the clarification. I will try tximport. Thanks again.

ADD REPLYlink written 4.8 years ago by Pam30

Hi Rob, I have used tximport and now have counts from each sample for DEG analysis. I also need TPM values (summarised per gene) for downstream analyses such as heatmaps. I enabled "scaledTPM" in tximport, but the countsFromAbundance entry contains just the string "scaledTPM" and no values.

ADD REPLYlink written 4.8 years ago by Pam30

Check ?tximport and scan to the section Value. This section describes the object that is returned by tximport().

The returned object is a list that contains three matrices, one of which is "abundance"; this is the TPM summarized to the gene level.
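As a rough illustration of what "TPM summarized to the gene level" means, the sketch below sums transcript TPMs over the transcripts of each gene. It is written in Python rather than R, is not tximport's code, and the names `gene_level_tpm`, `transcript_tpm` and `tx2gene` as well as the numbers are made up for the example.

```python
from collections import defaultdict


def gene_level_tpm(transcript_tpm, tx2gene):
    """Sum transcript-level TPM over all transcripts assigned to each gene."""
    gene_tpm = defaultdict(float)
    for tx, tpm in transcript_tpm.items():
        gene_tpm[tx2gene[tx]] += tpm
    return dict(gene_tpm)


# Hypothetical values for one sample.
transcript_tpm = {"tx1": 10.0, "tx2": 5.0, "tx3": 2.5}
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

gene_tpm = gene_level_tpm(transcript_tpm, tx2gene)
```

Because TPM is already normalized for transcript length, a plain sum per gene keeps the sample total unchanged.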

ADD REPLYlink written 4.8 years ago by Michael Love2.1k

Hi Michael, thanks a lot, and sorry. Yes, it is clearly mentioned in your paper :-).

ADD REPLYlink written 4.8 years ago by Pam30

Hi Pam,

tximport only generates counts as its output. The "countsFromAbundance" field just describes the method used to generate the counts from the input abundances. That said, given counts, computing TPM should be very simple. The "length" of a gene should be represented as the abundance-weighted combination of the lengths of its isoforms, and the count provided by tximport gives the read count for the gene. Perhaps Mike Love (the main tximport author) might even have a function to perform this transformation and get back gene-level TPM. I'll point him here on Twitter!
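A minimal Python sketch of that calculation, under stated assumptions: the gene length is the abundance-weighted mean of its isoform lengths, and TPM is the length-normalized count rescaled to sum to one million. The function names and the numbers are hypothetical, and the unweighted-mean fallback for unexpressed genes is a choice made for this sketch only.

```python
def weighted_gene_length(lengths, abundances):
    """Abundance-weighted mean of isoform lengths for one gene."""
    total = sum(abundances)
    if total == 0:
        # Assumption of this sketch: use the plain mean when nothing is expressed.
        return sum(lengths) / len(lengths)
    return sum(l * a for l, a in zip(lengths, abundances)) / total


def tpm_from_counts(counts, lengths):
    """Convert gene counts to TPM using per-gene lengths."""
    rates = {gene: counts[gene] / lengths[gene] for gene in counts}
    scale = sum(rates.values())
    return {gene: rate / scale * 1e6 for gene, rate in rates.items()}


# Hypothetical example: geneA has two isoforms, geneB has one.
gene_lengths = {
    "geneA": weighted_gene_length([1000, 2000], [3.0, 1.0]),
    "geneB": weighted_gene_length([500], [2.0]),
}
gene_counts = {"geneA": 250.0, "geneB": 100.0}

gene_tpm = tpm_from_counts(gene_counts, gene_lengths)
```

In practice the isoform lengths would be effective lengths from the quantifier, and the calculation is repeated per sample.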

ADD REPLYlink written 4.8 years ago by Rob4.6k

Thanks, Rob, for the clarification and for pointing Michael to this.

ADD REPLYlink written 4.8 years ago by Pam30