Question: How can I collapse transcript variants for summary figures, like metabolic pathway heatmaps?
4.4 years ago, reuscher.stefan wrote:

Dear All,

I have an RNA-seq experiment which I would like to summarize and display using e.g. heatmaps or metabolic pathways. The reads were aligned to the rice reference transcripts, which have identifiers Os01t12345.1, Os01t12345.2, ... where the .1 and .2 indicate transcript isoforms. Some genes come with 2 or more isoforms, but I would like to ignore that information entirely for now and just put one set of values per gene into heatmaps, tables, etc.

What would be an appropriate way to collapse those isoforms into one? I can imagine averaging over isoforms, or just taking the most highly expressed isoform. How do other people process their data?




sequencing rna-seq • 1.3k views

The answer to this will depend on how your expression metrics were counted to begin with. Most important is how multimappers (reads that align equally well to more than one transcript) were dealt with. If they were handled sensibly, so that each read contributes only once in total, then you can collapse transcript-level metrics to gene-level metrics by simply adding the isoform values together. If multimappers were counted multiple times, then there's likely no legitimate summarization method and you'd need to reprocess things (N.B., you could probably do that with Salmon, which should be quite fast).
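For the simple case where adding is legitimate, the collapsing step might look like the following minimal sketch. It assumes the values are additive across isoforms (for example counts or TPM), that the input is a plain dictionary from transcript ID to a list of per-sample values, and that the gene ID is just the transcript ID with the trailing ".1", ".2", ... removed. The function names `gene_id` and `collapse_to_genes` and the example numbers are made up for illustration.

```python
def gene_id(transcript_id):
    """Strip a trailing numeric isoform suffix: 'Os01t12345.2' -> 'Os01t12345'.

    Assumption: isoforms are marked only by a final '.<number>'.
    IDs without such a suffix are returned unchanged.
    """
    base, sep, suffix = transcript_id.rpartition(".")
    return base if sep and suffix.isdigit() else transcript_id


def collapse_to_genes(transcript_values):
    """Sum per-sample values over all isoforms of each gene.

    transcript_values: dict mapping transcript ID -> list of values,
    one value per sample, same sample order for every transcript.
    """
    genes = {}
    for tid, values in transcript_values.items():
        gid = gene_id(tid)
        if gid in genes:
            # Add this isoform's values to the running per-sample totals.
            genes[gid] = [a + b for a, b in zip(genes[gid], values)]
        else:
            genes[gid] = list(values)
    return genes


# Hypothetical values for two samples.
example = {
    "Os01t12345.1": [10, 20],
    "Os01t12345.2": [5, 0],
    "Os01t67890.1": [7, 3],
}
gene_level = collapse_to_genes(example)
print(gene_level)
```

Summing keeps the total signal per gene, whereas averaging or taking the top isoform would make the gene value depend on how many isoforms happen to be annotated.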

written 4.4 years ago by Devon Ryan

Reads were mapped using bowtie1 with --all --best --strata. I do not know how the quantification was done, since I did not do it myself. I will find that out.

Thank you for your insights all the same. It is a starting point.

written 4.4 years ago by reuscher.stefan