I have an RNA-seq experiment that I would like to summarize and display using e.g. heatmaps or metabolic pathways. The reads were aligned to the rice reference transcripts, which have identifiers Os01t12345.1, Os01t12345.2, ... where the .1 and .2 indicate transcript isoforms. Some genes come with 2 or more isoforms, but I would like to ignore that information entirely for now and just use one set of values per gene in heatmaps, tables, etc.
What would be an appropriate way to collapse those isoforms into one set of values per gene? I can imagine averaging over isoforms, or just taking the most highly expressed isoform. How do other people process their data?
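To make the two options concrete, here is a minimal Python sketch of what I have in mind. The function names (`gene_id`, `collapse_isoforms`), the dict-based input and the example values are all made up for illustration. It assumes the gene ID is everything before the last dot in the transcript ID, and that "highest expressed" means the isoform with the largest total across samples, which is only one possible reading.

```python
from collections import defaultdict


def gene_id(transcript_id):
    """Strip the isoform suffix, e.g. 'Os01t12345.2' -> 'Os01t12345'.

    Assumption: the isoform number is whatever follows the last dot.
    """
    return transcript_id.rsplit(".", 1)[0]


def collapse_isoforms(expr, how="mean"):
    """Collapse transcript-level values to one row per gene.

    expr: dict mapping transcript ID -> list of per-sample values
    how:  'mean' averages the isoforms sample by sample;
          'max' keeps the isoform with the largest total across samples.
    """
    # Group the per-sample rows of all isoforms under their gene ID.
    groups = defaultdict(list)
    for tid, values in expr.items():
        groups[gene_id(tid)].append(values)

    collapsed = {}
    for gid, rows in groups.items():
        if how == "mean":
            # zip(*rows) walks the samples column by column.
            collapsed[gid] = [sum(col) / len(col) for col in zip(*rows)]
        elif how == "max":
            collapsed[gid] = list(max(rows, key=sum))
        else:
            raise ValueError(f"unknown method: {how!r}")
    return collapsed


# Hypothetical values for two samples, purely for illustration.
expr = {
    "Os01t12345.1": [10.0, 20.0],
    "Os01t12345.2": [30.0, 40.0],
    "Os02t00001.1": [5.0, 6.0],
}
by_mean = collapse_isoforms(expr, how="mean")
by_max = collapse_isoforms(expr, how="max")
```

Genes with a single isoform pass through unchanged under either method, so the choice only matters for the multi-isoform genes.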