Would it be appropriate to simply sum FPKM/RPKM/TPM-normalized values across transcripts to generate a gene-level estimate? Each transcript accounts for a certain fraction of the total expression, so the sum over all transcripts associated with a gene should simply be the gene-level expression. Am I correct, or am I missing something?
If by transcripts you mean isoforms, then I think it is OK. I was wondering the same, so I took a look at Trinity's align_and_estimate_abundance.pl. There is an option to supply a "transcripts to gene map" and obtain a count matrix at the gene level. As I understand from the script, this uses either kallisto_trans_to_gene_results.pl or eXpress_trans_to_gene_results.pl. The latter contains the statement: "compute it based on the formula: FPKM_gene = FPKM_isoA + FPKM_IsoB + ... "
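To make the formula concrete, here is a minimal Python sketch of that summation. It is not the Trinity script itself, just an illustration of the same idea. The function name `sum_to_gene_level`, the transcript and gene IDs, and the TPM values are all made up for the example; the same code works for FPKM values.

```python
from collections import defaultdict


def sum_to_gene_level(transcript_abundance, transcript_to_gene):
    """Sum per-transcript abundances (TPM or FPKM) into per-gene totals.

    transcript_abundance: dict mapping transcript ID -> abundance
    transcript_to_gene:   dict mapping transcript ID -> gene ID
    Both are hypothetical inputs for this sketch, not a Trinity file format.
    """
    gene_abundance = defaultdict(float)
    for transcript_id, value in transcript_abundance.items():
        # A missing transcript raises KeyError rather than being dropped silently.
        gene_id = transcript_to_gene[transcript_id]
        gene_abundance[gene_id] += value
    return dict(gene_abundance)


# Made-up values, for illustration only.
tpm = {"geneA_iso1": 10.0, "geneA_iso2": 5.0, "geneB_iso1": 2.5}
tx2gene = {"geneA_iso1": "geneA", "geneA_iso2": "geneA", "geneB_iso1": "geneB"}

gene_tpm = sum_to_gene_level(tpm, tx2gene)
print(gene_tpm)  # {'geneA': 15.0, 'geneB': 2.5}
```

Note that the total across genes equals the total across transcripts, so each gene's share of overall expression is preserved by the summation.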