I've put this off for a while and trawled Biostars and Bioconductor for a consensus, of which there appears to be none.
The short question: I am exploring expression evolution across a variety of bird species (6-10 species) and need to normalise my read counts and choose an appropriate expression metric for downstream comparisons. What are people's recommendations?
I am getting my counts from paired-end reads using Salmon, identifying orthologs from the CDS with BLAST, and importing the transcripts and merging them to genes using tximport.
Everything I read tells me that TMM will be biased by differences in library composition, gene length, gene content, etc. between species. Raw RPKM is also contentious. Brawand et al. (2011) use a median-centring method on a subset of orthologs with conserved expression patterns. I have also seen zFPKM used for clustering inter-species expression data.
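To make the median-centring idea concrete, here is a minimal Python sketch of one way it could work. It is an illustration under stated assumptions, not the procedure from Brawand et al. (2011): it assumes a genes x samples matrix of length-normalised expression (e.g. TPM) for 1:1 orthologs, treats the genes whose within-sample ranks vary least as the "conserved" subset, and rescales each sample so that the medians of that subset agree. The function name `median_scale` and the parameter `n_conserved` are hypothetical.

```python
import numpy as np

def median_scale(expr, n_conserved=1000):
    """Scale samples so the median of rank-conserved genes is equal across samples.

    expr: genes x samples array of length-normalised expression for 1:1 orthologs
          (assumed input format, e.g. TPM summarised to gene level).
    """
    expr = np.asarray(expr, dtype=float)
    # Rank genes within each sample (0 = lowest expression).
    ranks = expr.argsort(axis=0).argsort(axis=0)
    # Genes whose rank varies least across samples are treated as "conserved"
    # (an assumption of this sketch).
    rank_var = ranks.var(axis=1)
    n = min(n_conserved, expr.shape[0])
    conserved = np.argsort(rank_var)[:n]
    # Median expression of the conserved subset in each sample.
    medians = np.median(expr[conserved, :], axis=0)
    # Scaling factors that bring those medians to their common mean.
    # Note: a zero median would cause division by zero; filter such samples first.
    factors = medians.mean() / medians
    return expr * factors, factors
```

In this sketch the scaling factors come only from the conserved subset, so genes that have genuinely diverged in expression between species do not drive the normalisation; how to define that subset is the main judgment call.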
I realise this issue is still a bioinformatically complex problem, but thank you for any help that can be offered!