I am working on a comparative transcriptome analysis of two related plant species, A and B. For each species, I have 3 control and 3 treated plants, and each sample contains 15-25 million reads. I have a genome and an annotation for species A, but not for species B. I have finished a Trinity assembly for species B, and I would highly appreciate it if you could shed light on my next steps, especially normalization and differential expression analysis. I am considering two approaches, but I am not sure which one is more appropriate.
1) Use TransDecoder to predict a peptide for each transcript of species B, run OrthoFinder (or similar) to find orthologs between species A and B, and use that information to create a master set of transcripts for the two species for abundance estimation and differential expression analysis. However, since OrthoFinder works on amino acid sequences, I am not sure whether this will cause a problem (for example, because of synonymous codon differences) when I map reads to the master transcripts for abundance estimation.
2) Skip the master transcript set, run the DE analysis for each species individually, and later use TransDecoder/OrthoFinder to find orthologs. This option sounds easier to me, but I don't know whether it is valid to say 'the expression of gene XXX is higher in species A than in species B', since the normalization was done separately for each species.
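To make the ortholog-linking step that both options rely on more concrete, here is a minimal toy sketch of what I have in mind. It assumes one-to-one ortholog pairs and gene-level counts that have already been summarized per species. The function name, gene IDs, and numbers are all made up for illustration and do not come from any tool's output. It only joins the two tables, and does not deal with normalization or transcript length differences, which is exactly the part I am unsure about.

```python
# Toy sketch: join per-species count tables through 1:1 ortholog pairs.
# All identifiers and counts below are hypothetical.

def combine_by_ortholog(counts_a, counts_b, ortholog_pairs):
    """Return {"geneA|geneB": counts_A + counts_B} for pairs found in both tables."""
    combined = {}
    for gene_a, gene_b in ortholog_pairs:
        # Keep a pair only if both members were quantified.
        if gene_a in counts_a and gene_b in counts_b:
            combined[f"{gene_a}|{gene_b}"] = counts_a[gene_a] + counts_b[gene_b]
    return combined

# Six samples per species: 3 control followed by 3 treated.
counts_a = {
    "A_g1": [10, 12, 11, 30, 28, 33],
    "A_g2": [5, 6, 4, 5, 7, 6],
}
counts_b = {
    "B_t1": [20, 18, 22, 25, 21, 19],
    "B_t9": [0, 1, 0, 2, 1, 0],
}
# Hypothetical 1:1 pairs; "B_t7" has no counts, so its pair is dropped.
ortholog_pairs = [("A_g1", "B_t1"), ("A_g2", "B_t7")]

combined = combine_by_ortholog(counts_a, counts_b, ortholog_pairs)
for pair_id, values in combined.items():
    print(pair_id, values)
```

The resulting table has one row per ortholog pair and twelve columns (six per species). Whether such a table can be normalized jointly is part of my question.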
Thank you very much in advance for all comments and suggestions.