I am doing de novo transcriptome assembly of RNA-Seq data from two closely related diploid species (mammals) in order to identify genetic variation between them. To do this, I suppose I need to identify pairs of orthologous transcripts between the two assemblies so that I can compare them. What is the best way to do this? Should I simply do all pairwise alignments and pick out the pairs that are best matches to each other? Are there tools available for this already?
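To make the "best matches to each other" idea concrete, here is a minimal sketch of what I have in mind. It assumes the pairwise alignments have already been computed by some aligner and reduced to `(query, subject, score)` tuples; that tuple layout, the function names, and the contig IDs are all hypothetical, chosen only for illustration.

```python
def best_hits(hits):
    """Map each query to its single highest-scoring subject.

    `hits` is an iterable of (query, subject, score) tuples.
    On a tied score the first hit seen is kept.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(hits_ab, hits_ba):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(hits_ab)
    best_ba = best_hits(hits_ba)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy scores: species A contigs searched against species B, and the reverse.
hits_ab = [("A1", "B1", 200.0), ("A1", "B2", 150.0),
           ("A2", "B2", 180.0), ("A3", "B1", 90.0)]
hits_ba = [("B1", "A1", 200.0), ("B2", "A2", 180.0), ("B2", "A1", 150.0)]

pairs = reciprocal_best_hits(hits_ab, hits_ba)
print(pairs)
```

In this toy input, A3's best match is B1, but B1's best match is A1, so A3 is left unpaired. Note that the sketch silently keeps the first hit when scores tie.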
Additionally, how does the presence of heterozygous SNPs affect the strategy? I am using Trinity for the transcriptome assembly, and my understanding is that when a transcript has a heterozygous SNP, Trinity will end up reporting two complete contigs that are identical except for the SNP. For example, if the transcript is "TTTTTTTTTT" and there is a heterozygous A/T at position 6, then Trinity would report "TTTTTTTTTT" and "TTTTTATTTT". This could complicate the identification of ortholog pairs by the "mutual best match" strategy described above.
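Here is a toy illustration of the complication I am worried about, using the two contigs from my example. The mismatch count stands in for a real alignment score, and the contig names and the ortholog sequence from the other species are made up for this sketch; it is not meant to reflect how Trinity or any aligner actually scores things.

```python
def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def best_matches(query, candidates):
    """Return the names of all candidates tied for the fewest mismatches."""
    scores = {name: mismatches(query, seq) for name, seq in candidates.items()}
    fewest = min(scores.values())
    return sorted(name for name, score in scores.items() if score == fewest)


# The two allelic contigs from the example (hypothetical names).
species_a = {"contig_ref": "TTTTTTTTTT", "contig_alt": "TTTTTATTTT"}

# Hypothetical ortholog from the other species, differing at position 6.
ortholog_b = "TTTTTGTTTT"

allele_distance = mismatches(species_a["contig_ref"], species_a["contig_alt"])
tied = best_matches(ortholog_b, species_a)
print(allele_distance, tied)
```

Both allelic contigs are equally good matches to the ortholog here, so there is no single "best" match and the mutual best match pairing becomes ambiguous.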