Hey all,
I am new to transcriptomics, so I have some questions. I am currently working on a study whose goal is to compare transcriptomes across 5 species. I mapped all RNA-seq reads to reference genomes (a different one for every species) using HISAT2, then ran StringTie, and got read counts using Salmon. I am having a hard time understanding the best approach for finding differentially expressed genes between species when there are 5 different reference genomes.
What I've done so far is create FASTA files from the BAM and GTF files, run TransDecoder, and then use the filtered longest ORFs to run OrthoFinder with a model organism. I've now created a matrix of all the single-copy orthologues and the corresponding read counts for every individual. So my question is whether this is actually an appropriate way to analyze the data and, if so, how I could go about normalizing the read counts so that the samples are comparable.
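To make the normalization question concrete, here is a minimal sketch of one option I am considering. It is only an assumption on my part, not an established recommendation: a TPM-style rescaling restricted to the single-copy orthologue set, dividing each count by the transcript length in that sample's own species, since orthologue lengths differ between species. The function name `ortholog_tpm`, the orthogroup IDs, and the toy numbers are all hypothetical.

```python
def ortholog_tpm(counts, lengths):
    """TPM-style values computed only over the shared orthologue set.

    counts:  {orthogroup_id: read count} for one sample
    lengths: {orthogroup_id: transcript length in bp} for that sample's species
    """
    # Reads per kilobase, using the species-specific length of each orthologue.
    rates = {og: counts[og] / (lengths[og] / 1000.0) for og in counts}
    total = sum(rates.values())
    if total == 0:
        return {og: 0.0 for og in rates}
    # Rescale so every sample sums to one million over the orthologue set.
    return {og: rate / total * 1e6 for og, rate in rates.items()}


# Toy example (made-up numbers): one sample from each of two species.
counts_sp1 = {"OG1": 100, "OG2": 300}
lengths_sp1 = {"OG1": 1000, "OG2": 3000}
counts_sp2 = {"OG1": 400, "OG2": 400}
lengths_sp2 = {"OG1": 2000, "OG2": 2000}

tpm_sp1 = ortholog_tpm(counts_sp1, lengths_sp1)
tpm_sp2 = ortholog_tpm(counts_sp2, lengths_sp2)
```

This only corrects for length and sequencing depth within the orthologue set; it does not by itself account for composition differences between species, which is part of what I am unsure about.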
Thank you!