Hello,
I am working on gene prediction.
First, I created a GFF file of predicted genes from my RNA-seq data using the HISAT2/StringTie/TransDecoder pipeline.
When I used the resulting transcriptome as a Salmon reference for a different RNA-seq dataset, the mapping rate was 60%. However, I felt that the number of predicted genes was too large for the data.
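To make "number of genes" concrete, here is a minimal sketch of how the count could be checked. It assumes a standard nine-column, tab-separated GFF3 file in which genes are labelled `gene` in the third column; the function name `count_feature_types` and the file name in the usage note are only illustrative, and a given tool's output may use different feature labels.

```python
from collections import Counter


def count_feature_types(gff_lines):
    """Count how often each feature type (column 3) occurs in GFF3 lines."""
    counts = Counter()
    for line in gff_lines:
        line = line.rstrip("\n")
        # An embedded FASTA section, if present, ends the annotation part.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines and comment/directive lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Assumption: valid feature lines have exactly nine columns.
        if len(fields) != 9:
            continue
        counts[fields[2]] += 1
    return counts
```

With a file on disk this could be used as `count_feature_types(open("predicted.gff3"))["gene"]`, where `predicted.gff3` is a placeholder name.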
Next, I ran Salmon on the RNA-seq data using published EST transcriptome data (110,000 entries) as the reference, and got a 90% mapping rate. I find this dataset very attractive.
Therefore, I would like to place the EST data on the genome as completely as possible and use the result as a provisional gene prediction. What would be the best approach in this case?
Here's what I've already tried. I'm evaluating each result with Salmon's mapping rate.
GMAP: 70%, but the number of genes was too large.
Augustus: 20%, but the number of genes was consistent with previous research.
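For reference, a minimal sketch of how the mapping rate can be pulled out of a Salmon run. It assumes the usual output layout, where `aux_info/meta_info.json` inside the quant directory records `num_processed` and `num_mapped`; the function name `salmon_mapping_rate` and the directory name in the usage note are only illustrative.

```python
import json
from pathlib import Path


def salmon_mapping_rate(quant_dir):
    """Return the mapping rate (in percent) for one Salmon output directory."""
    # Assumption: Salmon wrote its run summary to aux_info/meta_info.json.
    meta_path = Path(quant_dir) / "aux_info" / "meta_info.json"
    with open(meta_path) as handle:
        meta = json.load(handle)
    processed = meta["num_processed"]
    if processed == 0:
        return 0.0
    # Fraction of processed fragments that mapped, expressed in percent.
    return 100.0 * meta["num_mapped"] / processed
```

For example, `salmon_mapping_rate("salmon_out_gmap")` would give the value to compare between the GMAP-based and Augustus-based references, where `salmon_out_gmap` is a placeholder directory name.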
Thanks.