For my organism of choice, the sea anemone Exaiptasia, there are two versions of the genome available: the original published genome and the NCBI version.
The NCBI version has about 2000 fewer mRNAs and predicted peptides than the original published genome files. As I understand it, when raw files are uploaded, NCBI runs its Eukaryotic Genome Annotation Pipeline (Splign, ProSplign, Gnomon) and provides its own 'updated' version of the genome. I've also noticed that there are 5 'new' genes which the NCBI version has and the original genome doesn't; I found these by using grep to compare the presence and absence of gene IDs.
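In case it helps anyone reproduce the check, here is a rough Python sketch of the same presence/absence comparison. It is not the grep command I ran. It assumes that both annotations are GFF3 files, that gene features carry an `ID=` attribute, and that the two files use comparable gene IDs. The function names and the file names in the comments are made up for illustration.

```python
import re


def gene_ids(gff_lines):
    """Collect the ID attribute of every 'gene' feature from GFF3 lines."""
    ids = set()
    for line in gff_lines:
        # Skip comments/directives and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        # GFF3 has 9 tab-separated columns; column 3 is the feature type.
        if len(cols) < 9 or cols[2] != "gene":
            continue
        match = re.search(r"(?:^|;)ID=([^;]+)", cols[8])
        if match:
            ids.add(match.group(1))
    return ids


def compare_gene_sets(original_lines, ncbi_lines):
    """Return gene IDs unique to each annotation and those shared by both."""
    original = gene_ids(original_lines)
    ncbi = gene_ids(ncbi_lines)
    return {
        "only_original": original - ncbi,
        "only_ncbi": ncbi - original,
        "shared": original & ncbi,
    }


# Example usage (hypothetical file names):
# with open("original.gff3") as a, open("ncbi.gff3") as b:
#     result = compare_gene_sets(a, b)
# print(len(result["only_original"]), len(result["only_ncbi"]))
```

If the two annotations use different ID schemes, a direct ID comparison like this would report almost everything as unique, so the IDs would need to be mapped to each other first.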
When it comes to RNA-seq alignment, I would assume it's best to use the most up-to-date NCBI version rather than the original published genome. Is that right?
Sorry if this is a 'noob question' to ask.