6.1 years ago
gtasource
I have a species with an unannotated reference FASTA. I have gene coordinates that I pulled from NCBI for very closely related annotated species (each gene is 10,000 - 30,000 bp long). I want to map these genes to the unannotated reference FASTA so that I can have coordinates for this new species. I've been using a combination of BLAST and BLAT, but it's been hard to figure out the exact start and end of each gene in the unannotated species.
Any help would be appreciated!
BLAST/BLAT will give you only approximate (and potentially incorrect) information about the gene coordinates. Given the size of the genes, I take it you are dealing with eukaryotes? Ideally, you should predict genes on the unannotated genome. You can do so using the genes (or proteins) of interest as templates to infer the gene structure in the unannotated genome. GeMoMa can do this, and it is even implemented as a Galaxy workflow.
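To see why the BLAST coordinates are only approximate, here is a minimal Python sketch that collapses the hits for each gene into one rough locus. It assumes the hits are in BLAST's default tabular format (-outfmt 6, 12 tab-separated columns) with the genes as queries and the unannotated genome as subject. The function name approximate_loci is made up for illustration. It simply takes the contig and strand with the highest summed bitscore and reports the outermost hit coordinates, so it will happily merge paralogs or miss unaligned gene ends. Treat the result as a region to inspect or hand to a proper gene predictor, not as final coordinates.

```python
import csv
from collections import defaultdict


def approximate_loci(blast_lines):
    """Return {gene: (contig, start, end, strand)} from BLAST -outfmt 6 lines.

    Rough heuristic only (an assumption of this sketch, not a BLAST feature):
    hits are grouped by gene, contig and strand, the group with the highest
    summed bitscore wins, and its outermost coordinates are reported.
    """
    groups = defaultdict(list)
    for row in csv.reader(blast_lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        qseqid, sseqid = row[0], row[1]
        sstart, send = int(row[8]), int(row[9])
        bitscore = float(row[11])
        # In tabular output, minus-strand hits have sstart > send.
        strand = "+" if sstart <= send else "-"
        groups[(qseqid, sseqid, strand)].append(
            (min(sstart, send), max(sstart, send), bitscore)
        )

    best = {}
    for (gene, contig, strand), hits in groups.items():
        total = sum(h[2] for h in hits)
        if gene not in best or total > best[gene][0]:
            start = min(h[0] for h in hits)
            end = max(h[1] for h in hits)
            best[gene] = (total, contig, start, end, strand)

    # Drop the score, keep (contig, start, end, strand).
    return {gene: value[1:] for gene, value in best.items()}


if __name__ == "__main__":
    # Made-up example hits, for illustration only.
    example = [
        "geneA\tctg1\t95.0\t500\t25\t0\t1\t500\t1000\t1499\t1e-100\t800",
        "geneA\tctg1\t93.0\t400\t28\t0\t9000\t9400\t12000\t12399\t1e-80\t600",
        "geneA\tctg2\t80.0\t100\t20\t0\t1\t100\t50\t149\t1e-5\t90",
    ]
    for gene, locus in approximate_loci(example).items():
        print(gene, *locus, sep="\t")
```

The start and end you get this way are just the edges of the outermost alignments. Diverged UTRs or terminal exons often do not align at all, which is exactly why a homology-based gene predictor is the better route.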