What I've done so far: I have RNA-seq data from a non-model insect (ant) species and have aligned the reads to a genome assembly with STAR. The alignments were fed into BRAKER2 (a wrapper around GeneMark-ET and AUGUSTUS) to predict genes. Using the nucleotide and amino acid sequences that BRAKER output for each predicted gene, I ran BLASTn and BLASTp searches, respectively, to find homologs. I now have a list of genes/proteins homologous to the predicted genes/proteins in my species of interest. In this species there is one linkage group that I particularly wish to study, so I have extracted the relevant information for the genes located in this region.
What I want to do: With this list of homologs, I wish to test whether genes in the region of interest are enriched for particular ontologies, pathways, functions, etc. compared with the rest of the genome. However, the homologs in my list of course come from many different species.
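For concreteness, the region-versus-genome comparison could be framed as a one-sided hypergeometric (over-representation) test per term. Below is a minimal Python sketch of that idea, not a recommended pipeline. It assumes each predicted gene has already been mapped to a set of annotation terms; the gene IDs, the labels `TERM_A`/`TERM_B`, and the function names are all made up for illustration.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def enrichment(gene_to_terms, region_genes):
    """Per-term over-representation test: region genes vs. all genes.

    gene_to_terms: dict mapping gene ID -> set of terms (hypothetical input).
    region_genes:  iterable of gene IDs inside the region of interest.
    Returns {term: (hits in region, hits in genome, p-value)}.
    """
    background = set(gene_to_terms)
    region = set(region_genes) & background
    N, n = len(background), len(region)
    all_terms = set().union(*gene_to_terms.values())
    results = {}
    for term in all_terms:
        carriers = {g for g, terms in gene_to_terms.items() if term in terms}
        K = len(carriers)            # genes with this term genome-wide
        k = len(carriers & region)   # genes with this term in the region
        results[term] = (k, K, hypergeom_sf(k, N, K, n))
    return results


# Toy data (entirely invented): 10 genes, 3 of them in the region.
annotations = {
    "g1": {"TERM_A", "TERM_B"}, "g2": {"TERM_A"}, "g3": {"TERM_A"},
    "g4": {"TERM_B"}, "g5": {"TERM_B"}, "g6": {"TERM_B"},
    "g7": {"TERM_B"}, "g8": {"TERM_B"}, "g9": set(), "g10": set(),
}
region = ["g1", "g2", "g3"]

for term, (k, K, p) in sorted(enrichment(annotations, region).items()):
    print(f"{term}: {k}/{K} annotated genes in region, p = {p:.4f}")
```

Two choices in this sketch would need thought on real data: the background here is every gene in the mapping (including genes with no terms), and the raw p-values are not corrected for testing many terms at once.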
Does anyone have advice on the best methods, software or databases for this? Pointers to similar experiments, suggested pipelines, or things to watch out for would be welcome, as would anything else that would help to further annotate an ant/insect genome.