Gene prediction methods
3 months ago
Pratheep ▴ 130

Dear all,

My gene of interest is well characterized, and its function has been predicted in other organisms. However, this particular gene has not been studied in a protozoan (whole genome sequence available). We have already tried a conserved-sequence, similarity-based search to identify the gene in the protozoan, but it did not show any similarity. Are there any other methods available to predict the gene?

Gene prediction • 383 views

You could try an HMM-based method (e.g., build a profile from known homologs and search with HMMER3) or a remote homology method (e.g., HH-suite).
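One practical detail: a protein profile is searched against protein sequences, so the genome usually has to be translated in all six frames first. Below is a minimal sketch of that step in plain Python. The function names and the `min_len` cutoff are made up for illustration, and it assumes the standard genetic code, which may not hold for every protozoan (some ciliates, for example, reassign stop codons).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
# ASSUMPTION: the organism uses the standard code; swap the table if it does not.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate DNA codon by codon; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_orfs(genome, min_len=50):
    """Yield (strand, frame, peptide) for stop-free stretches >= min_len residues.

    min_len is a hypothetical cutoff, not a recommended value.
    """
    genome = genome.upper()
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            for peptide in translate(seq[frame:]).split("*"):
                if len(peptide) >= min_len:
                    yield strand, frame, peptide
```

The resulting peptides can be written to a FASTA file and used as the target database for `hmmsearch`, with the profile built from an alignment of known homologs using `hmmbuild`. Note that this ignores introns, so a gene with many short exons may still be missed.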

11 weeks ago
benformatics ★ 2.6k

You could also try something like GeneMark http://opal.biology.gatech.edu/GeneMark/ (no idea how far the state-of-the-art has come... this may be outdated).

11 weeks ago

If homology (or, more precisely, similarity) methods don't work, which could very well be the case here (been there personally, actually ;) ), you are left with two options:

• intrinsic methods: use gene prediction models (usually HMM- or IMM-based) to predict genes. You build a training set of correct gene structures, use it to train a model, and then run a gene predictor with that model.
• RNAseq-based: more straightforward than the intrinsic methods, but not as complete. You will only be able to 'predict' genes for which you have actual transcript/RNAseq data.

Of course, you can still combine the two approaches above to arrive at an integrated result.
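To make the idea of combining the two concrete, here is a rough sketch that checks which ab initio predictions are backed by transcript evidence. It is only an illustration: the function names are hypothetical, intervals are assumed to be inclusive `(start, end)` pairs on a single contig, and real pipelines compare exon structures rather than whole-gene spans.

```python
def overlaps(a, b):
    """True if two inclusive (start, end) intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def combine_evidence(predicted, transcripts):
    """Split gene models by whether RNAseq-derived transcripts support them.

    predicted   -- intervals from an intrinsic (ab initio) gene predictor
    transcripts -- intervals from aligned/assembled RNAseq transcripts
    Both are assumed to lie on the same contig.
    """
    supported = [g for g in predicted
                 if any(overlaps(g, t) for t in transcripts)]
    unsupported = [g for g in predicted if g not in supported]
    # Transcribed regions with no gene model: candidates the predictor missed.
    transcript_only = [t for t in transcripts
                       if not any(overlaps(t, g) for g in predicted)]
    return {
        "supported": supported,
        "unsupported": unsupported,
        "transcript_only": transcript_only,
    }
```

Predictions in `unsupported` are not necessarily wrong; the gene may simply not be expressed in the sampled condition.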

UPDATE: I just realized you are apparently only interested in a single (or few) gene(s). The above approaches are more applicable if you want to do a complete genome annotation as they (particularly the intrinsic method) are too labour intensive to do for a limited number of genes (though in the ideal case they should also get predicted in the whole genome approach).

In this case you are indeed better off starting from known homologs. Did you run a tblastn search, for instance, with the protein sequence of your gene as the query against the genome, to verify that the gene is actually present?
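A minimal sketch of how such a tblastn check could be scripted is shown below. The file names and the e-value thresholds are placeholders, not recommendations. The command uses BLAST+'s tabular output (`-outfmt 6`), whose default columns are the twelve listed in `FIELDS`; the parser then keeps the hits below a chosen e-value.

```python
import csv

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def tblastn_command(query_faa, genome_fna, out_tsv, evalue=10.0):
    """Build an argument list for tblastn (protein query vs. nucleotide genome).

    File names are hypothetical placeholders. A permissive e-value is used on
    purpose, so that weak hits show up and can be inspected by hand.
    """
    return ["tblastn", "-query", query_faa, "-subject", genome_fna,
            "-evalue", str(evalue), "-outfmt", "6", "-out", out_tsv]


def parse_hits(handle, max_evalue=1e-3):
    """Read tabular hits, keep those with evalue <= max_evalue, best first."""
    hits = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        hit["evalue"] = float(hit["evalue"])
        hit["bitscore"] = float(hit["bitscore"])
        hit["sstart"], hit["send"] = int(hit["sstart"]), int(hit["send"])
        # Assumed convention: reverse-strand hits are reported with sstart > send.
        hit["strand"] = "+" if hit["sstart"] <= hit["send"] else "-"
        if hit["evalue"] <= max_evalue:
            hits.append(hit)
    return sorted(hits, key=lambda h: h["evalue"])
```

The list returned by `tblastn_command` can be passed to `subprocess.run` once BLAST+ is installed. Several short hits clustered on one contig and strand may indicate a multi-exon gene, so it is worth looking at weak hits before concluding the gene is absent.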