Hello there!
I am developing a small web application, as part of an academic project, to predict exons and introns in a given DNA sequence. It is aimed at eukaryotic sequences, especially human. I have already written a basic exon-finding program that relies on start and stop codons and considers all six possible reading frames. Using BioJava, I have also added BLAST searches (blastn and blastp) of the given sequence against the NCBI database. Now I want to make the program more biologically accurate. Next, I want to add the splice-site criteria (GT at the 5' end of an intron, AG at the 3' end) and predict CpG islands as well. What other analyses could I add to bring my gene predictor up to date? I would appreciate any suggestions from experts in the field on whether this is a good plan for a small website. I am writing it with an MVC framework and Java servlets.
Regards
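For reference, the GT/AG splice-site rule mentioned above could be sketched roughly as follows. This is only a minimal illustration, not the poster's code: the class name `SpliceSiteScanner`, its methods, and the `minLength` cutoff are all hypothetical, and it simply pairs every GT with every downstream AG, so it will over-predict heavily on real sequences.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: names and the minLength parameter are assumptions.
class SpliceSiteScanner {

    /** Returns the 0-based start of every occurrence of a dinucleotide. */
    static List<Integer> findDinucleotide(String dna, String dinucleotide) {
        String seq = dna.toUpperCase();
        String motif = dinucleotide.toUpperCase();
        List<Integer> positions = new ArrayList<>();
        int idx = seq.indexOf(motif);
        while (idx >= 0) {
            positions.add(idx);
            idx = seq.indexOf(motif, idx + 1);
        }
        return positions;
    }

    /**
     * Pairs each GT (candidate donor) with each downstream AG (candidate
     * acceptor). Each result is {start, endExclusive}, so that
     * dna.substring(start, endExclusive) begins with GT and ends with AG.
     */
    static List<int[]> candidateIntrons(String dna, int minLength) {
        List<int[]> introns = new ArrayList<>();
        List<Integer> donors = findDinucleotide(dna, "GT");
        List<Integer> acceptors = findDinucleotide(dna, "AG");
        for (int donor : donors) {
            for (int acceptor : acceptors) {
                int endExclusive = acceptor + 2;
                if (endExclusive - donor >= minLength) {
                    introns.add(new int[] {donor, endExclusive});
                }
            }
        }
        return introns;
    }

    public static void main(String[] args) {
        String dna = "AAGTCCCAGTT";
        for (int[] intron : candidateIntrons(dna, 4)) {
            System.out.println(dna.substring(intron[0], intron[1]));
        }
    }
}
```

The pairing is quadratic in the number of sites, so on long sequences the candidates would need to be filtered, for example by reading-frame compatibility with the exons already found.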
Thanks JC! You have provided very useful sources that I can certainly use. However, I notice that the criteria the authors used on Softberry for CpG islands are the classic ones, and maybe I can implement the slightly modified criteria published in more recent papers.
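Because the thresholds are what differ between the classic and the modified criteria, a CpG island check could keep them as parameters. The sketch below assumes that "classic" means the Gardiner-Garden and Frommer thresholds (length of at least 200 bp, GC fraction of at least 0.5, observed/expected CpG of at least 0.6) and that a stricter variant such as Takai and Jones (500 bp, 0.55, 0.65) is the kind of modification meant; neither assumption is confirmed in this thread. The class and method names are hypothetical.

```java
// Hypothetical helper: class and method names are assumptions.
class CpgWindow {

    /** Fraction of bases in the window that are G or C. */
    static double gcFraction(String window) {
        String seq = window.toUpperCase();
        if (seq.isEmpty()) {
            return 0.0;
        }
        int gc = 0;
        for (int i = 0; i < seq.length(); i++) {
            char base = seq.charAt(i);
            if (base == 'G' || base == 'C') {
                gc++;
            }
        }
        return (double) gc / seq.length();
    }

    /** Observed/expected CpG ratio: (#CG * N) / (#C * #G). */
    static double obsExpCpg(String window) {
        String seq = window.toUpperCase();
        int c = 0, g = 0, cg = 0;
        for (int i = 0; i < seq.length(); i++) {
            char base = seq.charAt(i);
            if (base == 'C') {
                c++;
                if (i + 1 < seq.length() && seq.charAt(i + 1) == 'G') {
                    cg++;
                }
            } else if (base == 'G') {
                g++;
            }
        }
        if (c == 0 || g == 0) {
            return 0.0; // ratio is undefined; treat as "no CpG signal"
        }
        return (double) cg * seq.length() / ((double) c * g);
    }

    /** Applies the three thresholds to one window; all are caller-supplied. */
    static boolean looksLikeIsland(String window, int minLength,
                                   double minGc, double minObsExp) {
        return window.length() >= minLength
                && gcFraction(window) >= minGc
                && obsExpCpg(window) >= minObsExp;
    }

    public static void main(String[] args) {
        String window = "CGCGCGCG";
        System.out.println(gcFraction(window));
        System.out.println(obsExpCpg(window));
        // Classic thresholds (assumed): the toy window is far too short.
        System.out.println(looksLikeIsland(window, 200, 0.5, 0.6));
    }
}
```

A full predictor would slide this window along the sequence and merge overlapping hits; only the per-window test is shown here.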