Question: Finding the Gene Annotation Output from Web BLAST in the XML, etc.?
8.4 years ago by
bjorn.wesen10 wrote:


I'm new to bioinformatics and am still learning how all the public databases interconnect, so please bear with me :)

I have de novo assembled a set of bacterial genomes, located and extracted putative ORFs, and want to do a very simple overview of "my" genomes compared to the RefSeq genomes for this bacterium. I used discontiguous megablast on the NCBI web page to compare all the extracted genes and got a nice result. Using the web interface, I can select the matched genes, look at the alignments, and see putative annotations of matching or overlapping "known" genes. What I mean is descriptions like "flagellar motor protein".

The problem is that when I try to download the results from the web page, I lose the annotations. They are not anywhere in the XML or ASN.1 output, but they obviously are on the web page. So my question is: from which database, and how, did the NCBI BLAST results page extract this information? I want to integrate that step into my pipeline, so I'd rather not use blast2go or some other big program for it, but I could write some Python. There are many different ID numbers associated with the results, but I can't find any single one that maps to this information.
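For reference, the downloaded XML does carry the subject accession and the coordinates of each alignment on the subject, which could serve as the key for a later annotation lookup. The following is a minimal sketch using only the Python standard library. The element names (`Hit_accession`, `Hit_def`, `Hsp_hit-from`, `Hsp_hit-to`) are those of the classic BLAST XML output; the embedded sample record, the accession `NC_000000` and the function name `hit_locations` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up fragment shaped like classic BLAST XML output; all values are illustrative.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_query-def>orf_0001</Iteration_query-def>
      <Iteration_hits>
        <Hit>
          <Hit_id>ref|NC_000000.1|</Hit_id>
          <Hit_def>Example bacterium strain X chromosome, complete genome</Hit_def>
          <Hit_accession>NC_000000</Hit_accession>
          <Hit_hsps>
            <Hsp>
              <Hsp_hit-from>1200</Hsp_hit-from>
              <Hsp_hit-to>2100</Hsp_hit-to>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""


def hit_locations(xml_text):
    """Return (query, accession, definition, start, end) for every HSP.

    Coordinates are normalised so that start <= end, because hits on the
    minus strand are reported with hit-from greater than hit-to.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for iteration in root.iter("Iteration"):
        query = iteration.findtext("Iteration_query-def")
        for hit in iteration.iter("Hit"):
            accession = hit.findtext("Hit_accession")
            definition = hit.findtext("Hit_def")
            for hsp in hit.iter("Hsp"):
                a = int(hsp.findtext("Hsp_hit-from"))
                b = int(hsp.findtext("Hsp_hit-to"))
                rows.append((query, accession, definition, min(a, b), max(a, b)))
    return rows


for row in hit_locations(SAMPLE_XML):
    print(row)
```

Note that for a genomic database the `Hit_def` field is the title of the whole subject sequence (for example a complete genome), not the name of the gene under the alignment, which is consistent with the gene descriptions being absent from the download.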

I'm aware that these annotations should be taken with a huge grain of salt and that they can be misleading, etc. I just want a cursory glance at the data that is more interesting to look at than 1500 base start/stop numbers.

I do have the BLAST+ suite installed locally, as well as the refseq_genomic database. Maybe it is possible to look this information up using some of the blastdb commands?
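One possible way to frame the lookup, offered as a sketch under assumptions rather than a description of what the NCBI page actually does: if the gene descriptions come from features annotated on the subject record (for example CDS features with a product name), then the task reduces to finding the features that overlap the hit interval. As far as I know, `blastdbcmd` returns sequences and definition lines rather than feature tables, so the features would have to be taken from the subject's annotated record instead. The `Feature` type, the `FEATURES` list and the function name below are hypothetical, and the coordinates are invented.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """One annotated feature on a subject sequence (1-based, inclusive)."""
    start: int
    end: int
    product: str


def overlapping_products(features, hit_start, hit_end):
    """Return the product names of all features overlapping the hit interval.

    hit_start may be greater than hit_end (minus-strand hits), so the
    interval is normalised before comparing.
    """
    lo, hi = sorted((hit_start, hit_end))
    return [f.product for f in features if f.start <= hi and f.end >= lo]


# Invented feature table for a single subject sequence.
FEATURES = [
    Feature(100, 950, "hypothetical protein"),
    Feature(1000, 2300, "flagellar motor protein"),
    Feature(2400, 3100, "ABC transporter permease"),
]

print(overlapping_products(FEATURES, 1200, 2100))
# → ['flagellar motor protein']
```

A linear scan like this is fine for a few thousand features per genome; for many genomes and many hits, sorting the features by start position and using a binary search would be the obvious refinement.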


gene annotation blast • 4.1k views
7.6 years ago by
United States
Lhl730 wrote:

This site has the information you want.
