I have a list of GI or accession numbers. Now I want to query the nr database to get XML files for annotation purposes. How do I do this?
The nr database is a collection of protein sequences. Are you interested in extracting the protein sequences associated with your list of GIs? If so, you can use Batch Entrez (http://www.ncbi.nlm.nih.gov/sites/batchentrez). From the drop-down list, select Protein. Upload your list and click Retrieve.
I'm trying to use blast2go to annotate my sequences. Blast2go accepts XML files from a local BLAST against nr. I'm just wondering if it is possible to get those XML files from just the GI numbers.
Hi @grayapply2009
I am still not very clear as to what you want to do. Did you run blast locally? What are the GI or accession numbers you refer to in the first post?
If you are running blast on the command line, you can generate the output as XML using the -outfmt 5 parameter. You can then feed that output to blast2GO.
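As a rough sketch of the -outfmt 5 suggestion, here is one way to assemble and launch that command from Python. The function names, the e-value cutoff, the thread count and the file names are placeholders of mine, not anything from your setup; it also assumes the BLAST+ blastp binary is on your PATH and that you have a local protein database to point -db at.

```python
import subprocess


def build_blast_cmd(query, db, out, program="blastp", evalue="1e-5", threads=4):
    """Return the argument list for a BLAST+ search that writes XML output.

    The e-value cutoff and thread count are illustrative defaults only.
    """
    return [
        program,
        "-query", query,        # FASTA file with your sequences
        "-db", db,              # name/path of the local BLAST database
        "-outfmt", "5",         # 5 = BLAST XML, the format blast2GO reads
        "-out", out,            # where the XML is written
        "-evalue", evalue,
        "-num_threads", str(threads),
    ]


def run_blast(query, db, out):
    """Run the search; raises CalledProcessError if BLAST exits with an error."""
    subprocess.run(build_blast_cmd(query, db, out), check=True)
```

Calling run_blast("my_proteins.fasta", "nr", "my_proteins.xml") would then leave an XML file you can load into blast2GO.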
Also, nr is not a good source of GO annotations. Perhaps start with a well-annotated DB, such as Swiss-Prot or TrEMBL.
If you are annotating transcriptomics data, see the very helpful Trinotate documentation: https://trinotate.github.io/
Search this site for: NCBI efetch
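For what it's worth, a minimal sketch of what an efetch call could look like in Python, using only the standard library. The helper names and the batch size of 200 are my own assumptions. Note that this pulls the protein sequences as FASTA, not BLAST XML, so you would still need to run BLAST on the result to get something blast2GO accepts.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities efetch endpoint
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def read_ids(path):
    """Read one GI or accession per line, skipping blank lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def build_efetch_url(ids, db="protein", rettype="fasta", retmode="text"):
    """Build an efetch URL for a batch of IDs (comma-separated in the id field)."""
    params = {"db": db, "id": ",".join(ids), "rettype": rettype, "retmode": retmode}
    return f"{EFETCH_URL}?{urlencode(params)}"


def fetch_fasta(ids, batch_size=200):
    """Download FASTA records in batches; the batch size is an arbitrary choice."""
    chunks = []
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        with urlopen(build_efetch_url(batch)) as response:
            chunks.append(response.read().decode())
    return "".join(chunks)
```

Something like fetch_fasta(read_ids("ids.txt")) would return the sequences as one FASTA string. For long lists, check NCBI's usage guidelines on request rates before running it.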