Getting species name after doing blastx?
7.8 years ago
seta ★ 1.8k

Hi everybody,

I'm wondering if there is a way to get species names after running blastx. I ran blastx with outfmt 6 (tabular) but forgot to add the "sscinames scomnames" fields to the output format; if I understand correctly, I should have written "6 std sscinames scomnames" as the output format. Since running blastx is really time-consuming, please let me know how I can obtain species names from the existing output.

blast RNA-Seq sequencing alignment • 2.5k views
7.7 years ago
arnstrm ★ 1.8k

If you ran BLAST against NR (or any other NCBI database whose entries carry GI IDs), you can easily get the taxonomy information by looking it up in the NCBI taxonomy dump files. If you are using a custom database, it will be a little more difficult.
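As a rough sketch of the taxonomy dump lookup: names.dmp in the NCBI taxdump archive lists one name per line, with fields separated by tab-pipe-tab (taxid, name, unique name, name class). The function name load_names and the tiny in-memory sample below are only for illustration; on real data you would pass an open handle to the unpacked names.dmp instead.

```python
import io

def load_names(handle):
    """Map taxid -> scientific name and taxid -> common name from names.dmp lines."""
    scientific, common = {}, {}
    for line in handle:
        # Each line ends with "\t|\n"; fields are separated by "\t|\t".
        fields = line.rstrip("\t|\n").split("\t|\t")
        if len(fields) < 4:
            continue
        taxid, name, _unique_name, name_class = fields[:4]
        if name_class == "scientific name":
            scientific[taxid] = name
        elif name_class == "genbank common name":
            common[taxid] = name
    return scientific, common

# Small sample in the names.dmp layout (illustration only, not the full file).
sample = io.StringIO(
    "9606\t|\tHomo sapiens\t|\t\t|\tscientific name\t|\n"
    "9606\t|\thuman\t|\t\t|\tgenbank common name\t|\n"
    "10090\t|\tMus musculus\t|\t\t|\tscientific name\t|\n"
)
scientific, common = load_names(sample)
print(scientific["9606"], "/", common["9606"])
```

Taxids are kept as strings here so they can be compared directly with values read from other text files.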

See this link (second section) for more details.


Thanks a lot for your feedback. I built a local database from protein sequences available in UniProt. Could you please help me extract taxonomy information (species name, common name) from the blastx output (preferably the tabular format) in this situation, where the database is not an NCBI one?

Thanks

7.7 years ago
pawlowac ▴ 80

Blastx results should contain accession numbers. Extract those (with tabular output, I think you can open the file in Excel and copy and paste the accession column) to a text file (one per line) and submit it to Batch Entrez. From there, use the 'Send to' option to download summary files for every accession. http://www.ncbi.nlm.nih.gov/sites/batchentrez
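If you would rather not go through Excel, the accession column can also be pulled out with a few lines of Python. This is only a sketch: it assumes standard outfmt 6, where the subject ID is the second tab-separated column, and the function name subject_ids and the sample IDs (ACC_A, ACC_B) are made up for the example.

```python
import csv
import io

def subject_ids(handle):
    """Return unique subject IDs (column 2 of tabular BLAST output), in first-seen order."""
    seen = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and '#' comment lines (outfmt 7 writes those).
        if len(row) < 2 or row[0].startswith("#"):
            continue
        seen.setdefault(row[1], None)
    return list(seen)

# Three made-up hits; two of them share the same subject.
example = io.StringIO(
    "q1\tACC_A\t98.5\t200\t3\t0\t1\t600\t1\t200\t1e-100\t400\n"
    "q2\tACC_A\t91.0\t150\t13\t0\t1\t450\t5\t154\t1e-70\t290\n"
    "q3\tACC_B\t75.2\t120\t30\t1\t4\t363\t10\t129\t1e-40\t160\n"
)
ids = subject_ids(example)
print("\n".join(ids))  # one ID per line, ready to save as a text file
```

With a real result file you would open it and pass the handle in place of the in-memory example, then write the printed list to a text file for upload.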

7.7 years ago

You can use MEGAN5 to process the blastx result file: http://ab.inf.uni-tuebingen.de/software/megan5/ It can work from GI numbers and displays taxids and scientific names.

Alternatively, if you want to write some scripts, here are the mapping files: ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/ (check the readme files)
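A script along those lines might look like the sketch below. It assumes a tab-separated mapping file with the sequence identifier in one column and the taxid in another; the exact column layout differs between files, so check the readme and adjust key_col and taxid_col. The function name taxids_for and the identifiers in the sample are invented for the example.

```python
import io

def taxids_for(wanted, mapping_handle, key_col=0, taxid_col=1):
    """Stream a large id-to-taxid mapping file and keep only the IDs we need."""
    wanted = set(wanted)
    found = {}
    for line in mapping_handle:
        cols = line.rstrip("\n").split("\t")
        if len(cols) <= max(key_col, taxid_col):
            continue  # malformed or short line
        key = cols[key_col]
        if key in wanted:
            found[key] = cols[taxid_col]
    return found

# Made-up identifiers in an assumed two-column layout (identifier, taxid).
mapping = io.StringIO("111\t9606\n222\t10090\n333\t7227\n")
hits = ["111", "333", "999"]  # IDs taken from the BLAST subject column
id_to_taxid = taxids_for(hits, mapping)
print(id_to_taxid)
```

Reading the mapping line by line and filtering against a set keeps memory use small, which matters because the real mapping files are very large. IDs with no entry (like "999" here) are simply absent from the result.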