I ran a local BLAST search on transcript FASTA sequences with the command below:
blastx -query input/file.fasta -task blastx-fast -db blast-db-nr/nr -out output/file_blast_results.txt -evalue 0.001 -max_target_seqs 1 -num_threads 30 -outfmt '6 qaccver saccver pident length evalue qstart qend sstart send staxid ssciname scomname sblastname' > blast.log 2>&1&
I downloaded the nr database from the NCBI FTP site and indexed it with makeblastdb. As a result I get a list of accession numbers, percent identity, alignment length, etc., but the last description columns are all NA.
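A minimal sketch of how the all-NA columns can be confirmed, assuming the results file is tab-separated in the column order given by -outfmt above and that missing values are written as N/A or NA. The function name empty_columns and the sample rows are hypothetical and only for illustration:

```python
import csv
import io

# Column names copied from the -outfmt string in the command above.
COLUMNS = ("qaccver saccver pident length evalue qstart qend sstart send "
           "staxid ssciname scomname sblastname").split()

# Assumption: missing values are written as "N/A" (or "NA", or left empty).
MISSING = {"N/A", "NA", ""}


def empty_columns(handle):
    """Return the names of columns whose value is missing in every row."""
    seen_value = {name: False for name in COLUMNS}
    for row in csv.reader(handle, delimiter="\t"):
        for name, value in zip(COLUMNS, row):
            if value.strip() not in MISSING:
                seen_value[name] = True
    return [name for name in COLUMNS if not seen_value[name]]


# Made-up rows for illustration only; these are not real results.
sample = io.StringIO(
    "query1\tWP_083411507.1\t98.5\t200\t1e-50\t1\t600\t1\t200\tN/A\tN/A\tN/A\tN/A\n"
    "query2\tCBW15324.1\t87.0\t150\t2e-30\t4\t453\t10\t159\tN/A\tN/A\tN/A\tN/A\n"
)
print(empty_columns(sample))
```

On the real output, the same function could be called on the file opened from output/file_blast_results.txt instead of the in-memory sample.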
My questions are:
How can I get the organism name, protein description, etc. from a list of accession numbers like WP_083411507.1, CBW15324.1?
I see BLAST hits to pig proteins although my experiment includes only bacteria. How can I restrict the search to the prokaryotic part of the nr database?
As a next step I would like to assign GO numbers to the BLAST results. Any idea how to do that?
Many thanks for any suggestions,
PS. The input includes ~3000 nucleotide sequences.