I ran a local blastp against the nr database from NCBI and got 100,000 hits. I sorted out the ones I want to keep in Excel, and I now have a text file with all of their header/description lines. How do I use that file to get the actual sequences from NCBI? This may be a Batch Entrez thing, or it may be the exact opposite... either way, I figured this is a very common problem, but I couldn't find a concrete solution.
Since you already have nr locally, you can pull the sequences straight out of that database with
blastdbcmd, a tool included in the BLAST+ package.
Put the identifiers you are interested in into a text file (one per line, using the accession number) and
pass that file to blastdbcmd with the -entry_batch option.
Your command would look something like this (%f writes each sequence in FASTA format):
blastdbcmd -db /path_to/nr -entry_batch Acc_ID_file -outfmt '%f' -out sequence_file
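If what you have is the full header/description lines rather than bare accessions, they need to be reduced to one accession per line first. Below is a rough sketch of one way to do that with awk. It assumes the accession is the first whitespace-delimited token on each line, optionally preceded by '>' or wrapped in the older gi|...|db|ACC| form; the file name headers.txt and the sample lines in it are made up for illustration, so adjust the parsing to whatever your headers actually look like.

```shell
# Hypothetical sample of saved header lines (assumed format, not real records).
cat > headers.txt <<'EOF'
>WP_000000001.1 hypothetical protein [Escherichia coli]
gi|0000001|ref|NP_000001.1| example protein [Homo sapiens]
WP_000000001.1 hypothetical protein [Escherichia coli]
EOF

# For each line: strip a leading '>', keep the first whitespace-delimited token,
# pull the accession out of legacy 'gi|number|db|ACC|' identifiers,
# and print each accession only once, in order of first appearance.
awk '{
  sub(/^>/, "", $1)
  id = $1
  n = split(id, parts, "|")
  if (n >= 4 && parts[1] == "gi") id = parts[4]
  if (id != "" && !seen[id]++) print id
}' headers.txt > Acc_ID_file
```

The resulting Acc_ID_file can then be passed to -entry_batch in the command above. Removing duplicates up front avoids requesting the same accession more than once.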