How to download fasta sequence of my interest from NCBI?
1
0
Entering edit mode
12 days ago
Kumar ▴ 120

Hi all, I am trying to download the protein FASTA sequences for "HSP listeria monocytogenes" from NCBI. Please suggest how I can download all of these sequences from NCBI using the command line. I tried the following command, but it requires a file of accession numbers, whereas I only have a keyword (HSP listeria monocytogenes) to search with and want to retrieve the matching sequences in FASTA format. Any help would be appreciated.

epost -db protein -input <Accessions file>  | efetch -format fasta  > <Output file>
12 days ago
GenoMax 107k

While this can be done through an NCBI Entrez web search followed by a download, if you still want to use Entrez Direct, then something like the following would work.

$ esearch -db protein -query "HSP AND listeria monocytogenes [orgn]" | efetch -format fasta

Representative result (sequences truncated to save space):

>sp|Q71Z71.1|RS4_LISMF RecName: Full=30S ribosomal protein S4
MARYTGPSWKVSRRLGISLSGTGKELERRPYAPGQHGPTQRKKISEYGLQQAEKQKLRHMYGLTERQFKN

>WP_052960683.1 30S ribosomal protein S4 [Listeria monocytogenes]
MARYTGPSWKVSRRLGISLSGTGKELERRPYAPGQHGPTQRKKISEYGLQQAEKQKLRHMYGLTERQFKN

>WP_031665574.1 30S ribosomal protein S4 [Listeria monocytogenes]
MARYTGPSWKVSRRLGISLSGTGKELERRPYAPGQHGPTQRKKISEYGLQQAEKQKLRHMYGLTERQFKN
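
To keep the results rather than print them to the terminal, the same pipeline can be redirected to a file and the records counted afterwards. The following is a minimal sketch, not part of the original answer: the function names and the output file name hsp_listeria.fasta are hypothetical, and it assumes Entrez Direct is installed and the machine has network access.

```shell
# Hypothetical helper names; only esearch and efetch come from the answer above.

# Search the NCBI protein database and save every matching record as FASTA.
# $1 = Entrez query string, $2 = output file path.
fetch_protein_fasta() {
    esearch -db protein -query "$1" | efetch -format fasta > "$2"
}

# Count FASTA records by counting header lines (those starting with '>').
# $1 = path to a FASTA file.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Example usage (needs Entrez Direct and network access, so left commented out):
# fetch_protein_fasta "HSP AND listeria monocytogenes [orgn]" hsp_listeria.fasta
# count_fasta_records hsp_listeria.fasta
```

Counting the header lines is a quick sanity check that the download finished and returned roughly the number of hits you expected from the search.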

That is great! Thank you for your help.
