Question

How to download specific sets of genes from different taxonomic groups from NCBI?

7.7 years ago

Dramapeach • 0

Hello, I need to download sets of specific genes (their nucleotide sequences) from different taxonomic groups from NCBI. What is the quickest way to do it? I have a list of Gene IDs, but I can't download a FASTA file through Batch Entrez. Thank you in advance.

gene sequence ncbi • 2.5k views

Updated 7.7 years ago by Ramaraj K • 0 • written 7.7 years ago by Dramapeach • 0

score 0 · Answer 1 · 2017-11-06

7.7 years ago

Puli Chandramouli Reddy ▴ 190

Hi,

You can download FASTA files for a list of gene IDs at "https://www.ncbi.nlm.nih.gov/sites/batchentrez". Upload a file containing the list of gene IDs, then use the "Send to" option to save the retrieved records to a file in FASTA format (the default format shown is "Summary", which can be changed to "FASTA").
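If the upload is rejected, the ID file itself is worth checking first. Here is a minimal sketch for tidying it, assuming the upload should hold one purely numeric ID per line. The function names `clean_gene_ids` and `write_id_file` are made up for illustration and are not part of any NCBI tool:

```python
from typing import Iterable, List


def clean_gene_ids(lines: Iterable[str]) -> List[str]:
    """Return unique, purely numeric IDs in first-seen order."""
    seen = set()
    cleaned = []
    for line in lines:
        token = line.strip()
        # Skip blank lines, header text, and anything that is not a plain integer ID
        if token.isdigit() and token not in seen:
            seen.add(token)
            cleaned.append(token)
    return cleaned


def write_id_file(ids: Iterable[str], path: str) -> None:
    """Write one ID per line (assumed upload layout)."""
    with open(path, "w") as out:
        out.write("\n".join(ids) + "\n")
```

For example, `write_id_file(clean_gene_ids(open("raw_ids.txt")), "gene_id_list.txt")` would produce a cleaned file from a hypothetical `raw_ids.txt`.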

score 0 · Answer 2 · 2017-11-07

Hi, you can use a simple Python script to download the nucleotide sequences in FASTA format. First, put all the Entrez Gene IDs in a text (.txt) file, one per line, like this:

11820
351
54226

To keep the script simple, the following code uses the Biopython package.

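from Bio import Entrez

Entrez.email = "you@example.com"  # placeholder: NCBI asks for a contact address

# Read one ID per line; point the path at the directory holding the .txt file
with open("gene_id_list.txt") as id_file:
    gene_ids = [line.strip() for line in id_file if line.strip()]

for gene_id in gene_ids:
    # Caution: db="nucleotide" looks the ID up as a nucleotide UID, not as a Gene ID
    handle = Entrez.efetch(db="nucleotide", id=gene_id, rettype="fasta", retmode="text")
    with open(gene_id + ".fasta", "w") as seq_file:  # one FASTA file per ID
        seq_file.write(handle.read())
    handle.close()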
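
One caveat with the script above: Gene IDs and nucleotide UIDs are separate numbering schemes, so passing a Gene ID straight to the nucleotide database may return an unrelated record. A possible workaround is to map each gene to its linked nucleotide records with ELink first and then fetch those. This is only a sketch under assumptions: the helper names are invented, the e-mail address is a placeholder, and the link name `gene_nuccore_refseqrna` (gene to RefSeq RNA) should be checked against NCBI's list of Entrez links before relying on it.

```python
from typing import List


def linked_nucleotide_ids(elink_result) -> List[str]:
    """Collect the linked UIDs from a parsed ELink result (a list of link sets)."""
    return [
        link["Id"]
        for linkset in elink_result
        for linkset_db in linkset.get("LinkSetDb", [])
        for link in linkset_db["Link"]
    ]


def fetch_gene_fasta(gene_id: str, linkname: str = "gene_nuccore_refseqrna") -> str:
    """Map one Gene ID to nucleotide records via ELink, then fetch them as FASTA text."""
    # Imported here so the helper above can be used without Biopython installed
    from Bio import Entrez

    Entrez.email = "you@example.com"  # placeholder contact address

    # Step 1: gene -> nuccore links (the linkname is an assumption, see lead-in)
    handle = Entrez.elink(dbfrom="gene", db="nuccore", id=gene_id, linkname=linkname)
    nuc_ids = linked_nucleotide_ids(Entrez.read(handle))
    handle.close()
    if not nuc_ids:
        return ""  # no linked nucleotide records were reported

    # Step 2: fetch all linked records in one request
    handle = Entrez.efetch(db="nuccore", id=",".join(nuc_ids), rettype="fasta", retmode="text")
    fasta_text = handle.read()
    handle.close()
    return fasta_text
```

Calling `fetch_gene_fasta("351")` and writing the returned text to `351.fasta` would mirror the one-file-per-gene layout of the original script. A gene can link to several transcripts, so a single file may contain more than one sequence.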