Retrieving a batch of selected sequence regions from an NCBI database
7.8 years ago

Hi,

I was wondering if there is a way to download a batch of selected regions of sequences from GenBank. I have accession numbers for several different contigs, but I need only selected regions from them, not the full sequences. Is there a way to download them in a batch rather than one by one, which would take me ages?

sequence ncbi biopython • 2.7k views

OK, it seems that with EFetch I can still download only one sequence fragment at a time (or I'm just missing something, but I have been trying different options for a while now with no success). I need to download 180 fragments from different contigs, and it is getting a bit silly. I was thinking about using Biopython with its Entrez and SeqIO modules, but I got stuck. The input is a file that looks more or less like this (columns: accession, strand, start, stop):

NW_005871339.1  1   18951   28640
NW_005872753.1  1   4797    13835
NW_005872103.1  1   20997   29747
NW_005878789.1  1   0   5645

This is what I have so far, but I'm not sure it has any chance of working in this form ...

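from Bio import Entrez

Entrez.email = "mail"  # replace with your own e-mail address
db = "nuccore"

# The input is a whitespace-separated table (accession, strand, start, stop),
# not a FASTA file, so it is read line by line instead of with SeqIO.parse.
with open('input.fasta') as regions, open('output.fasta', 'w') as output:
    for line in regions:
        if not line.strip():
            continue  # skip blank lines
        acc, strand, start, stop = line.split()
        # EFetch coordinates are 1-based and the end parameter is seq_stop
        # (not seq_end); treating a start of 0 as 1 is an assumption.
        handle = Entrez.efetch(db=db, id=acc, strand=strand, seq_start=max(int(start), 1), seq_stop=stop, rettype="fasta", retmode="text")
        output.write(handle.read())
        handle.close()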

Great! Thanks for all your help. It seems I need to improve also my search strategy ;)

7.8 years ago
GenoMax 141k

Take a look at Entrez programming utilities from NCBI. Example from the link:

Fetch the first 100 bases of the plus strand of GI 21614549 in FASTA format:

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=21614549&strand=1&seq_start=1&seq_stop=100&rettype=fasta&retmode=text
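For a batch of regions, the same kind of URL can be generated for each row of a table of accessions and coordinates. The sketch below is only an illustration of that idea: the helper names `parse_regions` and `build_efetch_url` are made up for this example, the four-column layout (accession, strand, start, stop) is taken from the input shown in the question, and raising a start of 0 to 1 is an assumption based on E-utilities coordinates being 1-based. It uses the https form of the endpoint.

```python
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def parse_regions(lines):
    """Yield (accession, strand, start, stop) from whitespace-separated lines."""
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        accession, strand, start, stop = fields
        yield accession, int(strand), int(start), int(stop)


def build_efetch_url(accession, strand, start, stop):
    """Build an EFetch URL for one region in FASTA format."""
    params = {
        "db": "nuccore",
        "id": accession,
        "strand": strand,
        "seq_start": max(start, 1),  # assumption: treat a start of 0 as base 1
        "seq_stop": stop,
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


# Two rows copied from the question's example input
example_lines = [
    "NW_005871339.1  1   18951   28640",
    "NW_005878789.1  1   0   5645",
]
urls = [build_efetch_url(*region) for region in parse_regions(example_lines)]
for url in urls:
    print(url)
```

Each URL can then be fetched in turn (for example with `urllib.request.urlopen`) and the responses appended to a single FASTA file, pausing between requests so as not to exceed NCBI's request-rate limits.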


That looks like exactly what I'm looking for. Thank you! :)


If you have access to NCBI's pre-formatted BLAST databases, you can probably use blastdbcmd to do the same thing locally with the -range and -entry options.
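As a rough sketch of how that could look for one region, the snippet below only assembles a blastdbcmd command line and prints it. The function name `blastdbcmd_args` and the database name `my_db` are placeholders invented for this example, and mapping strand 1 to "plus" and anything else to "minus" is an assumption carried over from the EFetch strand convention. Raising a start of 0 to 1 is likewise an assumption.

```python
import shlex


def blastdbcmd_args(db, accession, strand, start, stop):
    """Return the argument list for extracting one region with blastdbcmd."""
    # Assumption: strand 1 means plus, any other value means minus
    strand_name = "plus" if int(strand) == 1 else "minus"
    # Assumption: coordinates are 1-based, so a start of 0 is raised to 1
    region = f"{max(int(start), 1)}-{int(stop)}"
    return [
        "blastdbcmd",
        "-db", db,
        "-entry", accession,
        "-range", region,
        "-strand", strand_name,
    ]


# "my_db" is a placeholder for whichever local BLAST database holds the contigs
args = blastdbcmd_args("my_db", "NW_005872753.1", 1, 4797, 13835)
print(shlex.join(args))
```

The resulting list can be passed to `subprocess.run` once per row of the region table, with the output appended to one FASTA file.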
