Downloading all COI sequences from BOLD database
5.2 years ago
bioinfo ▴ 840

Hi all,

Is there a way to download all COI sequences from the BOLD database (http://www.boldsystems.org/index.php/Public_SearchTerms)? I tried downloading everything through the search button of the "Public Data Portal" without entering a search term, but that returns zero hits.

I also tried the search term "Arthropoda", which returned:

Found 4,791,963 published records,
with 4,791,963 records with sequences,
forming 402,549 BINs (clusters),
with specimens from 243 countries,
deposited in 1,566 institutions.

Of these records, 2,282,392 have species names, and represent 200,331 species.

The results page offers download options at the top right.

Has anyone used a better option, such as the API or the command line, to download all COI sequences?

bold coi barcoding • 4.8k views
5.2 years ago
GenoMax 146k

It looks like the entire data file is available in TSV format here. This is the latest release (Dec 31, 2015).


It seems the TSV file contains only a set of plant sequences (rbcL and some matK, n=1900 sequences) from the 2015 release.


I think an API call such as the one below will do the job. I'm testing it now.

wget "http://v3.boldsystems.org/index.php/API_Public/sequence?marker=COI-5P"

More information about the API is available here: http://v3.boldsystems.org/index.php/resources/api?type=webservices#sequenceParameters
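
A single request for every COI-5P record may be very large, so one option is to split the download by taxon. The following is only a rough sketch, not something tested against the live server: it assumes the `taxon` parameter described on the API page can be combined with `marker`, and the helper names (`bold_seq_url`, `download_taxon`), the taxon list, and the output file names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: fetch COI-5P sequences from the BOLD v3 public API one taxon at a time.
# Assumption: the sequence endpoint accepts taxon= together with marker=.

BOLD_API="http://v3.boldsystems.org/index.php/API_Public/sequence"

# Build the request URL for one taxon (hypothetical helper).
# Note: taxon names containing spaces would need URL-encoding first.
bold_seq_url() {
    local taxon="$1"
    local marker="${2:-COI-5P}"
    printf '%s?taxon=%s&marker=%s\n' "$BOLD_API" "$taxon" "$marker"
}

# Download one taxon into <taxon>_<marker>.fasta (hypothetical helper).
# The URL is quoted so the shell does not interpret "?" or "&".
download_taxon() {
    local taxon="$1"
    local marker="${2:-COI-5P}"
    wget -O "${taxon}_${marker}.fasta" "$(bold_seq_url "$taxon" "$marker")"
}

# Example loop, commented out so that sourcing this file does not hit the network:
# for taxon in Insecta Arachnida Collembola; do
#     download_taxon "$taxon"
# done
```

Splitting by taxon keeps each file smaller and makes it easier to retry a single failed download. It is worth inspecting the output afterwards to confirm that only COI-5P records came back.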


Looks like that would work too.


There are many other releases here. You may need to look through them.
