5.2 years ago
bioinfo
Hi all,
Is there a way to download all COI sequences from the BOLD database (http://www.boldsystems.org/index.php/Public_SearchTerms)? I tried downloading everything with the search button of the "Public Data Portal" without entering a search term, but it returned zero hits.
I also tried the search term "Arthropoda", which returned:
Found 4,791,963 published records,
with 4,791,963 records with sequences,
forming 402,549 BINs (clusters),
with specimens from 243 countries,
deposited in 1,566 institutions.
Of these records, 2,282,392 have species names, and represent 200,331 species.
With download options on the top right.
I was wondering whether any of you have used a better option, such as the API or the command line, to download all COI sequences?
It seems the TSV file contains only a set of plant sequences (rbcL and some matK, n=1900 sequences) from the 2015 release.
I think an API solution like the one below will do the job. I'm testing it now.
More information about the API is available here: http://v3.boldsystems.org/index.php/resources/api?type=webservices#sequenceParameters
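For anyone who wants to script this, here is a minimal Python sketch of the API route. It assumes the v3 sequence endpoint lives at `API_Public/sequence` and accepts `taxon` and `marker` query parameters with `COI-5P` as the marker code for COI; please check those against the API page linked above before relying on them. The function names and the output file name are just illustrative.

```python
import shutil
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint for the BOLD v3 public sequence API (verify in the API docs).
BASE_URL = "http://v3.boldsystems.org/index.php/API_Public/sequence"


def build_sequence_url(taxon, marker="COI-5P"):
    """Build the request URL for one taxon and one marker."""
    query = urlencode({"taxon": taxon, "marker": marker})
    return f"{BASE_URL}?{query}"


def download_fasta(taxon, out_path, marker="COI-5P"):
    """Stream the response to disk so a large result is not held in memory."""
    url = build_sequence_url(taxon, marker)
    with urlopen(url) as response, open(out_path, "wb") as handle:
        shutil.copyfileobj(response, handle)


# Print the URL first to check it in a browser before starting a big download.
print(build_sequence_url("Arthropoda"))

# Example call (not run here because it hits the network):
# download_fasta("Arthropoda", "arthropoda_coi.fasta")
```

A request covering all of Arthropoda is several million records, so it may be more practical to loop over smaller taxa (orders or families, for example) and concatenate the resulting files.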
Looks like that would work too.
There are many other releases here. You may need to look through them.