Download all bacterial and protist protein FASTA sequences from UniProt proteomes
3.3 years ago
Chvatil ▴ 130

Hello everyone, I'm looking for Bash code to download all the protein FASTA sequences for the bacterial and protist proteomes from UniProt Proteomes. Does anyone know how I can do this, please?

uniprot fetch fasta proteome bash • 2.4k views

Not protists but you can download bacterial sequences from this page. Whole genome proteomes for Bacteria are here.


Hello, I downloaded the file https://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/taxonomic_divisions/uniprot_sprot_bacteria.dat.gz and converted the .dat into .fasta using the Python module Bio.SwissProt, but I only get 335,066 bacterial FASTA sequences, even though searching UniProt for taxonomy:bacteria returns up to 151,792,141 bacterial sequences. Do you know why?


You have a better solution provided by Elisabeth Gasteiger below.

You can use seqret from EMBOSS to convert the dat files to fasta. I am not sure why you get a smaller number of entries. Perhaps redundant sequences are represented only once.
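If neither EMBOSS nor Biopython is at hand, the conversion can also be sketched with gzip and awk alone. This is a minimal sketch, not the seqret behaviour: the function name `dat_to_fasta` is hypothetical, the header layout (`>db|accession|entry_name`) is a simplification chosen here, and only the ID, AC, SQ and `//` lines of the flat file are read.

```shell
#!/usr/bin/env bash
# Hypothetical helper: stream a gzipped UniProtKB flat file (.dat.gz) to FASTA.
# Assumption: each entry has an ID line, at least one AC line, an SQ line
# followed by the sequence lines, and ends with a "//" line.
dat_to_fasta() {
    gzip -dc "$1" | awk '
        # ID line: entry name, and Reviewed (Swiss-Prot) or Unreviewed (TrEMBL)
        /^ID / { name = $2; db = ($3 ~ /^Reviewed/) ? "sp" : "tr" }
        # First AC line of the entry: keep the primary accession only
        /^AC / && acc == "" { acc = $2; sub(/;$/, "", acc) }
        # SQ line: write the FASTA header, sequence lines follow
        /^SQ / { printf(">%s|%s|%s\n", db, acc, name); in_seq = 1; next }
        # End of entry: reset state
        /^\/\// { in_seq = 0; acc = ""; next }
        # Sequence lines: strip the spaces between blocks of residues
        in_seq { gsub(/ /, ""); print }
    '
}
```

For example, `dat_to_fasta uniprot_sprot_bacteria.dat.gz > uniprot_sprot_bacteria.fasta` would write one FASTA record per flat-file entry.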


OK, I see. In fact I only downloaded the Swiss-Prot part and not the TrEMBL part. I will check whether the number of entries is right once TrEMBL is included.
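Fetching both parts for one taxonomic division could look roughly like the sketch below. The Swiss-Prot URL is the one quoted earlier in the thread; the TrEMBL file name is an assumption that it follows the same `uniprot_<section>_<division>.dat.gz` pattern, and the function names `division_url` and `download_division` are hypothetical.

```shell
#!/usr/bin/env bash
# Directory of the per-division flat files (taken from the URL quoted in the thread).
BASE_URL="https://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/taxonomic_divisions"

# Print the URL for one section (sprot or trembl) of one division (e.g. bacteria).
# Assumption: TrEMBL files use the same naming pattern as the Swiss-Prot ones.
division_url() {
    printf '%s/uniprot_%s_%s.dat.gz\n' "$BASE_URL" "$1" "$2"
}

# Download both the Swiss-Prot and the TrEMBL file for a division
# into the current directory; stops reporting success on HTTP errors.
download_division() {
    local division="$1" section
    for section in sprot trembl; do
        curl --fail --location --remote-name "$(division_url "$section" "$division")" || return 1
    done
}

# Example (the TrEMBL file is very large):
#   download_division bacteria
```

The functions only run when called, so sourcing the file does not start a download.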

3.3 years ago

This help page on the UniProt website https://www.uniprot.org/help/api_downloading includes a code example to "Download the UniProt reference proteomes for all organisms below a given taxonomy node in compressed FASTA format"


Great, I'll try that one, thanks.


Hi, I used this approach, but in the end I only found 1,335,574 FASTA sequences instead of 151,792,141. Any idea why?

I used the following command: perl perl_test.pl 2 (where perl_test.pl is the code from the UniProt web page)
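To double-check a total like this, the downloaded files can be counted directly. The sketch below assumes the script left gzip-compressed FASTA files in the current directory; the function name `count_fasta_records` is hypothetical. Note that the script fetches reference proteomes only, so a total lower than a UniProtKB-wide search is plausible; that explanation is a possibility, not something confirmed in this thread.

```shell
#!/usr/bin/env bash
# Count FASTA records (header lines starting with ">") across one or more
# gzip-compressed FASTA files given as arguments.
count_fasta_records() {
    gzip -dc "$@" | grep -c '^>'
}

# Example:
#   count_fasta_records *.fasta.gz
```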
