Downloading filtered PDB files
2.6 years ago
Caterina • 0

I want to download 28,000 FASTA files from the PDB. I applied some filters on the resolution, length, etc. of the proteins. However, I cannot download them directly from the website, since it only allows batch downloads of 2,500 sequences. How can I do this download programmatically? Note that I don't have the PDB IDs; I just queried the PDB using some filters.

PDB • 1.2k views
2.6 years ago
Mensur Dlakic ★ 27k

It should not be a problem to download PDB IDs after you are done with selection. There should be a tabular output option where you can choose IDs and save them instead of seeing structures. Once you have the IDs, feed them one by one (or in parallel) as a variable $name to this command:

wget -q -o /dev/null ftp://ftp.ebi.ac.uk/pub/databases/msd/pdb_uncompressed/pdb${name}.ent
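To feed a whole list of IDs through that command, a loop along these lines might work. This is only a sketch: the file name ids.txt and the helper names pdb_url and download_ids are made up for illustration, and it assumes the saved IDs are either one per line or comma-separated, and that the server stores files under lowercase IDs.

```shell
#!/usr/bin/env bash
# Base URL taken from the wget command above.
BASE_URL="ftp://ftp.ebi.ac.uk/pub/databases/msd/pdb_uncompressed"

# Build the download URL for one PDB ID.
# Assumption: file names on the server use lowercase IDs.
pdb_url() {
    local name
    name=$(printf '%s' "$1" | tr -d '[:space:]' | tr '[:upper:]' '[:lower:]')
    printf '%s/pdb%s.ent\n' "$BASE_URL" "$name"
}

# Read IDs from a file (one per line or comma-separated) and fetch each one.
# Failed downloads are reported on stderr and the loop keeps going.
download_ids() {
    local ids_file=$1 name
    tr ',' '\n' < "$ids_file" | while IFS= read -r name || [ -n "$name" ]; do
        name=$(printf '%s' "$name" | tr -d '[:space:]')
        [ -z "$name" ] && continue
        wget -q -o /dev/null "$(pdb_url "$name")" || echo "failed: $name" >&2
    done
}
```

After sourcing the script, calling download_ids ids.txt would fetch the files one by one into the current directory. If the ID list was saved as several files, they can be concatenated first with cat.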

Yes, I thought of doing this. But it doesn't even let me download that many IDs. Am I doing something wrong?

enter image description here


I don't know what exactly you are doing, but I just searched for protein and got more than 46K hits. When I asked for a display of IDs in tabular format, it said that at most 25K IDs can be downloaded at a time. It split the list in two as you can see below. Still, I was able to get all of them as two files.

PS: You may need to right-click on the image and open it in a new tab in order to see things properly.

enter image description here


Thanks! For some reason, if I ask for a custom report with just PDB IDs it only lets me download 2,500, but doing it directly as you did works fine :)
