28 days ago
Caterina • 0

I want to download 28,000 FASTA files from the PDB. I applied filters on the resolution, length, etc. of the proteins. However, I cannot download them directly from the website, since it only allows batch downloads of 2,500 sequences at a time. How can I do this download programmatically? Note that I don't have the PDB IDs; I only queried the PDB using some filters.

PDB • 268 views
28 days ago
Mensur Dlakic ★ 14k

Downloading the PDB IDs once you are done with the selection should not be a problem. There should be a tabular output option that lets you select the IDs and save them instead of viewing the structures. Once you have the IDs, feed them one by one (or in parallel) as a shell variable $name to this command: wget -q -o /dev/null "ftp://ftp.ebi.ac.uk/pub/databases/msd/pdb_uncompressed/pdb${name}.ent"
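For anyone who prefers a script to a shell loop, here is a minimal Python sketch of the same idea. The URL pattern is copied from the wget command above; the function names, the pdb_ids.txt file name, the output folder, and the lowercasing of IDs are assumptions made for illustration, not something the PDB or EBI documents in this thread.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# URL pattern taken from the wget command in this answer.
# Lowercasing the ID is an assumption about the archive's file naming.
URL_TEMPLATE = "ftp://ftp.ebi.ac.uk/pub/databases/msd/pdb_uncompressed/pdb{pdb_id}.ent"


def build_url(pdb_id):
    """Return the download URL for one PDB ID."""
    return URL_TEMPLATE.format(pdb_id=pdb_id.strip().lower())


def read_ids(path):
    """Read IDs from a text file; commas and any whitespace act as separators."""
    text = Path(path).read_text()
    return text.replace(",", " ").split()


def download_one(pdb_id, out_dir):
    """Download one entry; return (ID, error message), with None meaning success."""
    target = Path(out_dir) / f"pdb{pdb_id.strip().lower()}.ent"
    if target.exists():  # already fetched, so the script can be re-run safely
        return pdb_id, None
    tmp = target.with_name(target.name + ".part")
    try:
        # Download to a temporary name first so a failed transfer
        # is never mistaken for a complete file on the next run.
        urllib.request.urlretrieve(build_url(pdb_id), str(tmp))
        tmp.replace(target)
        return pdb_id, None
    except OSError as exc:  # urllib.error.URLError is a subclass of OSError
        return pdb_id, str(exc)


def download_all(ids, out_dir="pdb_files", workers=4):
    """Download all IDs with a small thread pool; return the list of failures."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda i: download_one(i, out_dir), ids))
    return [(pdb_id, err) for pdb_id, err in results if err is not None]
```

Calling download_all(read_ids("pdb_ids.txt")) would fetch every ID in the file and return the ones that failed, so they can be retried. A small number of workers is probably the polite choice for a public server.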

Yes, I thought of doing this. But it doesn't even let me download that many IDs. Am I doing something wrong?

I don't know exactly what you are doing, but I just searched for "protein" and got more than 46K hits. When I asked for the IDs to be displayed in tabular format, it said that at most 25K IDs can be downloaded at a time. It split the list in two, as you can see below. Still, I was able to get all of them as two files.

PS: You may need to right-click on the image and open it in a new tab in order to see things properly.
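If the IDs arrive as several files, as in the two-file case described here, they can be merged and re-split before downloading. The following is only a sketch: the function names are hypothetical, and the batch size of 2,500 in the usage note is simply the limit mentioned in the question.

```python
def merge_ids(*id_lists):
    """Merge ID lists, normalising case and dropping duplicates while keeping order."""
    seen = set()
    merged = []
    for ids in id_lists:
        for raw in ids:
            pdb_id = raw.strip().upper()
            if pdb_id and pdb_id not in seen:
                seen.add(pdb_id)
                merged.append(pdb_id)
    return merged


def batches(ids, size):
    """Split a list into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [ids[i:i + size] for i in range(0, len(ids), size)]
```

For example, batches(merge_ids(first_file_ids, second_file_ids), 2500) would give lists small enough to paste into a batch download form, where first_file_ids and second_file_ids stand for the IDs read from each saved file.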

Thanks! For some reason, if I ask for a custom report with just PDB IDs, it only lets me download 2,500, but doing it directly as you did works fine :) Thanks!
