Downloading PDB files from rcsb.org
Hello everyone
I am working on molecular docking and I want to download structures in PDB format based on a search (protein name, organism name) on rcsb.org. Is there a way to do this, and if so, how?
Thanks for any help
rcsb
pdb
1.6k views • written 6 months ago by iamsmor • updated 8 weeks ago by Ram
I'm going to build off of OP's query and give them a simple script:
# URL-encode spaces in the organism name; the gene symbol is used as-is
organism=$(echo "$1" | sed 's/ /%20/g')
gene=$2
curl -s https://search.rcsb.org/rcsbsearch/v2/query\?json\=%7B%22query%22%3A%7B%22type%22%3A%22group%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22group%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22terminal%22%2C%22service%22%3A%22text%22%2C%22parameters%22%3A%7B%22attribute%22%3A%22rcsb_entity_source_organism.rcsb_gene_name.value%22%2C%22negation%22%3Afalse%2C%22operator%22%3A%22exact_match%22%2C%22value%22%3A%22$gene%22%7D%7D%2C%7B%22type%22%3A%22group%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22group%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22terminal%22%2C%22service%22%3A%22text%22%2C%22parameters%22%3A%7B%22attribute%22%3A%22rcsb_entity_source_organism.ncbi_scientific_name%22%2C%22value%22%3A%22$organism%22%2C%22operator%22%3A%22exact_match%22%7D%7D%5D%2C%22logical_operator%22%3A%22or%22%2C%22label%22%3A%22rcsb_entity_source_organism.ncbi_scientific_name%22%7D%5D%2C%22logical_operator%22%3A%22and%22%7D%5D%2C%22logical_operator%22%3A%22and%22%2C%22label%22%3A%22text%22%7D%5D%2C%22logical_operator%22%3A%22and%22%7D%2C%22return_type%22%3A%22entry%22%2C%22request_options%22%3A%7B%22paginate%22%3A%7B%22start%22%3A0%2C%22rows%22%3A250%7D%2C%22results_content_type%22%3A%5B%22experimental%22%5D%2C%22sort%22%3A%5B%7B%22sort_by%22%3A%22score%22%2C%22direction%22%3A%22desc%22%7D%5D%2C%22scoring_strategy%22%3A%22combined%22%7D%2C%22request_info%22%3A%7B%22query_id%22%3A%2280f5cb00127713554e0dd5ce36ae71bd%22%7D%7D | grep identifier | cut -d: -f2 | tr -d ' ",'
Save it as get_my_data.bash and then run it as
bash get_my_data.bash "Homo sapiens" AHR
Remember to provide the species in double quotes as it is a multi-word argument.
Sample runs:
$ bash get_my_data.bash "Homo sapiens" AHR
5NJ8
5V0L
7ZUB
8QMO
$ bash get_my_data.bash "Homo sapiens" TP53
1DT7
1JSP
1KZY
1MA3
1XQH
1YC5
1YCQ
1YCR
2B3G
2FEJ
2FOJ
2FOO
2GS0
2H2D
2H2F
2H4F
2H4H
2H4J
2H59
2K8F
2LY4
2MEJ
2MZD
2PCX
2RUK
..
..
..
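The script above prints entry IDs only, while the original question was about downloading the files. A minimal sketch of that last step follows. It assumes coordinate files are served at https://files.rcsb.org/download/<ID>.pdb (see the file download services documentation); the function names pdb_url and download_pdbs are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: fetch PDB-format files for entry IDs read from stdin (one ID per line).
# Assumption: files are served at https://files.rcsb.org/download/<ID>.pdb

# Print the assumed download URL for a single entry ID.
pdb_url() {
    printf 'https://files.rcsb.org/download/%s.pdb\n' "$1"
}

# Download every ID from stdin into the directory given as $1 (default: current dir).
download_pdbs() {
    local outdir=${1:-.} id
    mkdir -p "$outdir"
    while IFS= read -r id; do
        [ -n "$id" ] || continue          # skip blank lines
        # -f makes curl fail on HTTP errors instead of saving an error page
        curl -sSf -o "$outdir/$id.pdb" "$(pdb_url "$id")" \
            || echo "failed: $id" >&2
    done
}
```

After sourcing these functions, something like `bash get_my_data.bash "Homo sapiens" AHR | download_pdbs ahr_structures` should place one file per entry in ahr_structures/. Failures are reported on stderr rather than aborting the loop, since an entry may not be available in PDB format.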
Download services for PDB are described on this page: https://www.rcsb.org/docs/programmatic-access/file-download-services
It could be as simple as grabbing a file with curl/wget, using https://files.rcsb.org/view/4hhb.pdb as an example (4hhb being the PDB accession).
Thank you very much. I did look there, but what I want is something driven by a search, like this query: Gene Name = "AHR" AND Scientific Name of the Source Organism = "Homo sapiens". I would like to use something like Biopython, or a custom script, to automate the download process.
PDB has a search API: https://search.rcsb.org/#search-example-1
Here's the JSON from your search query:
Compare the JSON shown there with your example query to construct a custom JSON, then call the API with that JSON.
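As a rough illustration of building such a custom JSON, the sketch below writes the request in readable form and lets curl do the URL-encoding. The two attribute names are taken from the script earlier in this thread. The simplified single-group layout and the function names build_query and search_pdb are assumptions for this example, not the exact request the website generates.

```shell
#!/usr/bin/env bash
# Sketch: send a readable JSON request to the search API instead of a pre-encoded URL.
# Assumption: a single "and" group with two terminal nodes is accepted by the API.

# Emit the JSON for "gene name AND source organism" (both exact matches).
# Note: values containing double quotes or backslashes would need JSON escaping.
build_query() {
    local organism=$1 gene=$2
    cat <<EOF
{
  "query": {
    "type": "group",
    "logical_operator": "and",
    "nodes": [
      {"type": "terminal", "service": "text", "parameters": {
        "attribute": "rcsb_entity_source_organism.rcsb_gene_name.value",
        "operator": "exact_match", "value": "$gene"}},
      {"type": "terminal", "service": "text", "parameters": {
        "attribute": "rcsb_entity_source_organism.ncbi_scientific_name",
        "operator": "exact_match", "value": "$organism"}}
    ]
  },
  "return_type": "entry",
  "request_options": {"paginate": {"start": 0, "rows": 250}}
}
EOF
}

# -G sends the data as a GET query string; --data-urlencode encodes the JSON value.
search_pdb() {
    curl -sG 'https://search.rcsb.org/rcsbsearch/v2/query' \
        --data-urlencode "json=$(build_query "$1" "$2")"
}
```

Calling `search_pdb "Homo sapiens" AHR` should return the raw JSON response, which can be filtered for the identifier fields in the same way as the script above does. Keeping the query in a here-document makes it easier to add or change search terms than editing a percent-encoded URL by hand.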
You can use the "Advanced query" builder (https://www.rcsb.org/search/advanced ) to create a query like:
https://www.rcsb.org/search?request=%7B%22query%22%3A%7B%22type%22%3A%22group%22%2C%22logical_operator%22%3A%22and%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22group%22%2C%22logical_operator%22%3A%22and%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22group%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22terminal%22%2C%22service%22%3A%22text%22%2C%22parameters%22%3A%7B%22attribute%22%3A%22rcsb_entity_source_organism.taxonomy_lineage.name%22%2C%22operator%22%3A%22exact_match%22%2C%22negation%22%3Afalse%2C%22value%22%3A%22Homo%20sapiens%22%7D%7D%2C%7B%22type%22%3A%22terminal%22%2C%22service%22%3A%22text%22%2C%22parameters%22%3A%7B%22attribute%22%3A%22rcsb_entity_source_organism.rcsb_gene_name.value%22%2C%22operator%22%3A%22exact_match%22%2C%22negation%22%3Afalse%2C%22value%22%3A%22AHR%22%7D%7D%5D%2C%22logical_operator%22%3A%22and%22%7D%5D%2C%22label%22%3A%22text%22%7D%5D%7D%2C%22return_type%22%3A%22entry%22%2C%22request_options%22%3A%7B%22paginate%22%3A%7B%22start%22%3A0%2C%22rows%22%3A25%7D%2C%22results_content_type%22%3A%5B%22experimental%22%5D%2C%22sort%22%3A%5B%7B%22sort_by%22%3A%22score%22%2C%22direction%22%3A%22desc%22%7D%5D%2C%22scoring_strategy%22%3A%22combined%22%7D%2C%22request_info%22%3A%7B%22query_id%22%3A%2296ab84f1e1ba146fc2d50034b746143e%22%7D%7D
That's how they seem to have written their query. Automating that is a bit of a pain, though, as it takes a fairly convoluted JSON as input.
For a non-programmer, using the search builder link included above may be the best option, though even that is not very user-friendly.
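One way to make the JSON behind such a search URL easier to read is to percent-decode it. The helper below is a small bash sketch; the name urldecode is made up for this example, and it assumes the input contains no literal backslashes.

```shell
#!/usr/bin/env bash
# Sketch: decode a percent-encoded string so the embedded JSON becomes readable.

urldecode() {
    # Rewrite each %XX as \xXX, then let printf's %b expand the escapes.
    printf '%b\n' "${1//%/\\x}"
}
```

With the long search URL stored in a variable named url, `urldecode "${url#*request=}"` prints the JSON that follows `request=`. That output can then be compared with the examples in the search API documentation and edited before being sent back to the API.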