Question: How To Download Atomic Coordinates For A Specific Chain From A Pdb File?
0
gravatar for edweena
7.4 years ago by
edweena10
edweena10 wrote:

Hi,

What is the easiest way to download atomic coordinates for specific chains from multiple PDB files?

pdb • 4.6k views
ADD COMMENTlink modified 10 months ago by M.O.L.S10 • written 7.4 years ago by edweena10
1
gravatar for Pappu
7.4 years ago by
Pappu1.9k
Pappu1.9k wrote:

The following should work fine. You can use Biopython for more flexibility.

for i in 3EML 2VTP 2VEZ 
do 
wget -c "http://www.pdb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId="$i -O $i.pdb
grep  ATOM $i.pdb | grep ' A  ' > $i\_A.pdb
done
ADD COMMENTlink written 7.4 years ago by Pappu1.9k

Thank you for your answer. Using your code, I got the atomic coordinates of A chain. If I need to get B chain from the first PDB file, C chain from the second one, how to change your code? sorry, I am new to perl. Could you give the name of the biopython module?

ADD REPLYlink written 7.4 years ago by edweena10

Replace A with B and so on. Read bash and grep manual.

ADD REPLYlink written 7.4 years ago by Pappu1.9k
1
gravatar for Woa
5.7 years ago by
Woa2.7k
United States
Woa2.7k wrote:

It is better IMHO to download the entire PDB file, containing all the chain co-ordinates and then parse using some dedicated PDB parsers of Biopython, Bioperl or R (Bio-3d package) rather than any regex type parsers.

ADD COMMENTlink written 5.7 years ago by Woa2.7k
0
gravatar for Hamish
5.7 years ago by
Hamish3.1k
UK
Hamish3.1k wrote:

Assuming that you are just interested in the ATOM records then you will need to take into account the structure of the ATOM records. These are not structured as delimited fields, but are instead position based, thus the identification of the specific chain records must look at a specific position in the record to get the correct ATOM records.

Something like the following Perl script will do the job:

#!/usr/bin/env perl

# Enable Perl warnings
use strict;
use warnings;

# Load modules.
use File::Basename;
use Getopt::Long qw(:config no_ignore_case bundling);
use LWP;

### Default parameters.
# Base URL for dbfetch service used to fetch PDB entries.
my $dbfetchBaseUrl = 'http://www.ebi.ac.uk/Tools/dbfetch/dbfetch';
# PDB entry identifier.
my $pdbEntryIdListStr = '3E3Q,3E3Q,3EML';
# PDB chain identifier.
my $pdbChainIdListStr = 'A,a,A';

my $scriptName = basename( $0, () );
my $usage = <<EOF Extract="" ATOM="" records="" for="" a="" specified="" chain="" from="" a="" PDB="" entry.="" Usage:="" $scriptName="" -h="" $scriptName="" [-e="" pdbId]="" [-c="" chainId]="" -e,="" --entry="" PDB="" entry="" identifier="" list.="" [$pdbEntryIdListStr]="" -c,="" --chain="" PDB="" chain="" identifier="" list="" (case="" sensitive).="" [$pdbChainIdListStr]="" The="" list="" of="" entry="" Ids="" and="" chains="" Ids="" are="" assumed="" to="" be="" parallel,="" so="" to="" get="" the="" ATOM="" records="" for="" 3E3Q="" chain="" 'A',="" 3E3Q="" chain="" 'a'="" and="" 3EML="" chain="" 'A'="" the="" parameters="" would="" be:="" $scriptName="" -e="" 3E3Q,3E3Q,3EML="" -c="" A,a,A="" -h,="" --help="" Help="" usage="" message.="" EOF="" ;="" my="" $helpFlag;="" GetOptions(="" 'entry|e="s'" ==""> \$pdbEntryIdListStr,
    'chain|c=s' => \$pdbChainIdListStr,
    'help|h' => \$helpFlag,
    );

if($helpFlag) {
    print $usage;
    exit(0);
}

my @pdbEntryIdList = split(/[ ,]+/, $pdbEntryIdListStr);
my @pdbChainIdList = split(/[ ,]+/, $pdbChainIdListStr);
if(scalar(@pdbEntryIdList) != scalar(@pdbChainIdList)) {
    print STDERR @pdbEntryIdList;
    print STDERR @pdbChainIdList;
    die 'PDB entry Id list and PDB chain Id list lengths differ.';
}
for(my $i = 0; $i < scalar(@pdbChainIdList); $i++) {
    my $pdbEntryId = $pdbEntryIdList[$i];
    my $pdbChainId = $pdbChainIdList[$i];
    # Get the entry.
    my $pdbEntry = &fetchEntry('pdb', $pdbEntryId);
    if($pdbEntry) {
        # Extract 'ATOM' records.
        my @records = split(/\n+/, $pdbEntry);
        my @header_records = grep(/^HEADER/, @records);
        my @atom_records = grep(/^ATOM  .{14} $pdbChainId/, @records);
        print @header_records;
        print @atom_records;
    }
}

sub fetchEntry {
    my $dbName = shift;
    my $entryId = shift;
    # Create an LWP UserAgent for making HTTP calls.
    my $ua = LWP::UserAgent->new();
    # Configure HTTP proxy support from environment.
    $ua->env_proxy;

    my $requestUrl = $dbfetchBaseUrl . '/' . $dbName . '/' . $entryId . '?style=raw';
    print STDERR $requestUrl, "\n";
    my $response = $ua->get($requestUrl);
    if($response->is_error) {
        die 'http status: ' . $response->code . ' ' . $response->message;
    }
    my $content = $response->content();
    if($content =~ m/^ERROR/) {
        warn $content;
    }
    return $content;
}
ADD COMMENTlink written 5.7 years ago by Hamish3.1k

Can you put comments in that?

ADD REPLYlink written 5 months ago by M.O.L.S10
0
gravatar for M.O.L.S
10 months ago by
M.O.L.S10
M.O.L.S10 wrote:

First, Specify which chain you want to look at.

Different PDB files may have different numbers of chains.

Use the chain methods to get the atoms and atomic coordinates.

If you are looking at multiple PDB files you have to find a way to wrap that together in a for loop.

ADD COMMENTlink modified 5 months ago • written 10 months ago by M.O.L.S10
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 880 users visited in the last hour