Question: Retrieving Swissprot FT details for a particular amino acid position using Python
Asked 4.4 years ago by a.gardner (United Kingdom):

Hi,

I am new so my apologies if I phrase things incorrectly.

I am trying to extract information from a Swissprot file for a mutation. For example, I need to know which domain the mutated residue lies in and its secondary structure.

The input variables are: acc_number (the Swissprot accession code), wild_aa (the wild-type amino acid), position (the residue number within the protein), and mutant_aa (the replacement amino acid).

I have got as far as retrieving the features using 

for feature in record.features:
    print(feature)

How do I extract the secondary structure information from this to determine whether my amino acid is in a strand, helix, turn or random coil? (If no secondary structure is recorded, I would report none.)

My full code so far is below. As you will guess, I am a total beginner with Python and Biopython (in fact, with programming!):

Thanks in advance

#!/usr/bin/env python

import time
import sys # this module provides access to the input variables
import os
from Bio import ExPASy     # allows a Swiss-Prot file to be opened
                           # over the internet using the accession number
from Bio import SwissProt  # allows the file to be read

# This section receives the parameters from user input via the website:
# This will be commented out during the development period and temp. 
# variables will be used.

# acc_number = sys.argv[1]
# wild_aa = sys.argv[2]
# position = int(sys.argv[3])
# mutant_aa = sys.argv[4]

#Temp variables for developing:

acc_number = 'P01308'
wild_aa = 'L'
position = 43  # an integer, so it can be compared with feature positions
mutant_aa = 'P'

# next step is to retrieve the text file from swissprot to parse.
# this uses the acc_number variable:
handle = ExPASy.get_sprot_raw(acc_number)

# this reads the swissprot file:
record = SwissProt.read(handle)

# test to see if record has been retrieved:
print(record.description)

# next section will parse the sequence information using the position variable
# will determine the secondary structure location of the mutation

# obtaining the sequence and placing it in a variable
sequence = record.sequence
# print(sequence)

# accessing the secondary structure and domain information from FT lines
for feature in record.features:
    print(feature)

# Check that the wild amino acid is correct

 

Answer by RamRS (Houston, TX), 4.4 years ago:
The documentation describes each feature as a tuple of (key name, from, to, description). To see what a record looks like in plain text, check this out: http://www.uniprot.org/uniprot/P01308.txt

I've never used Bio.SwissProt.Record before, but you should probably be able to take feature[0] (the key name) for every feature where feature[1] <= position and feature[2] >= position.

That is the logic; implementing it in Python should not be a problem.
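The range test described above can be sketched as follows. This is only an illustration, not tested against a real record: it assumes features are plain (key name, from, to, description) tuples as the documentation linked below describes, the helper names are hypothetical, and the sample features are invented rather than taken from P01308. Newer Biopython releases may expose features as objects rather than tuples, in which case the indexing would need adapting.

```python
# Sketch only: features are assumed to be (key, from, to, description) tuples.
# All function names and the sample data below are made up for illustration.

SECONDARY_STRUCTURE_KEYS = {"HELIX", "STRAND", "TURN"}


def features_at(features, position):
    """Return the features whose 1-based [from, to] range contains position."""
    position = int(position)  # accept '43' as well as 43
    hits = []
    for feature in features:
        start, end = feature[1], feature[2]
        try:
            start, end = int(start), int(end)
        except (TypeError, ValueError):
            # Uncertain endpoints such as '?' or '<1' are skipped in this sketch.
            continue
        if start <= position <= end:
            hits.append(feature)
    return hits


def secondary_structure_at(features, position):
    """Return 'HELIX', 'STRAND' or 'TURN', or 'none' if nothing is annotated."""
    for feature in features_at(features, position):
        if feature[0] in SECONDARY_STRUCTURE_KEYS:
            return feature[0]
    return "none"


def domains_at(features, position):
    """Return the descriptions of all DOMAIN features covering position."""
    return [f[3] for f in features_at(features, position) if f[0] == "DOMAIN"]


def residue_matches(sequence, position, wild_aa):
    """Check that the 1-based position in sequence holds the wild-type residue."""
    position = int(position)
    return 1 <= position <= len(sequence) and sequence[position - 1] == wild_aa


# Invented sample data, not real annotations.
example_features = [
    ("DOMAIN", 10, 60, "Example domain"),
    ("HELIX", 35, 48, ""),
    ("STRAND", 50, 55, ""),
    ("CONFLICT", "?", 20, "uncertain start"),
]

print(secondary_structure_at(example_features, "43"))
print(domains_at(example_features, 43))
print(secondary_structure_at(example_features, 5))
```

With a real record, record.features would take the place of example_features, and record.sequence could be passed to residue_matches to confirm the wild-type amino acid before looking anything up. Positions in the FT lines are 1-based, which is why the sequence lookup subtracts one.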

References:

1. http://www.uniprot.org/uniprot/P01308.txt

2. http://biopython.org/DIST/docs/api/Bio.SwissProt.Record-class.html

3. http://www.tutorialspoint.com/python/python_tuples.htm


Thank you, it works (though I feel a total numpty!)

(reply by a.gardner, 4.4 years ago)
