Question: Retrieving Swissprot FT details for a particular amino acid position using Python
4.4 years ago, a.gardner (United Kingdom) wrote:


I am new so my apologies if I phrase things incorrectly.

I am trying to extract information from a Swiss-Prot file for a mutation. For example, I need to know which domain the mutated residue lies in and its secondary structure.

The input variables are: acc_number (the Swiss-Prot accession code), wild_aa (the wild-type amino acid), position (the residue number in the protein) and mutant_aa (the replacement amino acid).

I have got as far as retrieving the features using:

for feature in record.features:
    print(feature)

How do I extract the secondary structure information from this to determine whether my amino acid is in a strand, helix, turn or random coil? (If no secondary structure is recorded, I would report none.)

My full code to date is below. As you will guess, I am a total beginner with Python and Biopython (in fact, with programming!).

Thanks in advance

#!/usr/bin/env python

import time
import sys # this module provides access to the input variables
import os
from Bio import ExPASy # this will allow a Swiss-Prot file to be opened
                       # over the internet using the accession number.
from Bio import SwissProt #this will allow the file to be read.

# This section receives the parameters from user input via the website:
# This will be commented out during the development period and temp. 
# variables will be used.

# acc_number = sys.argv[1]
# wild_aa = sys.argv[2]
# position = sys.argv[3]
# mutant_aa = sys.argv[4]

#Temp variables for developing:

acc_number = 'P01308'
wild_aa = 'L'
position = '43'
mutant_aa = 'P'

# next step is to retrieve the text file from swissprot to parse.
# this uses the acc_number variable:
handle = ExPASy.get_sprot_raw(acc_number)

# this reads the swissprot file:
record = SwissProt.read(handle)

# test to see if record has been retrieved:
print(record.description)

# next section will parse the sequence information using the position variable
# will determine the secondary structure location of the mutation

#obtaining sequence and placing it in a variable.
sequence = record.sequence
#print sequence

# accessing the secondary structure and domain information from FT lines
for feature in record.features:
    print(feature)

# Check that the wild amino acid is correct


4.4 years ago, RamRS (Houston, TX) wrote:

The documentation describes each feature as a tuple of (key name, from, to, description). To see what a record looks like in plain text, check this out:

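I've never used Bio.SwissProt.Record before, but you should be able to get feature[0] for every feature where feature[1] <= position and feature[2] >= position. Note that position has to be an integer for this comparison, and your script currently stores it as the string '43'.

That is the logic; implementing it in Python should not be a problem.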
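A minimal sketch of that logic, for what it is worth. It assumes each feature really is a (key, start, end, description) tuple with 1-based, inclusive integer positions, which is how the documentation I looked at describes it; newer Biopython releases may expose features as objects rather than tuples, in which case the indexing would need adapting. The function names, the set of secondary structure keys and the example feature table are all made up for illustration and are not the real annotation of P01308.

```python
# Sketch only: assumes each feature is a (key, start, end, description) tuple
# with 1-based, inclusive positions. Adapt the indexing if your Biopython
# version returns feature objects instead of tuples.

# Assumed FT keys for secondary structure; anything else counts as "none".
SECONDARY_STRUCTURE_KEYS = ("HELIX", "STRAND", "TURN")


def features_at(features, position):
    """Return every feature tuple whose range covers the given position."""
    position = int(position)  # the script in the question stores '43' as a string
    hits = []
    for feature in features:
        start, end = feature[1], feature[2]
        # Skip boundaries that are not plain integers (e.g. uncertain positions).
        if not (isinstance(start, int) and isinstance(end, int)):
            continue
        if start <= position <= end:
            hits.append(feature)
    return hits


def secondary_structure_at(features, position):
    """Return 'HELIX', 'STRAND' or 'TURN' for the position, else 'none'."""
    for feature in features_at(features, position):
        if feature[0] in SECONDARY_STRUCTURE_KEYS:
            return feature[0]
    return "none"


# Invented feature table for illustration; NOT the real annotation of P01308.
example_features = [
    ("SIGNAL", 1, 24, ""),
    ("DOMAIN", 30, 60, "Hypothetical domain"),
    ("HELIX", 35, 46, ""),
    ("STRAND", 48, 50, ""),
]

print(secondary_structure_at(example_features, "43"))
print([f[0] for f in features_at(example_features, "43")])
```

With a real record you would pass record.features instead of example_features. Returning all overlapping features from features_at lets the same helper answer both the domain question and the secondary structure question.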






Thank you, it works (though I feel a total numpty!)

Reply written 4.4 years ago by a.gardner
