Finding The Position Of Amino Acid
13.4 years ago

I'm trying to find the number and positions of one amino acid, lysine, in trastuzumab. Can anyone suggest appropriate software for determining this?

Much appreciated.

amino-acids sequence position • 9.1k views
13.4 years ago
Neilfws 49k

Just to expand on the answer by Bio_X2Y: this is a "classic" type of bioinformatics task, for which there is unlikely to be an online application or a ready-made software solution. A roll-your-own solution is almost expected; it's considered trivial yet nobody provides the software :-)

Here's one that uses the Bio::SeqIO module from BioPerl. It assumes that you have saved the light chain sequence in FASTA format to the file lc.fa:

#!/usr/bin/perl

use strict;
use warnings;
use Bio::SeqIO;

# Open the FASTA file containing the light chain sequence
my $inseq = Bio::SeqIO->new(-file => "lc.fa", -format => "fasta");

while (my $seq = $inseq->next_seq) {
  my @aa    = split("", $seq->seq);               # one residue per array element
  my @index = grep { $aa[$_] =~ /K/i } 0..$#aa;   # 0-based indices of lysines
  @index    = map { $_ + 1 } @index;              # convert 0-based array to 1-based sequence
  print "Lys at ", join(", ", @index), "\n";
}

Result:

Lys at 39, 42, 45, 103, 107, 126, 145, 149, 169, 183, 188, 190, 207
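
If BioPerl is not installed, the same idea can be sketched with core Perl only. This is a minimal sketch rather than a replacement for Bio::SeqIO: the helper names read_fasta and residue_positions and the script name find_residue.pl are made up for illustration, and the parser assumes a plain FASTA file such as lc.fa. It also prints the count, since the question asked for the number as well as the positions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: parse FASTA text from a filehandle into
# a list of [header, sequence] pairs.
sub read_fasta {
    my ($fh) = @_;
    my @records;
    while (my $line = <$fh>) {
        $line =~ s/\s+\z//;               # strip newline and trailing whitespace
        next if $line eq '';
        if ($line =~ /^>(.*)/) {
            push @records, [$1, ''];      # header line starts a new record
        } elsif (@records) {
            $line =~ s/\s+//g;            # drop any whitespace inside the sequence
            $records[-1][1] .= $line;     # append to the current record
        }
    }
    return @records;
}

# Hypothetical helper: 1-based positions of a one-letter residue code,
# matched case-insensitively.
sub residue_positions {
    my ($seq, $residue) = @_;
    my $s = uc $seq;
    my $r = uc $residue;
    my @positions;
    my $offset = 0;
    while ((my $i = index($s, $r, $offset)) >= 0) {
        push @positions, $i + 1;          # convert 0-based index to 1-based position
        $offset = $i + 1;
    }
    return @positions;
}

# Usage: perl find_residue.pl lc.fa [residue]   (residue defaults to K)
if (@ARGV) {
    my ($file, $residue) = @ARGV;
    $residue //= 'K';
    open my $fh, '<', $file or die "Cannot open $file: $!";
    for my $rec (read_fasta($fh)) {
        my ($id, $seq) = @$rec;
        my @pos = residue_positions($seq, $residue);
        printf "%s: %d x %s at %s\n", $id, scalar @pos, $residue, join(", ", @pos);
    }
    close $fh;
}
```

Passing a different one-letter code as the second argument reports that residue instead of lysine.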
13.4 years ago
Bio_X2Y ★ 4.4k

This is not my area, so don't treat this as authoritative.

Trastuzumab is an IgG-kappa monoclonal antibody. An antibody is made up of two identical heavy chains (in this case a gamma) and two identical light chains (in this case a kappa). So I imagine your question can be broken down into:

  • Where can I get the protein sequences for the trastuzumab heavy and light chains?
  • How can I find lysines within these chains?

I'm not familiar with where most people get official drug sequences, but the Wikipedia page for trastuzumab links to a DrugBank card, which provides sequences for both chains (plus some other variant formats that I don't understand).

There is software out there that can be used to find a particular amino acid in a given sequence (e.g. standalone BLAST), but I'm not familiar with anything that can be installed and run quickly. Since you are only dealing with two small sequences (heavy chain 451 aa, light chain 214 aa), I suggest you write a simple Perl script to find the positions of the lysine (K) residues (it should only require 5-10 lines). If you're not comfortable with scripting or with setting up software like BLAST, I suggest you identify the Ks manually; hopefully it won't take more than a few minutes!
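
Because the antibody contains two copies of each chain, the number of lysines in the intact molecule is twice the sum of the per-chain counts. Here is a small sketch of that arithmetic, assuming the two chain sequences have already been read into strings. The sub names count_residue and antibody_total are made up for illustration, and the short sequences are toy placeholders, not real trastuzumab data.

```perl
use strict;
use warnings;

# Count occurrences of a one-letter residue code (case-insensitive).
sub count_residue {
    my ($seq, $residue) = @_;
    my $count = () = $seq =~ /\Q$residue\E/gi;   # list-context match, then count
    return $count;
}

# Total for an antibody built from two identical heavy chains
# and two identical light chains.
sub antibody_total {
    my ($heavy, $light, $residue) = @_;
    return 2 * (count_residue($heavy, $residue) + count_residue($light, $residue));
}

# Toy placeholder sequences (NOT the real chains); substitute the
# sequences taken from your own FASTA files.
my $heavy = 'EVQLKESGKK';
my $light = 'DIQMTKSP';

printf "Heavy chain: %d, light chain: %d, intact antibody: %d\n",
    count_residue($heavy, 'K'),
    count_residue($light, 'K'),
    antibody_total($heavy, $light, 'K');
```

Positions are still reported per chain, since each chain is numbered from its own first residue.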


Excellent answer.

13.4 years ago
Julien ▴ 160

The EMBOSS suite, available standalone or through web interfaces (e.g. http://pro.genomics.purdue.edu/emboss/), includes a program called fuzzpro that takes a protein sequence and a pattern (K in this case) and reports the positions of the matches in the sequence.


Nice find. I forgot to look at EMBOSS; it has a tool for most occasions (and as you mention, web interfaces for non-coders).

13.4 years ago
Jake ▴ 150

For antibody sequences you can use abYsis (http://www.bioinf.org.uk/abysis/tools/analyze.cgi), which will number your sequences using the standard Kabat or Chothia numbering scheme.
