Question: Compare 2 local protein FASTA sequences using NCBI BLAST and Biopython
2.6 years ago, ahtmatrix wrote:

I have 2 protein sequences that I need to compare using NCBI BLAST:

  1. the protein sequence listed in a gbk file as the translation of a CDS feature
  2. the protein sequence obtained by translating the nucleotide sequence at that CDS location

    from Bio import SeqIO
    
    fullpath = "input.gbk"  # placeholder: path to the GenBank file
    
    for record in SeqIO.parse(fullpath, "genbank"):
        for feature in record.features:
            if feature.type != "CDS":
                continue
    
            # Qualifier values are lists, so take the first entry if present.
            translated_protein = feature.qualifiers.get("translation", ["no_translation"])[0]
            # Translate the nucleotides at the CDS location up to the first stop codon.
            cds_to_protein = str(feature.extract(record.seq).translate(to_stop=True))
    
            if translated_protein != cds_to_protein:
                # run blast on translated_protein and cds_to_protein
                pass
    

Is that possible with Biopython?

blast biopython python ncbi

I'm not sure BLAST is the answer to your problem; that depends on what you want to obtain. But have a look at pairwise alignments in Biopython: http://biopython.org/DIST/docs/api/Bio.pairwise2-module.html

From a first look, pairwise2 seems related to the BLAST algorithm in that both produce pairwise sequence alignments.

written 2.6 years ago by WouterDeCoster

Not an answer, just a comment: legacy BLAST had a tool, bl2seq, which I still use a lot. It simply aligns 2 sequences.
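For anyone on BLAST+ rather than legacy BLAST, a similar two-sequence comparison can be run with blastp and its -subject option. The sketch below is only one way to do it from Python, and it assumes the blastp executable is installed and on PATH. The helper names (write_fasta, blastp_command, compare_with_blastp) and the temporary file names are made up for illustration and are not part of Biopython or BLAST.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def write_fasta(path, name, sequence):
    """Write a single sequence to a FASTA file (hypothetical helper)."""
    Path(path).write_text(f">{name}\n{sequence}\n")


def blastp_command(query_fasta, subject_fasta):
    """Build a BLAST+ command comparing two protein FASTA files.

    -outfmt 6 requests tab-separated output, one line per hit.
    """
    return [
        "blastp",
        "-query", str(query_fasta),
        "-subject", str(subject_fasta),
        "-outfmt", "6",
    ]


def compare_with_blastp(seq_a, seq_b):
    """Return blastp tabular output, or None if blastp is not on PATH."""
    if shutil.which("blastp") is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        query = Path(tmp) / "translated_protein.fasta"
        subject = Path(tmp) / "cds_to_protein.fasta"
        write_fasta(query, "translated_protein", seq_a)
        write_fasta(subject, "cds_to_protein", seq_b)
        result = subprocess.run(
            blastp_command(query, subject),
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout


if __name__ == "__main__":
    # Toy sequences, purely for illustration.
    output = compare_with_blastp("MKTAYIAKQRQISFVKSHFSRQ", "MKTAYLAKQRQISFVKSHFSRQ")
    print(output if output is not None else "blastp not found on PATH")
```

Inside the loop from the question, compare_with_blastp(translated_protein, cds_to_protein) could be called wherever the two strings differ.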

written 2.6 years ago by microfuge

I agree with @WouterDeCoster: a simple pairwise comparison is easy to do with Biopython's pairwise2 module, like this:

    from Bio import pairwise2
    # .... (code from the question that sets translated_protein and cds_to_protein)
    alignment = pairwise2.align.globalxx(translated_protein, cds_to_protein, one_alignment_only=True)
    print(pairwise2.format_alignment(*alignment[0]))

The module offers further options for defining gap penalties, scoring matrices, etc.
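As a rough illustration of those options, here is a sketch that sets a scoring matrix and gap penalties. Note that it swaps in Bio.Align.PairwiseAligner, which newer Biopython releases recommend in place of pairwise2, rather than pairwise2 itself. The BLOSUM62 matrix and the gap scores of -10 and -0.5 are assumed example values, not recommendations, and align_proteins is a hypothetical helper name.

```python
from Bio import Align
from Bio.Align import substitution_matrices


def align_proteins(seq_a, seq_b):
    """Globally align two protein strings and return the best alignment.

    Scoring parameters below are illustrative assumptions.
    """
    aligner = Align.PairwiseAligner()
    aligner.mode = "global"
    aligner.substitution_matrix = substitution_matrices.load("BLOSUM62")
    aligner.open_gap_score = -10      # penalty for opening a gap
    aligner.extend_gap_score = -0.5   # penalty for each gap extension
    return aligner.align(seq_a, seq_b)[0]


if __name__ == "__main__":
    # Toy sequences differing at one position, purely for illustration.
    best = align_proteins("MKTAYIAKQR", "MKTAYLAKQR")
    print(best.score)
    print(best)
```

With the sequences from the question, align_proteins(translated_protein, cds_to_protein) would show where the two translations diverge.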

written 2.6 years ago by Markus