Using COBALT from within a Python script?
3 months ago
ngarber ▴ 60

I'm fairly new to sequence analysis in Python, but what I want to do is:

  • Take a string (a 15-aa peptide sequence) and find its best ungapped alignment against another string (a protein sequence)
  • Get the best-aligned matching 15-aa sequence as a new string; it must be exactly 15 aa with no gaps

I'm used to doing that with COBALT through the web interface, but I'm not sure how to do it from within Python. Is there a way to do it with Biopython or by calling the command line (e.g. with os)?

BLAST motif Python homology protein • 241 views

As a starting point for your last couple of clauses, you might want to look at the code and examples in a series of recent exchanges here and here using Biopython. Fortunately, your case is easier than those because you don't want gaps. So you can filter the returned alignments, keeping only those whose aligned region is ungapped and exactly as long as the input string.
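Because no gaps are allowed, one alternative that sidesteps an aligner entirely is to slide a window the length of the peptide along the protein and score each position. Here is a minimal sketch of that idea. Note the assumptions: the function name `best_ungapped_window` and the example sequences are made up for illustration, and the score is a plain identity count rather than a substitution matrix such as BLOSUM62, so it will not reproduce COBALT's or BLAST's scoring.

```python
from __future__ import annotations


def best_ungapped_window(peptide: str, protein: str) -> tuple[int, str, int]:
    """Return (start, window, score) for the best ungapped placement.

    Hypothetical helper: scores each window by the number of identical
    residues. Ties are resolved in favour of the earliest window.
    """
    k = len(peptide)
    if k == 0 or k > len(protein):
        raise ValueError("peptide must be non-empty and no longer than the protein")

    best_start, best_score = 0, -1
    for start in range(len(protein) - k + 1):
        window = protein[start:start + k]
        # Identity count: one point per matching position (an assumption,
        # swap in a substitution matrix if you need similarity scoring).
        score = sum(a == b for a, b in zip(peptide, window))
        if score > best_score:
            best_start, best_score = start, score

    return best_start, protein[best_start:best_start + k], best_score


# Made-up example data: a 15-aa peptide and a short protein containing
# a copy of it with a single substitution (K -> R).
peptide = "ACDEFGHIKLMNPQR"
protein = "MKTAYACDEFGHIRLMNPQRSTVWY"
start, match, score = best_ungapped_window(peptide, protein)
print(start, match, score)
```

The returned string is always exactly as long as the input peptide, which satisfies the "must be 15aa with no gaps" requirement by construction.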

I personally haven't used COBALT, but the abstract of the paper describing it says there are files available that you can run, and the README at the listed FTP site describes how to run it on Linux. If you have your heart set on using it in conjunction with Python, you could check out how I did a similar thing with PatMatch here. Go there, click launch binder, and work through the notebooks to see examples of linking a command-line program to Python in various ways, using Python running in Jupyter. (Note this doesn't cover all the ways you can do this, such as subprocess or os.system(), but it may be a good start for considering your options.) You'd probably also want to check out the 'Advanced: Sending PatMatch output directly to Python' notebook under 'Additional topics'.
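For the subprocess route mentioned in passing, a rough sketch of capturing a command-line program's output as a Python string might look like the following. The wrapper name `run_tool` is hypothetical, and the demo deliberately runs the Python interpreter itself as a stand-in command: COBALT's actual executable name and arguments are not given in this thread, so take those from its README rather than from this example.

```python
from __future__ import annotations

import subprocess
import sys


def run_tool(cmd: list[str], timeout: float = 60.0) -> str:
    """Run a command-line program and return its standard output as text.

    Hypothetical wrapper. check=True raises CalledProcessError if the
    program exits with a non-zero status.
    """
    result = subprocess.run(
        cmd,
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode bytes to str
        check=True,
        timeout=timeout,
    )
    return result.stdout


# Stand-in command (NOT a COBALT invocation): ask Python to print a sequence.
demo_cmd = [sys.executable, "-c", "print('ACDEFGHIKLMNPQR')"]
output = run_tool(demo_cmd)
print(output.strip())
```

Passing the command as a list rather than a single shell string avoids quoting problems with file paths, which is generally why subprocess.run is preferred over os.system() for this kind of task.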


I don't think this is a comment. It should be a bona fide answer.
