Finding arbitrary sequences in Seq objects from Biopython
8.8 years ago
knpayne2 • 0

Let's say I have a large database of cDNA sequences in FASTA format, and I would like to identify a motif in the corresponding amino acid sequence. For example, I need to find something like:

CxxCxxxxxxxxxxxxHxxx$

where $ will be H or C

I imagine one would start by parsing the FASTA files, find the sites where these sub-sequences have to be, and then translate the corresponding coding DNA sequence, ending up with an amino acid sequence that contains a sequence of this form. If I had a specific amino acid sequence in mind, I could easily find it with the .find() method in Biopython. However, I'm not sure how to identify a form like the one above, where the search is for a whole set of motifs.

Thanks!

python biopython sequence • 3.9k views

The question needs clarification in quite a few places. To start with, what type of sequences does your database hold? "FASTA" is only the format; it tells us nothing about the type of the underlying sequence.


Sorry, these are cDNA sequences parsed from a set of FASTA files. I then translate the sections between the KpnI and BamHI sites. In the resulting amino acid sequence, I need to find a sub-sequence that matches the pattern:

CxxCxxxxxxxxxxxxHxxx$

where $ will be H or C

I hope that is clearer.

8.8 years ago
Asaf 10k

You can start by reading the sequences with fain = SeqIO.parse('filename.fa', 'fasta'), then iterate over the records (for seqrc in fain:), translate() each sequence, and use re (regular expressions) to find your pattern.
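A minimal sketch of that approach follows. The function names find_motif and scan_fasta are made up for illustration, and the code assumes each record is already the in-frame coding region (in the question, that would be the stretch between the KpnI and BamHI sites, which is not extracted here).

```python
import re

# Motif from the question: C, 2 of anything, C, 12 of anything,
# H, 3 of anything, then H or C.
MOTIF = re.compile(r"C.{2}C.{12}H.{3}[HC]")


def find_motif(protein):
    """Return (start, matched_text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in MOTIF.finditer(protein)]


def scan_fasta(path):
    """Yield (record_id, start, matched_text) for every hit in a FASTA file."""
    # Imported here so find_motif() stays usable without Biopython installed.
    from Bio import SeqIO

    for record in SeqIO.parse(path, "fasta"):
        # Assumption: the record is coding sequence starting in frame 1.
        protein = str(record.seq.translate())
        for start, hit in find_motif(protein):
            yield record.id, start, hit
```

Something like for hit in scan_fasta('filename.fa'): print(hit) would then list the matches. Note that "." also matches the "*" that translate() emits for stop codons, so you may want [^*] instead if hits should not span a stop.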


Biopython now has a Bio.Motifs package, and the MEME suite can help you scan for a list of motifs.


Note that Bio.Motif was deprecated; you'd want Bio.motifs (lower case, with an s).


Yes, I'd also consider using a regular expression (via import re) here.
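As a rough sketch of how the shorthand from the question could be turned into a regular expression: the helper names motif_to_regex and find_all are hypothetical, and the code assumes lower-case x means "any residue" and $ means "H or C", as stated in the question. The lookahead is a design choice so that overlapping hits are also reported.

```python
import re


def motif_to_regex(motif, wildcard="x", classes=None):
    """Compile shorthand such as 'CxxC...$' into a regular expression."""
    # Assumption: '$' stands for "H or C", as in the question.
    classes = classes or {"$": "HC"}
    parts = []
    for ch in motif:
        if ch == wildcard:
            parts.append(".")                      # any single residue
        elif ch in classes:
            parts.append("[" + classes[ch] + "]")  # one of the listed residues
        else:
            parts.append(re.escape(ch))            # a literal residue
    # Wrap in a lookahead so overlapping matches are found as well.
    return re.compile("(?=(" + "".join(parts) + "))")


def find_all(motif, protein):
    """Return (start, matched_text) for every hit, overlapping ones included."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(protein)]
```

For example, find_all("CxxCxxxxxxxxxxxxHxxx$", str(seqrc.seq.translate())) would scan one translated record from the loop described above.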
