Finding Common Motifs In Sequences
9
16
14.6 years ago
Fabio ▴ 130

I have a few hundred yeast sequences (20-80 bp long) and I want to find common motifs (conserved bases at certain positions) in them. I am using a Mac.

11
14.0 years ago
Stew ★ 1.4k

Like others, I would recommend MEME; however, Weeder is also very easy to install and run on a Mac. It is word-based and runs a lot faster than MEME, with far fewer options.

If you would like something to run specifically on your Mac you could try iMotifs, which incorporates the NestedMICA algorithm for discovering over-represented motifs.

For a small set of sequences, such as yours, you could also use MEME, Weeder or others online without installing them locally. Finally, the Regulatory Sequence Analysis Tools (RSAT) (http://rsat.ulb.ac.be/rsat/ or http://rsat.ccb.sickkids.ca/) are a great place to start for pattern matching and discovery.

In case it is of any use, I have put up a short presentation about DNA motif finding from a course I gave earlier this year. You can find it here.

6
14.6 years ago
Tom Koerber ▴ 60

You can also use MEME: http://meme.sdsc.edu/.

6
14.1 years ago

Some time ago I used SOMBRERO with a good degree of success to find motifs in a very diverse set of sequences. They have a Mac version for download, as well as parallel versions for Irix and Linux.

5
14.1 years ago
Michael 54k

The first step when looking for conservation of single bases or motifs is often a multiple sequence alignment, which aligns the sequences so that conserved regions are as visible as possible. This can be a first step before using explicit motif finders like MEME. A good way of visualizing multiple alignments is the sequence logo, which gives a graphical representation of base conservation.
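As a rough illustration of what a sequence logo summarizes, the sketch below computes the information content (in bits) of each column of an already aligned set of DNA sequences. The function name `column_information` and the toy alignment are made up for this example, and the calculation omits the small-sample correction that logo tools may apply, so treat it as a simplified approximation rather than what any particular tool does.

```python
import math
from collections import Counter

def column_information(aligned, alphabet="ACGT"):
    """Information content (bits) per alignment column, ignoring gaps.

    Assumes every sequence in `aligned` has the same length, i.e. the
    sequences have already been aligned by some other tool.
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all aligned sequences must have the same length")

    max_bits = math.log2(len(alphabet))  # 2 bits for DNA
    info = []
    for column in zip(*aligned):
        # Count only real bases; gap characters such as '-' are skipped.
        counts = Counter(c for c in column if c in alphabet)
        total = sum(counts.values())
        if total == 0:
            info.append(0.0)
            continue
        entropy = -sum((n / total) * math.log2(n / total)
                       for n in counts.values())
        info.append(max_bits - entropy)
    return info

# Hypothetical toy alignment, not taken from the question.
toy_alignment = ["ACGT", "ACGA", "ACCA", "ATCA"]
for position, bits in enumerate(column_information(toy_alignment), start=1):
    print(position, round(bits, 3))
```

Columns close to 2 bits are fully conserved; columns near 0 bits carry little signal. With only a handful of sequences these values are optimistic, which is one reason to prefer a dedicated tool for real data.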

Here is the Wikipedia list of multiple sequence alignment tools.

I recommend starting with the EBI web server for ClustalW; if that is not enough, you can also try MAFFT or T-Coffee.

WebLogo can generate sequence logo graphics from the alignment output and also directly from FASTA input.

The advantage of these tools is that you don't need to install them, so they are good for a first attempt whether or not you are using a Mac.

4
14.2 years ago

MEME was the first program to be published for doing that.

As an alternative, you can try one of the EMBOSS tools; if you are scared of the terminal and want to work from a web-based interface, you can use the EMBOSS tools from Galaxy.

3
14.1 years ago
Darked89 4.6k

You may check out these pages:

These are about 2 years old (links may not work, etc.), but they should be OK as a starting point. Also, in the unlikely case that you have not found it yet: in yeast there has been an extensive motif search study by Kellis with an insane number of citations:

2
14.6 years ago
Zhenhai Zhang ▴ 170

Try this out? http://fraenkel.mit.edu/webmotifs/form.html

2
14.6 years ago
Suk211 ★ 1.1k
ACGGGCCCGACGATGCGTCGTA
ACGTACGTCGAACCGTCGTCGT
ACGTGCGTCGAAACGTCAGTCG
ACGGGTTCGATCGTCGTCGTCG

Maybe in Python: I would break the first sequence into sliding windows of the required motif length and then search for each of those candidate motifs in the rest of the sequences with a regular expression, using the re.search() method.
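A minimal sketch of that sliding-window idea, using the four example sequences above, might look like the following. This is not the poster's code; the function name `shared_windows` and the choice of window lengths are assumptions made for illustration. It only finds exact matches that occur in every sequence, so degenerate or partially conserved motifs will be missed.

```python
import re

def shared_windows(seqs, k):
    """Return the k-mers of the first sequence found in all other sequences.

    Windows are reported in order of first appearance, without duplicates.
    """
    first, rest = seqs[0], seqs[1:]
    found = []
    for start in range(len(first) - k + 1):
        window = first[start:start + k]
        if window in found:
            continue  # this window was already reported
        # re.escape keeps the window literal; plain DNA letters need no
        # escaping, but this stays safe if IUPAC symbols are added later.
        pattern = re.escape(window)
        if all(re.search(pattern, seq) for seq in rest):
            found.append(window)
    return found

seqs = [
    "ACGGGCCCGACGATGCGTCGTA",
    "ACGTACGTCGAACCGTCGTCGT",
    "ACGTGCGTCGAAACGTCAGTCG",
    "ACGGGTTCGATCGTCGTCGTCG",
]

# Try a few window lengths and print whatever is shared by all sequences.
for k in (4, 5, 6):
    print(k, shared_windows(seqs, k))
```

For a few hundred short sequences this brute-force scan is fast enough, but requiring a hit in every sequence is strict; relaxing it to "present in at least N sequences" would be a natural next step.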

2

Post the Python code as well (put it into pre tags so that it is shown nicely formatted; see the help on the right).

0
10.3 years ago
sanchezcavani ▴ 220

I would recommend using MEME.
