Question: MEME: how to get figures of conserved motifs?
biolab wrote (5.2 years ago):

Dear all,

I ran MEME with the command: meme fasta_file -dna -w 20 -o output. In the output folder, meme.html reports that no motifs were discovered. However, meme.txt clearly lists many motifs with very low p-values, and the output folder contains a figure illustrating the top motif. What is wrong with my command?

Thank you very much! I would appreciate any comments.

eromasko (United States) wrote (5.2 years ago):

Hi biolab. If I remember correctly, because your command leaves -nmotifs at its default, MEME returns only 1 motif even if more than one significant motif exists. When you say the meme.txt output shows many motifs with very low p-values, are they below 0.05? Also, I'm not sure what you mean when you say the output folder contains a figure illustrating the top motif while the HTML output shows none. I think MEME writes the logos as .eps and .png files even for motifs that are not significant. You can check the manual at http://meme-suite.org/doc/meme.html?man_type=web for more information. I hope this helps; please let us know if there are still issues.
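
As a rough sketch of the -nmotifs point, the snippet below prints the original command with -nmotifs added. The helper name meme_cmd and the value 5 are assumptions made for illustration; the function only prints the command line and does not run MEME.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a MEME command that asks for more than one motif.
# Nothing is executed here; the printed line can be pasted into a terminal.
meme_cmd() {
    local fasta="$1" nmotifs="$2" outdir="$3"
    # Same options as the original command, plus -nmotifs (MEME's default is 1).
    echo "meme $fasta -dna -w 20 -nmotifs $nmotifs -o $outdir"
}

# Example: request up to 5 motifs (5 is an arbitrary illustrative value).
meme_cmd fasta_file 5 output
```

Requesting more motifs does not make them significant; each reported motif still needs to be judged on its own score.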


Hi eromasko, thank you very much for your detailed answer! Your description is really helpful. I had missed the -nmotifs option. Now I think I can handle MEME. I just have one more question: you say "significant", so how should I evaluate significance (e.g. what is the p-value cutoff)? Thanks again.

(reply by biolab, 5.2 years ago)

Actually, I should have said E-value rather than p-value earlier, as that is what MEME reports. The significance cutoff is still 0.05: a motif is usually considered significant when its E-value is below that. Here is a short description of the E-value from the HTML output of MEME (you can also find more information in the manual I linked earlier):

"The statistical significance of the motif. MEME usually finds the most statistically significant (low E-value) motifs first. It is unusual to consider a motif with an E-value larger than 0.05 significant so, as an additional indicator, MEME displays these partially transparent.

The E-value of a motif is based on its log likelihood ratio, width, sites, the background letter frequencies (given in the command line summary), and the size of the training set.

The E-value is an estimate of the expected number of motifs with the given log likelihood ratio (or higher), and with the same width and site count, that one would find in a similarly sized set of random sequences (sequences where each position is independent and letters are chosen according to the background letter frequencies)."
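
To apply the 0.05 cutoff without opening the HTML report, something like the following sketch could scan meme.txt. It assumes each motif summary line starts with "MOTIF" and contains "E-value = <number>"; the exact layout may differ between MEME versions, and the helper name significant_motifs is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list motifs in a MEME text report whose E-value is
# below a cutoff (default 0.05). Assumes summary lines start with "MOTIF"
# and contain "E-value = <number>"; check this against your own meme.txt.
significant_motifs() {
    local report="$1" cutoff="${2:-0.05}"
    awk -v cutoff="$cutoff" '
        $1 == "MOTIF" {
            for (i = 1; i <= NF - 2; i++) {
                # The number follows the two fields "E-value" and "=".
                if ($i == "E-value" && $(i + 1) == "=" && ($(i + 2) + 0) < cutoff) {
                    print $2, $(i + 2)
                }
            }
        }
    ' "$report"
}

# Only scan the report if it exists (the path follows the -o output example).
if [ -f output/meme.txt ]; then
    significant_motifs output/meme.txt
fi
```

Each printed line holds the second field of the MOTIF line followed by its E-value; an empty result would be consistent with meme.html showing no significant motifs.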

(reply by eromasko, 5.2 years ago)

Hi eromasko, thank you very much for your guidance. I understand that the E-value is important. I am now trying MEME with different settings. I am still running into problems (for example, if the dataset is large, I need to set the -maxsize option), but let me give it a try. THANKS!

(reply by biolab, 5.2 years ago)

I have some experience with semi-large datasets and have used -maxsize values up to 500000. Runs start to get really computationally and time intensive. For example, I had datasets of almost 500 sequences, each 1000 nt long, and they start to drag on, especially once you combine multiple options such as -nmotifs, -w, -minw, -maxw and -mod anr. To make my life easier, I started writing simple bash shell scripts to run the many iterations of commands and options overnight and during long stretches of time, so I wouldn't have to be there to start the next command when the previous one finished. Hopefully that helps if you aren't already doing something similar. Good luck!
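
A minimal sketch of such a batch script is shown here. The function name meme_batch, the looped values and the output directory names are illustrative assumptions, not the settings actually used; -maxsize 500000 is taken from the text above, and whether a given MEME version accepts it should be checked in its manual. By default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Hypothetical batch driver: one MEME command per combination of settings.
# The values looped over are illustrative assumptions, not recommendations.
meme_batch() {
    local fasta="$1" runner="${2:-echo}"
    local nmotifs mod
    for nmotifs in 1 3 5; do
        for mod in zoops anr; do
            # One output directory per combination so results never collide.
            local outdir="meme_n${nmotifs}_${mod}"
            "$runner" meme "$fasta" -dna -minw 6 -maxw 20 \
                -nmotifs "$nmotifs" -mod "$mod" -maxsize 500000 -o "$outdir"
        done
    done
}

# Dry run: print the six commands without running MEME.
meme_batch fasta_file
```

Passing env as the second argument (meme_batch fasta_file env) would execute each command in turn instead of printing it, which is one way to leave the whole series running overnight.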

(reply by eromasko, 5.2 years ago)

Yes, I agree: I will test different options with a shell script and see which combinations work best. The options you listed are helpful to me. Thanks a lot!    --Biolab

(reply by biolab, 5.2 years ago)