Question

Closed: A question on MEME results

4.8 years ago

Hughie ▴ 30

Hello everyone,
I am searching for a sequence pattern in a set of FASTA sequences using MEME. I fed all 821897 sequences into MEME for de novo motif discovery, using otherwise default parameters:
meme -nmotifs 3 file.fa -searchsize 1520000 -oc file_meme -seed 0620 -dna -revcomp
This found a strong motif, shown below. (I call it strong because it occurs in 821411 of the 821897 sequences, though this may be debatable.)

Using all 821897 sequences:

all-seqs

I naturally expected that such a strong motif would remain largely the same in a random subset of the sequences. However, things got weird when I drew three random samples of 500000 sequences each, as shown below:

Using three random samples of 500000 sequences: shuf1 shuf2 shuf3

All three motifs still appear strong, but they differ a lot from one another. I am not sure what I did wrong, and your advice would be much appreciated.
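For reference, the random sampling step described above can be sketched roughly as follows. This is only a minimal sketch in plain Python, not the exact procedure I used: the function names, the seeds, and the output file names are hypothetical, and it assumes the whole FASTA file fits in memory.

```python
import random

def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def sample_records(records, k, seed):
    """Draw k records without replacement; the same seed gives the same sample."""
    rng = random.Random(seed)
    return rng.sample(records, k)

def write_fasta(records, path):
    """Write (header, sequence) pairs to path, one sequence per line."""
    with open(path, "w") as out:
        for header, seq in records:
            out.write(f">{header}\n{seq}\n")

# Hypothetical usage (file names and seeds are assumptions):
# with open("file.fa") as handle:
#     records = list(read_fasta(handle))
# for i, seed in enumerate((1, 2, 3), start=1):
#     write_fasta(sample_records(records, 500000, seed), f"shuf{i}.fa")
```

Fixing the seed for each sample makes it possible to feed exactly the same subset to both MEME and WebLogo.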

P.S. I have added the motifs generated by WebLogo 3 for comparison.
Using all 821897 sequences: weblogo
Using three random samples of 500000 sequences: shuf1 shuf2 shuf3

The sampled 500000 sequences given to MEME and WebLogo are exactly the same. My questions are:
1. Why is the motif generated by MEME from almost all of the sequences different from the WebLogo one, which also used all of the sequences? I know that MEME uses an algorithm to refine the motif, while WebLogo simply stacks the nucleotides at each position, but should the results differ this much?
2. Why are the three sampled WebLogo results similar to each other, while the three MEME results differ a lot?
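To make clear what I mean by "simply stacking" in question 1, here is a minimal sketch of a per-position summary for equal-length DNA sequences. It is my own illustration rather than WebLogo's actual code: the function names are hypothetical, the information content is the plain 2 minus entropy value in bits, and WebLogo may apply additional corrections that are not included here.

```python
import math
from collections import Counter

def column_frequencies(seqs, alphabet="ACGT"):
    """Return one dict of base frequencies per position of equal-length sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    freqs = []
    for i in range(length):
        counts = Counter(s[i].upper() for s in seqs)
        # Only bases in the alphabet are counted (e.g. N is ignored).
        total = sum(counts[b] for b in alphabet)
        freqs.append({b: (counts[b] / total if total else 0.0) for b in alphabet})
    return freqs

def information_content(column):
    """Information content in bits: log2(alphabet size) minus Shannon entropy."""
    if sum(column.values()) == 0:
        return 0.0  # no countable bases at this position
    entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
    return math.log2(len(column)) - entropy

# Hypothetical usage on a toy input:
# freqs = column_frequencies(["ACGT", "ACGA", "ACGC", "ACGG"])
# bits = [information_content(col) for col in freqs]
```

Every sequence contributes to every column at a fixed position here, with no choice of site or strand per sequence.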
Thank you for your time!

MEME motif • 270 views
