I am currently using MEME for motif discovery, and I would like to search roughly 50 to 100 bases upstream for binding sites (typically found around -35 and -10). I have a local installation of MEME.
I have about 30K upstream sequences, and I cannot run the algorithm even with -maxsize set to very high values. I get:
Dataset too large (-1) Rerun with larger -maxsize
How can I address this problem?
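For context, -maxsize is a limit on the dataset size in characters, so it may help to know how large the input actually is. Below is a minimal Python sketch of how that could be measured from a FASTA file. The function names and the `window` parameter are my own (hypothetical, not part of MEME), and trimming assumes each upstream sequence ends right at the gene start, so the last `window` bases are the ones nearest to it.

```python
from typing import Iterable, Iterator, Optional, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def dataset_size(lines: Iterable[str],
                 window: Optional[int] = None) -> Tuple[int, int]:
    """Return (number of sequences, total characters).

    If `window` is a positive integer, only the last `window` bases of
    each sequence are counted (assumption: the gene start is at the 3' end).
    """
    n_seqs = 0
    n_chars = 0
    for _, seq in read_fasta(lines):
        if window is not None:
            seq = seq[-window:]
        n_seqs += 1
        n_chars += len(seq)
    return n_seqs, n_chars


if __name__ == "__main__":
    example = [">geneA", "ACGT", "AC", ">geneB", "GGG"]
    print(dataset_size(example))            # whole sequences
    print(dataset_size(example, window=3))  # last 3 bases of each
```

With 30K sequences of 100 bases each, this kind of count comes to about 3 million characters, which is the figure I would expect to compare against -maxsize.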
Also, as an extension to this question:
I expect to find more than one conserved motif (say, at both -35 and -10) in different subsets of the 30K sequences. How can I specify the location range of a motif when running MEME? Or is there a variant of MEME that does this specifically?
Finally, while I have some understanding of the PSP file, I do not understand what exactly the bgfile does in MEME motif discovery.