I am using MEME, the de novo transcription factor binding motif discovery software.
The MEME Suite documentation (here) states that the '-bfile' argument can be used to provide MEME with a background Markov model file.
Does anybody have suggestions for how to generate such a -bfile, or, better still, could anybody point me towards a script that generates one from .fasta sequence data?
Here is the description of -bfile from the MEME Suite documentation:
BACKGROUND MODEL -bfile <bfile>: The name of the file containing the background model for sequences. The background model is the model of random sequences used by MEME. The background model is used by MEME 1) during EM as the "null model", 2) for calculating the log likelihood ratio of a motif, 3) for calculating the significance (E-value) of a motif, and 4) for creating the position-specific scoring matrix (log-odds matrix). By default, the background model is a 0-order Markov model based on the letter frequencies in the training set.
Markov models of any order can be specified in <bfile> by listing frequencies of all possible tuples of length up to order+1. Note that MEME uses only the 0-order portion (single letter frequencies) of the background model for purposes 3) and 4), but uses the full-order model for purposes 1) and 2), above.
Example: To specify a 1-order Markov background model for DNA, <bfile> might contain the following lines. Note that optional comment lines are marked by "#" and are ignored by MEME.
# tuple frequency_non_coding
a 0.324
c 0.176
g 0.176
t 0.324
# tuple frequency_non_coding
aa 0.119
ac 0.052
ag 0.056
at 0.097
ca 0.058
cc 0.033
cg 0.028
ct 0.056
ga 0.056
gc 0.035
gg 0.033
gt 0.052
ta 0.091
tc 0.056
tg 0.058
tt 0.119
Sample -bfile files are given in directory tests: tests/nt.freq (DNA), and tests/na.freq (amino acid).
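For concreteness, here is a minimal sketch of the kind of script I have in mind, based only on my reading of the format described above. The function names (read_fasta, background_model, format_bfile) are my own and not part of MEME. It assumes a DNA alphabet, counts overlapping tuples on the given strand only (no reverse complement), skips any tuple containing an ambiguity code such as "n", and normalises the frequencies separately for each tuple length so that they sum to 1, as they appear to in the documentation example. I have not verified that MEME accepts its output.

```python
from collections import Counter
from itertools import product


def read_fasta(lines):
    """Yield each sequence in a FASTA stream as one lowercase string."""
    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if chunks:
                yield "".join(chunks)
            chunks = []
        elif line:
            chunks.append(line.lower())
    if chunks:
        yield "".join(chunks)


def background_model(seqs, order=1, alphabet="acgt", pseudocount=0.0):
    """Return a list of (tuple, frequency) for all tuples of length 1..order+1."""
    seqs = list(seqs)  # the sequences are scanned once per tuple length
    model = []
    for k in range(1, order + 2):
        counts = Counter()
        for seq in seqs:
            for i in range(len(seq) - k + 1):
                word = seq[i:i + k]
                # Assumption: ignore tuples containing non-alphabet letters.
                if all(ch in alphabet for ch in word):
                    counts[word] += 1
        words = ["".join(p) for p in product(alphabet, repeat=k)]
        total = sum(counts[w] for w in words) + pseudocount * len(words)
        for w in words:
            freq = (counts[w] + pseudocount) / total if total else 0.0
            model.append((w, freq))
    return model


def format_bfile(model):
    """Render the model as text: one '<tuple> <frequency>' pair per line."""
    lines = ["# tuple frequency"]
    lines += [f"{word} {freq:.3f}" for word, freq in model]
    return "\n".join(lines) + "\n"
```

To use it, one would open the .fasta file, pass the file handle to read_fasta, pass the resulting sequences to background_model with the desired order, and write the string returned by format_bfile to the file given to -bfile. A small non-zero pseudocount avoids zero frequencies for tuples that never occur in a small training set.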