I would like some advice on comparing the k-mer composition of a real genome against that of a randomized genome.
My question is about the randomization used as the background (null) model. When I randomize a genome by shuffling it in order to compare k-mer composition, is it important to preserve only the base (mononucleotide) frequencies, or the dinucleotide frequencies as well?
I have read some papers, but they used an expected value instead!
I am looking for a count-based method instead, similar to k-mer counting itself.
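To make concrete what I mean by a count method, here is a minimal Python sketch. All function names are my own and the sequence is a made-up toy string, not real data. `shuffle_mono` preserves base counts exactly. `sample_markov1` is only a simple stand-in for a dinucleotide-preserving shuffle: it draws from a first-order Markov chain fitted to the sequence, so dinucleotide frequencies are matched approximately, not exactly.

```python
import random
from collections import Counter

def count_kmers(seq, k):
    """Count overlapping k-mers in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def shuffle_mono(seq, rng):
    """Permute the bases: mononucleotide counts are preserved exactly."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)

def sample_markov1(seq, rng):
    """Sample a sequence of the same length from a first-order Markov
    chain fitted to seq (dinucleotide frequencies only approximate)."""
    transitions = {}
    for a, b in zip(seq, seq[1:]):
        transitions.setdefault(a, []).append(b)
    out = [seq[0]]
    while len(out) < len(seq):
        followers = transitions.get(out[-1])
        # Dead end (base seen only at the very end): restart from a random base.
        out.append(rng.choice(followers) if followers else rng.choice(seq))
    return "".join(out)

def null_counts(seq, k, n_shuffles, randomize, seed=0):
    """k-mer counts for n_shuffles randomized versions of seq."""
    rng = random.Random(seed)
    return [count_kmers(randomize(seq, rng), k) for _ in range(n_shuffles)]

genome = "ACGTACGTTTGACCGATAGCATCGATCGGGCTA"  # toy sequence (assumption)
k = 3
observed = count_kmers(genome, k)
null = null_counts(genome, k, 100, shuffle_mono)

# Mean count of each observed k-mer across the shuffled genomes.
mean_null = {kmer: sum(c[kmer] for c in null) / len(null) for kmer in observed}
```

Passing `sample_markov1` instead of `shuffle_mono` to `null_counts` would give the second kind of background, so the observed counts can be compared against either one.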
Any tip or paper using a similar approach would be appreciated!