I'm doing a theoretical project on haplotype assembly, and I need a reasonable method for generating a realistic diploid stretch of DNA. For example, if the density of heterozygous alleles is approximately one every thousand bases, then I would generate a diploid version by copying a haploid sequence and mutating each position of the copy independently with probability 1/1000.
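To make the idea concrete, here is a minimal sketch of the simple scheme I have in mind. The names `make_diploid` and `het_rate` are just my own placeholders, and the 1/1000 rate is the example figure from above, not a measured value. It also assumes substitutions only (no indels) and that sites mutate independently, which is likely too simple for real genomes.

```python
import random

BASES = "ACGT"

def make_diploid(haplotype, het_rate=1e-3, seed=None):
    """Return (second_haplotype, het_positions) for a given first haplotype.

    Each position is independently replaced by a different base with
    probability het_rate (assumed value; substitutions only, no indels).
    """
    rng = random.Random(seed)
    second = []
    het_positions = []
    for i, base in enumerate(haplotype):
        if rng.random() < het_rate:
            # Pick one of the three other bases so the site is truly heterozygous.
            second.append(rng.choice([b for b in BASES if b != base]))
            het_positions.append(i)
        else:
            second.append(base)
    return "".join(second), het_positions

# Example: a random 100 kb first haplotype and its mutated partner.
ref_rng = random.Random(0)
reference = "".join(ref_rng.choice(BASES) for _ in range(100_000))
hap2, het_sites = make_diploid(reference, het_rate=1e-3, seed=1)
print(len(het_sites), "heterozygous sites in", len(reference), "bases")
```

Under this model the gaps between consecutive heterozygous sites are geometrically distributed, with a mean of 1/`het_rate` bases.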

So in a typical human genome, what is the density of heterozygous alleles? Or in other words, what is the mean number of base pairs between heterozygous alleles?

Thanks, Heng. I went for simple in my answer, but clearly there is much more detail in "doing this right".