Are there any tools to impute missing SNPs in a resequencing dataset of hundreds of individuals for which no reference panel exists?
The closest I have come is:
(1) Using a subset of the most complete individuals as the reference, via the "include reference" option of MACH (http://www.sph.umich.edu/csg/abecasis/MaCH):
INCLUDE REFERENCE (e.g. HAPMAP) GENOTYPES TO YOUR DATASET:
If you select this option, you should simply create one large pooled dataset. Some individuals will have missing data and others will have much more complete genotyping information.
In addition to estimating the most likely genotype for each individual, you can use the command line options --dosage and --quality options to request additional information about each inferred genotype.
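To make the "most complete individuals" idea in (1) concrete, here is a minimal sketch of how I would pick that subset by call rate. It is not part of MACH; the function names, the toy data, and the "./." missing-genotype code (borrowed from the VCF convention) are all my own assumptions for illustration.

```python
def call_rate(genotypes, missing="./."):
    """Fraction of sites with a non-missing genotype call.

    Assumes missing calls are coded as "./." (an assumption, not a MACH format).
    """
    if not genotypes:
        return 0.0
    called = sum(1 for g in genotypes if g != missing)
    return called / len(genotypes)


def pick_reference_samples(calls, n_ref):
    """Return the n_ref sample IDs with the highest call rate.

    Ties are broken alphabetically by sample ID so the result is deterministic.
    """
    ranked = sorted(calls, key=lambda s: (-call_rate(calls[s]), s))
    return ranked[:n_ref]


# Hypothetical toy data: sample ID -> genotype calls at four sites.
calls = {
    "ind1": ["0/0", "0/1", "1/1", "0/0"],
    "ind2": ["0/0", "./.", "1/1", "./."],
    "ind3": ["./.", "0/1", "1/1", "0/0"],
    "ind4": ["./.", "./.", "./.", "0/0"],
}

reference = pick_reference_samples(calls, 2)
print(reference)
```

The selected individuals would then play the role of the "reference" genotypes in the pooled dataset, with the remaining individuals being the ones to impute.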
Other tools I looked at:
(2) IMPUTE2 (https://mathgen.stats.ox.ac.uk/impute/impute_v2.html), whose documentation says:
We have proposed a simple and universal solution to this problem: we provide all available reference haplotypes to IMPUTE2, then let the software choose a "custom" reference panel for each individual to be imputed. There are several advantages to this approach
(3) Beagle (http://faculty.washington.edu/browning/beagle/beagle.html), which does not seem to mention any option for including the reference individuals in the input.