I have WGS VCF files for 11 individuals. They are related within about three generations, and I'm trying to find IBD segments between them: stretches of DNA (haplotypes) that are identical because the individuals are related, 10^5-10^7 in length. My current method defines an IBD candidate as an unbroken series of adjacent SNPs in the VCF files at which all (or some chosen subset) of my 11 individuals share at least one allele.
I am getting strange results, though. If I start by including only, say, 7 of the individuals, I get some number of IBD candidates. If I then add individuals, I would expect the number of IBD candidates to only go down, but sometimes it goes up, which seems nonsensical. Any idea what's going wrong here?
My VCFs aren't phased, sadly, which is why I must use the "at least one allele in common" condition.
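For concreteness, the candidate definition above can be sketched roughly as follows. This is only a minimal illustration under assumptions, not the actual pipeline: genotypes are assumed to be already parsed from the VCFs into unphased tuples of allele indices, and the function names `shares_allele` and `candidate_runs`, the sample names, and the toy data are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Unphased diploid call as allele indices, e.g. (0, 1) for a het site.
Genotype = Tuple[int, int]


def shares_allele(genotypes: Sequence[Genotype]) -> bool:
    """True if at least one allele is carried by every genotype at this site.

    Assumes a non-empty list of genotypes.
    """
    common = set(genotypes[0])
    for gt in genotypes[1:]:
        common &= set(gt)
    return bool(common)


def candidate_runs(
    sites: List[Dict[str, Genotype]], samples: Sequence[str]
) -> List[Tuple[int, int]]:
    """Return (first, last) site indices of each unbroken run of sharing sites."""
    runs = []
    start = None
    for i, site in enumerate(sites):
        if shares_allele([site[s] for s in samples]):
            if start is None:
                start = i  # a new run opens here
        elif start is not None:
            runs.append((start, i - 1))  # run ended at the previous site
            start = None
    if start is not None:
        runs.append((start, len(sites) - 1))  # run extends to the last site
    return runs


# Made-up toy data: five adjacent SNPs, three hypothetical samples.
sites = [
    {"A": (0, 1), "B": (0, 0), "C": (0, 1)},
    {"A": (1, 1), "B": (0, 1), "C": (1, 1)},
    {"A": (0, 0), "B": (0, 1), "C": (1, 1)},  # A and B share 0, but C has none
    {"A": (0, 1), "B": (1, 1), "C": (0, 1)},
    {"A": (0, 0), "B": (0, 0), "C": (0, 1)},
]

runs_ab = candidate_runs(sites, ["A", "B"])
runs_abc = candidate_runs(sites, ["A", "B", "C"])
print(runs_ab)   # [(0, 4)]
print(runs_abc)  # [(0, 1), (3, 4)]
```

One thing this toy data shows: adding sample C breaks the single A/B run at the third SNP, so the number of runs rises from 1 to 2 while the number of covered SNPs drops from 5 to 4. Under this definition the covered SNPs can only shrink as samples are added, but a raw count of candidates can rise, which may be relevant to the counts described above.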
Very thankful for any thoughts or advice!