I'm working on some SNP analyses using GATK HaplotypeCaller. In my initial test, with just 35 samples, I was getting over 350,000 SNPs. However, when I added more samples, the number of SNPs dropped. With my entire set of over 200 samples it's barely over 1,000, though if I cut about half of those samples out it's closer to 10,000. The samples are pooled from 4 different data sets, but my original test set already included the two data sets with the most different collection practices. I can't find anything in the documentation that explains why this would happen, and I got similar results when I tried GVCF mode. I imagine it's some outcome of the variant calling method, but I want to make sure. Could anyone explain what could be causing this?
Question: Why am I getting fewer variants with more samples?
12 months ago by
Ace • 70