I want to make a .vcf.gz for every individual in the 1000 Genomes data. I've downloaded the .vcf.gz files for all the chromosomes and merged them into one big .vcf.gz. Now I want to split that file into a separate .vcf for every individual.
I normally use GATK SelectVariants to do that. However, it requires you to specify a human reference genome, and I think that's where I went wrong, since all the single-sample .vcf.gz files come out empty (except for the header). Is there another option besides GATK? I've used VCFtools for other purposes before, but I noticed that it sometimes makes mistakes in the allele frequency when it splits a multi-sample file. Alternatively, which reference genome should I use to make GATK SelectVariants work with the 1000 Genomes data?
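
To make the goal concrete, here is a minimal Python sketch of the split I'm after, using only the standard library. The function name `split_vcf_by_sample` and its parameters are made up for illustration and are not part of GATK or any other tool. It assumes a tab-delimited VCF with the usual nine fixed columns followed by one column per sample. It keeps every site (including those where the sample is homozygous reference) and copies the INFO column unchanged, so values such as AC/AF/AN would still describe the whole cohort rather than the single sample.

```python
import gzip
import os


def open_text(path, mode="rt"):
    """Open a plain or gzip-compressed file in text mode."""
    return gzip.open(path, mode) if path.endswith(".gz") else open(path, mode)


def split_vcf_by_sample(vcf_path, out_dir, samples=None):
    """Write one single-sample VCF per requested sample (all samples by default).

    Hypothetical helper for illustration only. INFO is copied as-is,
    so cohort-level annotations are NOT recomputed per sample.
    """
    os.makedirs(out_dir, exist_ok=True)
    meta = []      # '##' header lines, copied to every output file
    handles = {}   # sample name -> (column index, open file handle)
    try:
        with open_text(vcf_path) as vcf:
            for line in vcf:
                if line.startswith("##"):
                    meta.append(line)
                    continue
                fields = line.rstrip("\n").split("\t")
                if line.startswith("#CHROM"):
                    wanted = samples if samples is not None else fields[9:]
                    for name in wanted:
                        # Raises ValueError if the sample is not in the header.
                        col = fields.index(name, 9)
                        out = open(os.path.join(out_dir, name + ".vcf"), "w")
                        out.writelines(meta)
                        out.write("\t".join(fields[:9] + [name]) + "\n")
                        handles[name] = (col, out)
                    continue
                # Data row: nine fixed columns plus this sample's genotype column.
                fixed = "\t".join(fields[:9])
                for col, out in handles.values():
                    out.write(fixed + "\t" + fields[col] + "\n")
    finally:
        for _, out in handles.values():
            out.close()
    return sorted(handles)
```

Called as `split_vcf_by_sample("merged.vcf.gz", "per_sample")` (both paths are placeholders), this would write `per_sample/<sample>.vcf` for each sample column. It holds one output file open per sample, so with as many samples as the 1000 Genomes data has it may be necessary to pass a smaller `samples` list per run to stay under the open-file limit.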