I have multiple VCF files and want to extract the allele frequency, along with the heterozygous and homozygous calls and their counts. Should I first merge all the VCF files into a single VCF file and then extract this information? If so, I have read that CombineVariants from GATK does this job, but I have never used it. In your opinion, is it suitable for my task?
To extract this information (allele frequency, plus the heterozygous and homozygous calls with their counts), I found that convert2annovar.pl from the ANNOVAR package can help. However, when I tested it on the example VCF file, which contains 3 samples, to get zygosity information with the command below:
perl convert2annovar.pl ex2.vcf --format vcf4 --allsample --withzyg --outfile file1.vcf
It gave me 3 output files, one per sample, each with the heterozygous/homozygous information. Could you please tell me how I can get all of this information in one file? A multi-sample VCF file with 100 samples would produce 100 output files, so how should we handle them?
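To make the target concrete, here is a rough sketch of the single-file summary I am after: one row per variant with the homozygous-reference, heterozygous, homozygous-alternate and missing counts, plus the alternate allele frequency. This is only an illustration in plain Python, not something taken from ANNOVAR, GATK or vcftools. The function name `summarize_genotypes` and the three-sample records are made up for the example, and I am assuming an uncompressed, tab-delimited multi-sample VCF with a GT entry in the FORMAT column. Any non-reference allele is simply counted as "alternate", so multi-allelic sites are not split out.

```python
from collections import Counter


def summarize_genotypes(vcf_lines):
    """Yield one summary dict per variant from the lines of a multi-sample VCF.

    Hypothetical helper for illustration; assumes tab-delimited records
    with a GT key in the FORMAT column (column 9).
    """
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.split("\t")
        chrom, pos, ref, alt = fields[0], fields[1], fields[3], fields[4]
        gt_index = fields[8].split(":").index("GT")

        counts = Counter()
        alt_alleles = 0
        called_alleles = 0
        for sample in fields[9:]:
            # Treat phased (|) and unphased (/) genotypes the same way.
            alleles = sample.split(":")[gt_index].replace("|", "/").split("/")
            if "." in alleles:
                counts["missing"] += 1
                continue
            called_alleles += len(alleles)
            alt_alleles += sum(a != "0" for a in alleles)
            if len(set(alleles)) > 1:
                counts["het"] += 1
            elif alleles[0] == "0":
                counts["hom_ref"] += 1
            else:
                counts["hom_alt"] += 1

        yield {
            "chrom": chrom,
            "pos": int(pos),
            "ref": ref,
            "alt": alt,
            "hom_ref": counts["hom_ref"],
            "het": counts["het"],
            "hom_alt": counts["hom_alt"],
            "missing": counts["missing"],
            # Frequency of non-reference alleles among called alleles.
            "alt_af": alt_alleles / called_alleles if called_alleles else None,
        }


# Made-up records with 3 samples, only to show the shape of the output.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3",
    "1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:12\t1/1:9\t0/0:15",
    "1\t200\t.\tC\tT\t50\tPASS\t.\tGT:DP\t0|1:8\t./.:0\t0/1:11",
]

rows = list(summarize_genotypes(example))
columns = ["chrom", "pos", "ref", "alt", "hom_ref", "het", "hom_alt", "missing", "alt_af"]
print("\t".join(columns))
for row in rows:
    print("\t".join(str(row[c]) for c in columns))
```

For a real file, the same function could presumably be fed an open file handle instead of the `example` list, so that 100 samples would still give one table rather than 100 separate files.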
I also tried vcftools to obtain this information, but it seems to give only the frequency. Does anyone have experience using vcftools to get zygosity information? Suggestions for alternative tools and commands would be highly appreciated.