2.2 years ago
r.kalampaliki
Hello!
I have two VCF files produced by the GATK pipeline, one containing the SNPs and one containing the indels. I would like to count the number of SNPs and the number of indels per sample. The VCF files contain 30 samples, of which I am interested in 11, listed in my_samples_list.txt.
My code:
# Read one sample name per line from the list
while IFS= read -r sample_name
do
    # Subset each VCF to the sample, drop header lines, count the remaining records
    n_indels=$(bcftools view -s "$sample_name" recal_indels.vcf | grep -v "^#" | wc -l)
    n_snps=$(bcftools view -s "$sample_name" recal_snps.vcf | grep -v "^#" | wc -l)
    printf "%s\t%s\t%s\n" "$sample_name" "$n_snps" "$n_indels" >> n_of_variants.txt
done < ./my_samples_list.txt
Desired output:
sample_name #_snps #_indels
HG00321 111 222
HG00323 333 444
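
One thing to keep in mind with a loop like the one above: `bcftools view -s` subsets the sample columns but keeps every site, so counting the output lines gives the same number for each sample. A minimal sketch of a per-sample count follows, assuming the VCFs are uncompressed, that the samples are called jointly, and that GT is the first FORMAT field. A record is counted only when the sample's genotype contains a non-reference allele. The function names `count_sample_variants` and `write_variant_table` are made up for illustration and are not part of bcftools or GATK.

```shell
# Hypothetical helper: count the records in an uncompressed VCF where the
# given sample carries at least one non-reference (ALT) allele.
count_sample_variants() {
    local sample="$1" vcf="$2"
    awk -F'\t' -v sample="$sample" '
        /^##/ { next }                        # skip meta-information lines
        /^#CHROM/ {                           # header line: find the sample column
            for (i = 10; i <= NF; i++) if ($i == sample) col = i
            if (!col) { bad = 1; print "sample not found: " sample > "/dev/stderr"; exit 1 }
            next
        }
        {
            split($col, fields, ":")          # assumption: GT is the first FORMAT field
            if (fields[1] ~ /[1-9]/) n++      # genotype has an ALT allele index (1, 2, ...)
        }
        END { if (bad) exit 1; print n + 0 }  # n + 0 prints 0 when nothing was counted
    ' "$vcf"
}

# Hypothetical helper: print a tab-separated table with a header row,
# one line per sample name listed in the first argument.
write_variant_table() {
    local list="$1" snps_vcf="$2" indels_vcf="$3" sample_name
    printf 'sample_name\t#_snps\t#_indels\n'
    while IFS= read -r sample_name; do
        [ -n "$sample_name" ] || continue     # ignore blank lines in the list
        printf '%s\t%s\t%s\n' "$sample_name" \
            "$(count_sample_variants "$sample_name" "$snps_vcf")" \
            "$(count_sample_variants "$sample_name" "$indels_vcf")"
    done < "$list"
}
```

With the file names from the post, this could be called as `write_variant_table my_samples_list.txt recal_snps.vcf recal_indels.vcf > n_of_variants.txt`. Missing genotypes (`./.`) and homozygous-reference genotypes (`0/0`) are not counted.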