Proper script to automate WGS analysis, specifically with bcftools
8 weeks ago
FL512

Dear everyone,

I have successfully extracted the necessary information using the command below: bcftools query -f '%CHROM \t%POS\n' sample1.vcf.gz > sample1.txt

However, I have more than 100 samples, so I tried to automate the process with the command below: bcftools query -f '%CHROM \t%POS\n' *.vcf.gz > *.txt

I then got an error saying "failed to read from sample2.vcf: not compressed with bgzip". So I made bgzip-compressed files with the following command, htslib/bgzip -c sample1.vcf > sample1.vcf.gz, and reran the command above. It looked OK and no errors showed up, but I got only one txt file in the end, not one for each VCF file.

1. My question is how I can fix this problem, or how I can write a proper script to automate this process.
2. A related question: how do researchers and engineers automate this kind of repetitive work? When I compressed the VCF files with bgzip to make vcf.gz files, I also did it one by one manually, and I would like to automate that with a proper script as well.
WGS bcftools linux automation

How do I loop over multiple files in the terminal?

Please google the terms "loop" or "GNU parallel". This is basic Unix, which you should learn.
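
To make the loop idea concrete, here is a minimal sketch, not a definitive pipeline. It assumes bgzip (from htslib) and bcftools are on your PATH, and that the inputs sit in the current directory with names like sample1.vcf, sample2.vcf. The function names compress_all and extract_positions are made up for illustration. In the original attempt the shell, not bcftools, handles `> *.txt`, so the whole command has only one redirect target. Running one command per file with its own output name avoids that.

```shell
#!/usr/bin/env bash
# Sketch only: assumes bgzip and bcftools are installed and on PATH,
# and that the inputs are named <sample>.vcf in the current directory.

# Compress every plain-text VCF that does not have a .vcf.gz yet.
compress_all() {
    for vcf in *.vcf; do
        [ -e "$vcf" ] || continue          # the glob matched nothing
        [ -e "$vcf.gz" ] && continue       # already compressed, skip
        bgzip -c "$vcf" > "$vcf.gz"
    done
}

# Write CHROM and POS for each sample into its own <sample>.txt.
# The format string is the one from the question, minus the stray
# space before \t.
extract_positions() {
    for gz in *.vcf.gz; do
        [ -e "$gz" ] || continue           # the glob matched nothing
        # ${gz%.vcf.gz} strips the suffix: sample1.vcf.gz -> sample1
        bcftools query -f '%CHROM\t%POS\n' "$gz" > "${gz%.vcf.gz}.txt"
    done
}
```

After pasting the functions into your shell (or sourcing the file), run compress_all and then extract_positions in the directory that holds the VCF files. Each sampleN.vcf.gz should then get its own sampleN.txt.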


Thank you very much! I will google it right away, and sorry for the naive question.

