Summary: my grep command does not work properly when applied to a VCF, but it worked fine on a dummy test file. On the VCF, it pulls too many records.
I am trying to pull all records with 1000 Genomes AF < 0.5 from a VCF. The VCF is annotated, and for each SNP the AF from 1000 Genomes is stored in the INFO column under "controls_AF_popmax".
This is the surrounding area from an entry:
This is my grep:
zless my_file.vcf.gz | grep -v '^#' | grep ';controls_AF_popmax=0\.0[0-4]\|;controls_AF_popmax=\.;' > output.txt
It is pulling records whose AF value is anywhere between 1 and 2.338e-05, as well as records where the value is ".".
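For comparison, one alternative to text matching is to compare the value numerically, since a decimal pattern cannot recognise scientific notation such as 2.338e-05. The following is only a minimal sketch: `filter_af` is a hypothetical helper name, and it assumes that the input is tab-separated VCF text with INFO in column 8, that `controls_AF_popmax` holds a single value per record, and that records with "." should be kept.

```shell
# filter_af CUTOFF: hypothetical helper, not part of any tool.
# Reads VCF text on stdin and prints the data lines whose
# controls_AF_popmax is "." or numerically below CUTOFF.
filter_af() {
  awk -F'\t' -v cutoff="$1" '
    /^#/ { next }                          # skip header lines
    {
      n = split($8, info, ";")             # INFO is assumed to be column 8
      for (i = 1; i <= n; i++) {
        # match the key only at the start of an INFO entry
        if (index(info[i], "controls_AF_popmax=") == 1) {
          val = substr(info[i], length("controls_AF_popmax=") + 1)
          # val + 0 forces a numeric comparison, so 2.338e-05 is read as a number
          if (val == "." || val + 0 < cutoff + 0) print
          break
        }
      }
    }'
}
```

It could be used as `zcat my_file.vcf.gz | filter_af 0.05 > output.txt`, where 0.05 is the cutoff implied by the `0\.0[0-4]` pattern. For comma-separated (multi-allelic) values, only the first number would be compared.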
I tried the same command on a test .txt file and it worked as expected:
where the result is: