Subset a VCF on an INFO field
2.8 years ago
gevikoj888 • 0

Hi, I am starting with the latest VCF from ClinVar and I am trying to recover only the Pathogenic variants. To do so, I ran the following:

grep "CLNSIG=Pathogenic" ClinVar.vcf > body.vcf

I then copied the header into body.vcf and used the resulting file. However, several tools reject it with the following error:

not a proper VCF-file.

I don't know what I am doing wrong, so I am now looking for a way to filter variants on the INFO field of a VCF.

Thanks a lot

bcftools vcftools vcf clinvar • 990 views
2.8 years ago

grep "CLNSIG=Pathogenic" ClinVar.vcf > body.vcf

With the command above, you're removing the VCF header: header lines start with '#' and do not contain the pattern, so grep discards them.

You want:

grep -E '(^#|CLNSIG=Pathogenic)' ClinVar.vcf > body.vcf

Or, better and safer:

bcftools view -i 'INFO/CLNSIG="Pathogenic"' -o body.vcf ClinVar.vcf
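One caveat with the grep approach is that it matches substrings, so "CLNSIG=Pathogenic" also matches records such as CLNSIG=Pathogenic/Likely_pathogenic. If bcftools is not available, a minimal awk sketch can keep the header and require an exact CLNSIG=Pathogenic key-value pair in the INFO column. This is only an illustration, not the method used above: the file names demo.vcf and demo.pathogenic.vcf and the three records are made up for the demo and are not real ClinVar data.

```shell
# Build a tiny, made-up VCF (illustrative records only, not real ClinVar data).
printf '##fileformat=VCFv4.1\n' > demo.vcf
printf '##INFO=<ID=CLNSIG,Number=.,Type=String,Description="Clinical significance">\n' >> demo.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> demo.vcf
printf '1\t100\trs1\tA\tG\t.\t.\tALLELEID=1;CLNSIG=Pathogenic\n' >> demo.vcf
printf '1\t200\trs2\tC\tT\t.\t.\tALLELEID=2;CLNSIG=Benign\n' >> demo.vcf
printf '1\t300\trs3\tG\tA\t.\t.\tALLELEID=3;CLNSIG=Pathogenic/Likely_pathogenic\n' >> demo.vcf

# Pass header lines through unchanged. For each record, split the INFO
# column (column 8) on ';' and print the record only if one entry is
# exactly "CLNSIG=Pathogenic".
awk -F'\t' '
  /^#/ { print; next }
  {
    n = split($8, kv, ";")
    for (i = 1; i <= n; i++)
      if (kv[i] == "CLNSIG=Pathogenic") { print; break }
  }
' demo.vcf > demo.pathogenic.vcf
```

Here demo.pathogenic.vcf keeps the three header lines and only the rs1 record. Whether combined values such as Pathogenic/Likely_pathogenic should also be kept depends on the analysis; if they should, the comparison in the awk script needs to be loosened accordingly.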
