Hello, I have VEP-annotated VCF files with the following content:
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  file
chr1    183937  .       G       A       58.9    PASS    CSQ=||||||||||||MODIFIER|FO538757.1|ENSG00000279928|ENST00000624431|unprocessed_pseudogene||4/4|||||;AC=1;AN=2  GT:GQ:DP:AD:VAF:PL      0/1:51:26:15,11:0.423077:58,0,51
chr1    601436  .       C       T       4.9     PASS    CSQ=||||||||||||MODIFIER|AL669831.3|ENSG00000230021|ENST00000634337|processed_transcript|4/5||404||||,||||||||||||MODIFIER|AL669831.3|ENSG00000230021|ENST00000634833|processed_transcript|3/6||317||||;AC=1;AN=2    GT:GQ:DP:AD:VAF:PL      0/1:5:26:19,7:0.269231:3,0,17 
I would like to filter out protein-coding variants, but I get the following errors:
bcftools view -f "protein_coding" file > out
[E::bcf_write] Broken VCF record, the number of columns at chrX:152737049 does not match the number of samples (0 vs 1)
[main_vcfview] Error: cannot write to (null)
bcftools filter -i 'BIOTYPE="protein_coding"' file > aaa 
[filter.c:2491 filters_init1] Error: the tag "BIOTYPE" is not defined in the VCF header
How should I filter such variants when the biotype is stored inside the CSQ field, between pipes?
Thank you!
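For context on the two errors: in `bcftools view`, `-f` is `--apply-filters`, which tests the FILTER column rather than searching the record text, and `BIOTYPE` is not an INFO tag of its own but a pipe-delimited subfield of `CSQ`, so `bcftools filter` cannot see it. Below is a minimal sketch of a plain-text workaround, assuming a tab-delimited VCF in which the biotype sits between two pipes somewhere in the INFO column. `keep_biotype` is a hypothetical helper name, not part of bcftools. It does not parse the CSQ format, so it would also match any other subfield holding the same string, and it keeps a record if any one of its transcripts matches.

```shell
# Usage: keep_biotype BIOTYPE < in.vcf > out.vcf
# Hypothetical helper: keeps header lines and those records whose INFO
# column (column 8) contains BIOTYPE as a whole pipe-delimited subfield.
keep_biotype() {
  awk -F'\t' -v bt="$1" '
    /^#/ { print; next }                  # pass header lines through unchanged
    index($8, "|" bt "|") > 0 { print }   # biotype found between two pipes
  '
}
```

For example, `keep_biotype protein_coding < file > out` would keep only records with at least one protein-coding transcript. If the VCF header still carries the VEP `CSQ` format description, the `split-vep` plugin is likely the cleaner route, along the lines of `bcftools +split-vep -c BIOTYPE -i 'BIOTYPE="protein_coding"' file`, though that depends on the installed bcftools version and on an intact header.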
Hello @storm1907, you have asked many questions in this community, which is totally fine. However, almost none of the answers to those questions have been upvoted or accepted. Please take the time to acknowledge the effort users have invested by upvoting helpful answers and comments. If an answer solved the issue, please accept it.