bcftools filtering on INFO fields
9 months ago

Below is a piece of my VCF, from which I am trying to extract the rows with SUPPORT>10, but I cannot work out how to specify such a filter.

I tried: bcftools filter -sFilterName -e 'INFO/SUPPORT>10' variants.vcf (typo edited). The result is no filtering at all, with all of the input going through. I cannot find a page with working examples.

bcftools 1.9

##ALT=<ID=DUP:TANDEM,Description="Tandem Duplication">
##ALT=<ID=DUP:INT,Description="Interspersed Duplication">
##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of structural variant">
##INFO=<ID=CUTPASTE,Number=0,Type=Flag,Description="Genomic origin of interspersed duplication seems to be deleted">
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant described in this record">
##INFO=<ID=SVLEN,Number=1,Type=Integer,Description="Difference in length between REF and ALT alleles">
##INFO=<ID=SUPPORT,Number=1,Type=Integer,Description="Number of reads supporting this variant">
##INFO=<ID=STD_SPAN,Number=1,Type=Float,Description="Standard deviation in span of merged SV signatures">
##INFO=<ID=STD_POS,Number=1,Type=Float,Description="Standard deviation in position of merged SV signatures">
##INFO=<ID=STD_POS1,Number=1,Type=Float,Description="Standard deviation of breakend 1 position">
##INFO=<ID=STD_POS2,Number=1,Type=Float,Description="Standard deviation of breakend 2 position">
##FILTER=<ID=hom_ref,Description="Genotype is homozygous reference">
##FILTER=<ID=not_fully_covered,Description="Tandem duplication is not fully covered by a single read">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read depth">
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Read depth for each allele">
##FORMAT=<ID=CN,Number=1,Type=Integer,Description="Copy number of tandem duplication (e.g. 2 for one additional copy)">
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  Sample
CA_Cp   0       svim.BND.1      N       ]CA_Cp:150583]N 1       PASS    SVTYPE=BND;SUPPORT=1;STD_POS1=.;STD_POS2=.      GT:DP:AD        ./.:.:.,.
CA_Cp   2       svim.BND.2      N       ]CA_Cp:152088]N 3       PASS    SVTYPE=BND;SUPPORT=3;STD_POS1=1;STD_POS2=500    GT:DP:AD        ./.:.:.,.
CA_Cp   3       svim.INV.1      N       <INV>   0       PASS    SVTYPE=INV;END=85158;SUPPORT=95;STD_SPAN=3.49;STD_POS=1.44      GT:DP:AD        ./.:.:.,.
CA_Cp   3       svim.BND.3      N       ]CA_Cp:153260]N 4       PASS    SVTYPE=BND;SUPPORT=4;STD_POS1=2;STD_POS2=418    GT:DP:AD        ./.:.:.,.
CA_Cp   7       svim.BND.4      N       ]CA_Cp:154652]N 29      PASS    SVTYPE=BND;SUPPORT=27;STD_POS1=26;STD_POS2=309  GT:DP:AD        ./.:.:.,.
CA_Cp   7       svim.BND.5      N       ]CA_Cp:155004]N 89      PASS    SVTYPE=BND;SUPPORT=87;STD_POS1=17;STD_POS2=268  GT:DP:AD        ./.:.:.,.
CA_Cp   7       svim.DUP_TANDEM.1       N       <DUP:TANDEM>    1       not_fully_covered       SVTYPE=DUP:TANDEM;END=87179;SVLEN=87172;SUPPORT=1;STD_SPAN=.;STD_POS=.  GT:CN:DP:AD
CA_Cp   8       svim.BND.6      N       ]CA_Cp:154580]N 27      PASS    SVTYPE=BND;SUPPORT=25;STD_POS1=26;STD_POS2=309  GT:DP:AD        ./.:.:.,.
CA_Cp   8       svim.BND.7      N       ]CA_Cp:155166]N 94      PASS    SVTYPE=BND;SUPPORT=985;STD_POS1=17;STD_POS2=88  GT:DP:AD        ./.:.:.,.
CA_Cp   11      svim.BND.8      N       ]CA_Cp:154272]N 15      PASS    SVTYPE=BND;SUPPORT=14;STD_POS1=35;STD_POS2=259  GT:DP:AD        ./.:.:.,.
CA_Cp   13      svim.BND.9      N       ]CA_Cp:154140]N 13      PASS    SVTYPE=BND;SUPPORT=12;STD_POS1=38;STD_POS2=326  GT:DP:AD        ./.:.:.,.
CA_Cp   17      svim.BND.10     N       ]CA_Cp:153931]N 9       PASS    SVTYPE=BND;SUPPORT=9;STD_POS1=44;STD_POS2=367   GT:DP:AD        ./.:.:.,.
CA_Cp   122     svim.DEL.1      N       <DEL>   1       PASS    SVTYPE=DEL;END=165;SVLEN=-43;SUPPORT=1;STD_SPAN=.;STD_POS=.     GT:DP:AD        ./.:.:.,.
CA_Cp   368     svim.DEL.2      N       <DEL>   1       PASS    SVTYPE=DEL;END=424;SVLEN=-56;SUPPORT=1;STD_SPAN=.;STD_POS=.     GT:DP:AD        ./.:.:.,.
CA_Cp   699     svim.BND.11     N       ]CA_Cp:153209]N 1       PASS    SVTYPE=BND;SUPPORT=1;STD_POS1=.;STD_POS2=.      GT:DP:AD        ./.:.:.,.
CA_Cp   910     svim.DEL.3      N       <DEL>   1       PASS    SVTYPE=DEL;END=970;SVLEN=-60;SUPPORT=1;STD_SPAN=.;STD_POS=.     GT:DP:AD        ./.:.:.,.
CA_Cp   1346    svim.DEL.4      N       <DEL>   1       PASS    SVTYPE=DEL;END=1397;SVLEN=-51;SUPPORT=1;STD_SPAN=.;STD_POS=.    GT:DP:AD        ./.:.:.,.
CA_Cp   1547    svim.INS.1      N       <INS>   1       PASS    SVTYPE=INS;END=1547;SVLEN=56;SUPPORT=1;STD_SPAN=.;STD_POS=.     GT:DP:AD        ./.:.:.,.

Isn't it INFO/ and not INFO\? Maybe that's why the filter doesn't work well.


You are right, only '/' is valid, but it still does not work for me. Should the VCF be bgzipped and indexed, or does this normally work on plain text?


One of those bad days: I used -e (exclude) instead of -i (include). Sorry about this! It now works (of course).
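
For anyone landing here later, the -i / -e mix-up can be sketched in a few lines of plain Python. This is only an illustration of the include/exclude logic on the INFO column, not how bcftools is implemented. The helper names (info_value, select) and the RECORDS list are made up for this example, and the records are four rows copied from the question, split on any whitespace instead of tabs.

```python
# Hypothetical helpers that illustrate include vs exclude on an INFO key.
# Assumption: records are whitespace-separated and INFO is the 8th column.

RECORDS = [
    "CA_Cp 2 svim.BND.2 N ]CA_Cp:152088]N 3 PASS SVTYPE=BND;SUPPORT=3;STD_POS1=1;STD_POS2=500 GT:DP:AD ./.:.:.,.",
    "CA_Cp 3 svim.INV.1 N <INV> 0 PASS SVTYPE=INV;END=85158;SUPPORT=95;STD_SPAN=3.49;STD_POS=1.44 GT:DP:AD ./.:.:.,.",
    "CA_Cp 11 svim.BND.8 N ]CA_Cp:154272]N 15 PASS SVTYPE=BND;SUPPORT=14;STD_POS1=35;STD_POS2=259 GT:DP:AD ./.:.:.,.",
    "CA_Cp 17 svim.BND.10 N ]CA_Cp:153931]N 9 PASS SVTYPE=BND;SUPPORT=9;STD_POS1=44;STD_POS2=367 GT:DP:AD ./.:.:.,.",
]


def info_value(record, key):
    """Return the raw string value of an INFO key, or None if the key is absent."""
    info = record.split()[7]
    for field in info.split(";"):
        name, _, value = field.partition("=")
        if name == key:
            return value
    return None


def select(records, key, threshold, mode="include"):
    """Keep records where KEY > threshold (include) or drop them (exclude)."""
    kept = []
    for rec in records:
        raw = info_value(rec, key)
        # Missing values ('.' or no key) never satisfy the comparison.
        matches = raw not in (None, ".") and float(raw) > threshold
        if (mode == "include") == matches:
            kept.append(rec)
    return kept


if __name__ == "__main__":
    for mode in ("include", "exclude"):
        ids = [rec.split()[2] for rec in select(RECORDS, "SUPPORT", 10, mode)]
        print(mode, ids)
```

With mode="include" only svim.INV.1 and svim.BND.8 (SUPPORT 95 and 14) are kept, which is what the question asked for. With mode="exclude" exactly those two are dropped and the low-support records remain. Separately, as far as I understand the bcftools documentation, adding -s to bcftools filter only writes the filter name into the FILTER column of failing records instead of removing them, which would also explain why every input row came through in the original command.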


Glad you found the problem. In the future, please use Add Comment when you're adding a comment or Add Reply when you're replying to a comment. Only use Add Answer when you're answering the top-level question.

