Question: Filtering VCF file by INFO flag
cl10101 wrote (3.4 years ago):

I am trying to keep only the variants that carry a dbSNP annotation. In my VCF file this information is stored in the INFO column, like this:

Y       59003592        .       A       G       .       .       NS=1;AN=1;AC=1;CGA_XR=dbsnp.96|rs2140187;CGA_SDO=2      GT:PS:FT:GQ:HQ:EHQ:CGA_CEHQ:GL:CGA_CEGL:DP:AD:CGA_RDP        1:.:PASS:2035:2035,.:2035,.:44,.:-2035,0:-44,0:157:146,.:11

Information from header:

##INFO=<ID=CGA_XR,Number=A,Type=String,Description="Per-ALT external database reference (dbSNP, COSMIC, etc)">

According to the vcftools documentation, there is an option to keep sites with a specific INFO flag (--keep-INFO). I tried to use it like this:

vcftools --vcf file.vcf --out output --keep-INFO CGA_XR

but without success:

VCFtools - 0.1.15
(C) Adam Auton and Anthony Marcketta 2009

Parameters as interpreted:
    --vcf file.vcf
    --out output
    --keep-INFO CGA_XR

After filtering, kept 1 out of 1 Individuals
Error: Using INFO flag filtering on non flag type CGA_XR will not work correctly.

What is the proper usage of this function?

Shane McCarthy wrote (3.4 years ago):

bcftools view -i 'CGA_XR ~"dbsnp"' file.vcf

Pierre Lindenbaum wrote (3.4 years ago):

Using vcffilterjs:

 java -jar dist/vcffilterjs.jar -e 'variant.getAttributeAsString("CGA_XR","").startsWith("dbsnp")' file.vcf

WouterDeCoster wrote (3.4 years ago):

You could try:

cat <(grep '^#' yourfile.vcf) <(grep '|rs' yourfile.vcf) > outputfile.vcf

The first grep extracts the header lines; the second extracts every line containing the pattern |rs, which is hopefully specific enough. The output of both greps is concatenated with cat to create the output file.
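
If matching |rs anywhere on the line turns out to be too loose, the same idea can be restricted to the INFO column. The following is only a minimal sketch, not a tested pipeline: it assumes a tab-delimited VCF with INFO in column 8 and dbSNP entries of the form shown in the question (CGA_XR=dbsnp.96|rs2140187). The file names demo.vcf and demo.dbsnp.vcf and the two records are made up for illustration.

```shell
# Build a tiny tab-delimited VCF for illustration (hypothetical records).
printf '##fileformat=VCFv4.1\n' > demo.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> demo.vcf
printf 'Y\t100\t.\tA\tG\t.\t.\tNS=1;CGA_XR=dbsnp.96|rs2140187;CGA_SDO=2\n' >> demo.vcf
printf 'Y\t200\t.\tC\tT\t.\t.\tNS=1;CGA_SDO=2\n' >> demo.vcf

# Keep every header line, plus records whose INFO column (field 8)
# contains a CGA_XR entry that mentions dbsnp.
awk -F'\t' '/^#/ || $8 ~ /(^|;)CGA_XR=[^;]*dbsnp/' demo.vcf > demo.dbsnp.vcf
```

With the two demo records above, only the record at position 100 should pass, together with both header lines.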


Pierre Lindenbaum replied (3.4 years ago):

grep -E "(^#|\|rs)" in.vcf > out.vcf

WouterDeCoster replied (3.4 years ago):

That's a delicious solution!