Question: How to extract sample IDs from VCF into new VCF?
0
gravatar for dec986
5 weeks ago by
dec986230
United States
dec986230 wrote:

I've written a script that extracts VCF IDs from a large VCF, in order to get a subset of the sample IDs (people).

This script copies the header from the original VCF and simply pastes it into the new one. I then extract only the sample IDs that I need for each line in the original VCF.

However, I want to validate that this new VCF is valid, using vcf-validator.

vcf-validator ALL.chr18.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf 2>&1 >/dev/null | head
18:10083 .. AN is 5008, should be 124
18:10083 .. AC is 1, should be 0
...

These errors appear to be caused by the INFO field:

grep -A1 -m1 ^#CHROM ALL.chr18.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf | cut -f 1-19
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  HG00096 NA19625 NA19700 NA19701NA19703  NA19704 NA19707 NA19711 NA19712 NA19713
18  10083   esv3641458  A   <CN0>   100 PASS    AC=1;AF=0.000199681;AN=5008;CIEND=-500,1000;CIPOS=-1000,500;CS=DEL_union;END=63656;NS=2504;SVTYPE=DEL;DP=69995;EAS_AF=0;AMR_AF=0;AFR_AF=0;EUR_AF=0.001;SAS_AF=0;VT=SV;EX_TARGET GT  0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0

so then I try to just replace the INFO field AC=1;AF=0.000199681;AN=5008;CIEND=-500,1000;CIPOS=-1000,500;CS=DEL_union;END=63656;NS=2504;SVTYPE=DEL;DP=69995;EAS_AF=0;AMR_AF=0;AFR_AF=0;EUR_AF=0.001;SAS_AF=0;VT=SV;EX_TARGET GT with just . but then I get bizarre errors like

column NA19923 at 18:10083 .. FORMAT tag [.] not listed in the header

which I don't know how to fix.

Is there a way that I can easily extract given sample IDs from a VCF?

vcf • 117 views
ADD COMMENTlink written 5 weeks ago by dec986230
1
gravatar for Pierre Lindenbaum
5 weeks ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum129k wrote:

I've written a script that extracts VCF IDs from a large VCF, in order to get a subset of the sample IDs (people).

isn't it a job for bcftools view --samples ?

ADD COMMENTlink written 5 weeks ago by Pierre Lindenbaum129k

Indeed, this is the solution

ADD REPLYlink written 5 weeks ago by dec986230
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 672 users visited in the last hour