Question: Unphase a VCF file
3.0 years ago by
shubhamsaini0 wrote:

What is a quick way to unphase all the genotypes in a VCF file? That is, I want all the GT values to be of the form x/y (instead of x|y).

unphase vcf • 1.7k views
modified 3.0 years ago by Pierre Lindenbaum • written 3.0 years ago by shubhamsaini
3.0 years ago by
Kevin Blighe63k wrote:

This sed one-liner in BASH appears to work for me:

sed '/^##/! s/|/\//g' INPUT.vcf > OUTPUT.vcf

...or, to replace directly in the file without creating a new one, use sed -i ...

[tested on linux / Ubuntu 16.04]

The address at the start of the sed command (/^##/!) restricts the substitution to lines that do not begin with ##, so pipe symbols in the VCF meta-information header are left alone. I can't imagine that pipe symbols would be used anywhere else in the VCF main body, other than [possibly] when an annotation program adds custom annotation to the INFO column; in that case those pipes would be replaced too.
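To make the behaviour of the /^##/! address concrete, here is a minimal sketch that runs the same sed command on a tiny made-up VCF. The records, the ANN annotation and the sample names S1 and S2 are hypothetical and only serve as illustration; the line is written with printf so that the tab separators are explicit.

```shell
# Build a tiny, made-up VCF (hypothetical data): a pipe in a ## header line,
# a pipe in the INFO column, and two phased genotypes.
tmp=$(mktemp)
printf '##INFO=<ID=ANN,Number=.,Type=String,Description="Allele|Effect">\n' > "$tmp"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n' >> "$tmp"
printf '1\t100\t.\tA\tG\t.\tPASS\tANN=G|missense\tGT\t0|1\t1|1\n' >> "$tmp"

# Same command as above: skip lines starting with ##,
# replace every | with / on all other lines.
out=$(sed '/^##/! s/|/\//g' "$tmp")
printf '%s\n' "$out"
rm -f "$tmp"
```

On this input the ## line keeps its pipe and the genotypes become 0/1 and 1/1, but the pipe inside the INFO value is rewritten as well, which is the caveat mentioned above.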

Another possibility would be to use awk in BASH in order to specifically change values in a particular column, but this would get cumbersome with multi-sample VCFs.

Kevin

modified 5 weeks ago • written 3.0 years ago by Kevin Blighe

An awk approach which only affects the sample columns (column 10 onward)...

awk -F $'\t' '\ BEGIN {OFS = FS} /^[#]/ {print; next} { for (i = 10; i<=NF; i++) { gsub("\|","/",$i) } print }'

written 7 months ago by travcollier

The above didn't work for me; I had to escape the pipe in the gsub pattern with a double backslash:

awk -F $'\t' ' BEGIN {OFS = FS} /^[#]/ {print; next} { for (i = 10; i<=NF; i++) { gsub("\\|","/",$i) } print }'
modified 5 weeks ago • written 5 weeks ago by jjmmii
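The awk commands above replace every pipe in each sample column, so any other FORMAT subfield that happens to contain a pipe would be changed as well. A possible refinement, sketched here under the assumption that GT is the first colon-separated subfield (which the VCF specification requires whenever GT is present), touches only that subfield. The input record and the PGT subfield in it are made-up illustration data, and the bracket expression [|] is used simply to sidestep the backslash-escaping issue discussed above.

```shell
# Tiny made-up VCF body (hypothetical data): pipes appear in INFO,
# in GT, and in a second FORMAT subfield (PGT).
tmp=$(mktemp)
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n' > "$tmp"
printf '1\t100\t.\tA\tG\t.\tPASS\tANN=G|missense\tGT:PGT\t0|1:0|1\t1|1:.\n' >> "$tmp"

unphased=$(awk -F '\t' 'BEGIN {OFS = FS}
/^#/ {print; next}                       # header lines pass through unchanged
{
  for (i = 10; i <= NF; i++) {           # sample columns only
    n = index($i, ":")                   # position of the first subfield separator
    gt   = (n > 0) ? substr($i, 1, n - 1) : $i
    rest = (n > 0) ? substr($i, n) : ""
    gsub(/[|]/, "/", gt)                 # unphase the GT subfield only
    $i = gt rest
  }
  print
}' "$tmp")
printf '%s\n' "$unphased"
rm -f "$tmp"
```

With this input, the INFO value and the PGT subfield keep their pipes, and only the leading genotype of each sample is rewritten.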
3.0 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum129k wrote:

using vcffilterjdk: http://lindenb.github.io/jvarkit/VcfFilterJdk.html

java -jar dist/vcffilterjdk.jar -e 'return new VariantContextBuilder(variant).genotypes(variant.getGenotypes().stream().map(G->new GenotypeBuilder(G).phased(false).make()).collect(Collectors.toList())).make();' input.vcf

tested with:

wget -O - "https://github.com/vcflib/vcflib/blob/master/samples/scaffold612.phased.vcf?raw=true" | java -jar dist/vcffilterjdk.jar -e 'return new VariantContextBuilder(variant).genotypes(variant.getGenotypes().stream().map(G->new GenotypeBuilder(G).phased(false).make()).collect(Collectors.toList())).make();'
written 3.0 years ago by Pierre Lindenbaum