Question: Extract several fields from vcf file
13 months ago by
Korea, Republic Of
kelly.wang13530 wrote:

Hello all, I'd like to reduce my VCF files so that they contain only a few FORMAT fields. The FORMAT column currently holds "GT:AD:DP:FT:GQ:PL:PP", but I want to keep only "GT", "DP", and "GQ".

I ran vcftools with "--extract-FORMAT-info", but the output is not in VCF format. What I want in the end is a VCF file containing only the GT, DP, and GQ fields.

Does anyone know how to handle this? Thanks.

vcf • 861 views
ADD COMMENTlink modified 5 months ago by erdiazval30 • written 13 months ago by kelly.wang13530
1

Have you tried AWK already?

ADD REPLYlink written 13 months ago by b.nota4.0k
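
For readers who want to try the AWK route suggested above, here is a minimal sketch. It assumes an uncompressed, tab-delimited VCF with FORMAT in column 9 and samples from column 10 onward. The name subset_format is a hypothetical helper, not part of any tool mentioned in this thread. The sketch leaves the ##FORMAT header declarations of the dropped fields in place, and it has not been validated against pseq, vtools, or rvtest.

```shell
#!/bin/bash
# Sketch only: keep selected FORMAT keys (default GT,DP,GQ) in an uncompressed VCF.
# subset_format is a hypothetical helper name used for illustration.
# Usage: subset_format input.vcf [KEY1,KEY2,...] > output.vcf
subset_format() {
    local keep="${2:-GT,DP,GQ}"
    awk -v keep="$keep" '
    BEGIN {
        FS = OFS = "\t"
        n = split(keep, k, ",")
        for (i = 1; i <= n; i++) want[k[i]] = 1
    }
    /^#/    { print; next }   # header lines pass through unchanged
    NF < 10 { print; next }   # no FORMAT/sample columns: nothing to subset
    {
        # Record the positions of the wanted keys in this record FORMAT column.
        nf = split($9, keys, ":")
        m = 0
        for (i = 1; i <= nf; i++) if (keys[i] in want) idx[++m] = i

        fmt = ""
        for (j = 1; j <= m; j++) fmt = fmt (j > 1 ? ":" : "") keys[idx[j]]
        $9 = fmt

        # Apply the same positions to every sample column.
        for (c = 10; c <= NF; c++) {
            nv = split($c, vals, ":")
            out = ""
            for (j = 1; j <= m; j++) {
                # Trailing sample fields may be omitted in VCF; emit "." for those.
                v = (idx[j] <= nv) ? vals[idx[j]] : "."
                out = out (j > 1 ? ":" : "") v
            }
            $c = out
        }
        print
    }' "$1"
}
```

Called as `subset_format input.vcf > output.vcf`, this keeps the original key order from the FORMAT column rather than the order given in the list. A dedicated VCF tool is still the safer choice when downstream software is strict about headers.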

Actually, I extracted these fields with Python. But when I tried to analyze the resulting file with rare-variant association tools such as pseq, vtools, and rvtest, something did not work properly.

I wonder whether there are validated tools for this extraction, rather than custom Linux commands or Python scripts.

ADD REPLYlink written 13 months ago by kelly.wang13530

Can you please provide a line having GT:AD:DP:FT:GQ:PL:PP from the vcf file?

ADD REPLYlink written 13 months ago by Vijay Lakhujani2.5k

Thanks, but I am looking for tools to handle this.

ADD REPLYlink written 13 months ago by kelly.wang13530
13 months ago by
Len Trigg1.1k
New Zealand
Len Trigg1.1k wrote:

Using RTG Tools:

rtg vcfsubset -i input.vcf.gz -o output.vcf.gz --keep-format GT,DP,GQ
ADD COMMENTlink written 13 months ago by Len Trigg1.1k
13 months ago by
nsmi844630
nsmi844630 wrote:

Have you come across SnpSift in your search for tools to do this?

http://snpeff.sourceforge.net/SnpSift.html

Section 10 in the link above (SnpSift documentation) could potentially be useful: 10. SnpSift Extract Fields

ADD COMMENTlink written 13 months ago by nsmi844630
5 months ago by
erdiazval30
erdiazval30 wrote:

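I would use the SnpSift program. Here is an example of how I use it to extract position, ref allele, alt allele, allele depth (AD from the genotype field), and functional annotation by gene ID.

#!/bin/bash
# Run SnpSift extractFields on every VCF file in the current directory.
# Note: extractFields writes a tab-separated table, not a VCF.
for i in *.vcf; do
    # The space before each trailing backslash keeps the arguments separate;
    # the quotes stop the shell from glob-expanding [*].
    java -jar /data/software/snpEff/snpEff/SnpSift.jar \
        extractFields "$i" \
        POS REF ALT "GEN[*].AD" "ANN[*].GENEID" > "filt_${i}"
done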
ADD COMMENTlink written 5 months ago by erdiazval30