Question: Extract several fields from vcf file
1
2.3 years ago by
Korea, Republic Of
kelly.wang13530 wrote:

Hello all, I'd like to reduce my VCF files so that they contain only a few FORMAT fields. The FORMAT column currently holds "GT:AD:DP:FT:GQ:PL:PP", but I want to keep only "GT", "DP" and "GQ".

I ran vcftools with "--extract-FORMAT-info", but the resulting file is not in VCF format. What I ultimately want is a VCF file containing only the "GT", "DP" and "GQ" fields.

Does anyone know how to handle this? Thanks.

vcf • 1.8k views
modified 19 months ago by erdiazval40 • written 2.3 years ago by kelly.wang13530
1

Have you tried AWK already?
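
For illustration, a minimal sketch of what an AWK approach could look like. It assumes an uncompressed, tab-delimited VCF with FORMAT in column 9 and samples from column 10 onward, and that every record has at least one of the wanted tags. `subset_format` is a hypothetical helper name, not part of any existing tool. This is not a validated tool, so the output should be checked before it goes into downstream software.

```shell
#!/bin/bash
# Sketch: keep only GT, DP and GQ in the FORMAT and sample columns of a VCF.
# subset_format is a hypothetical helper name, not part of any tool.
subset_format() {
    awk -v keep="GT,DP,GQ" '
    BEGIN {
        FS = OFS = "\t"
        n = split(keep, k, ",")
        for (i = 1; i <= n; i++) want[k[i]] = 1
    }
    # Drop ##FORMAT header lines for tags that are being removed.
    /^##FORMAT=<ID=/ {
        id = $0
        sub(/^##FORMAT=<ID=/, "", id)
        sub(/,.*/, "", id)
        if (id in want) print
        next
    }
    # Pass every other header line through unchanged.
    /^#/ { print; next }
    # Records without a FORMAT column are left as they are.
    NF < 9 { print; next }
    {
        m = split($9, tags, ":")
        fmt = ""; sep = ""
        for (j = 1; j <= m; j++) {
            if (tags[j] in want) { fmt = fmt sep tags[j]; sep = ":" }
        }
        for (c = 10; c <= NF; c++) {
            nv = split($c, vals, ":")
            out = ""; sep = ""
            for (j = 1; j <= m; j++) {
                if (!(tags[j] in want)) continue
                # Trailing sample fields may be omitted in VCF; use "." then.
                v = (j <= nv) ? vals[j] : "."
                out = out sep v
                sep = ":"
            }
            $c = out
        }
        $9 = fmt
        print
    }' "$@"
}

# Usage (file names are placeholders):
#   subset_format input.vcf > output.vcf
```

The kept tags stay in their original order, so GT remains first in FORMAT, and the ##FORMAT header lines of removed tags are dropped so that the header matches the records.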

written 2.3 years ago by Benn7.4k

Actually, I extracted these fields with Python. But when I tried to analyze the resulting file with rare-variant association tools such as pseq, vtools, and rvtest, something did not work properly.

I wonder if there are validated tools for this extraction, rather than Linux commands or Python.

written 2.3 years ago by kelly.wang13530

Can you please provide a line containing GT:AD:DP:FT:GQ:PL:PP from the VCF file?

written 2.3 years ago by lakhujanivijay4.3k

Thanks, but I am looking for tools to handle this.

written 2.3 years ago by kelly.wang13530
2
2.2 years ago by
Len Trigg1.3k
New Zealand
Len Trigg1.3k wrote:

Using RTG Tools:

rtg vcfsubset -i input.vcf.gz -o output.vcf.gz --keep-format GT,DP,GQ
written 2.2 years ago by Len Trigg1.3k
1
2.2 years ago by
nsmi8446120
nsmi8446120 wrote:

Have you come across SnpSift in your search for tools to do this?

http://snpeff.sourceforge.net/SnpSift.html

Section 10 in the link above (SnpSift documentation) could potentially be useful: 10. SnpSift Extract Fields

written 2.2 years ago by nsmi8446120
1
19 months ago by
erdiazval40
erdiazval40 wrote:

I would use the SnpSift program. This is an example of how I use it to extract position, ref allele, alt allele, allele depth (AD from the genotype field), and functional annotation by gene ID.

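#!/bin/bash
# Run SnpSift extractFields on every VCF in the current directory.
# Note: the output is a tab-separated table, not a VCF.
for i in *.vcf; do
    java -jar /data/software/snpEff/snpEff/SnpSift.jar \
        extractFields "$i" \
        POS REF ALT "GEN[*].AD" "ANN[*].GENEID" > "filt_${i}"
done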
written 19 months ago by erdiazval40