Question: Extract several fields from vcf file
1
9 months ago by
Korea, Republic Of
kelly.wang13530 wrote:

Hello all, I'd like to modify my VCFs so that they contain only a few fields. The FORMAT column currently has "GT:AD:DP:FT:GQ:PL:PP", but I want to keep only "GT", "DP", and "GQ".

I ran vcftools with "--extract-FORMAT-info", but the resulting file is not in VCF format. What I ultimately want is a VCF file with only the "GT", "DP", and "GQ" fields.

Does anyone know how to handle this? Thanks.

vcf • 559 views
ADD COMMENTlink modified 4 weeks ago by erdiazval10 • written 9 months ago by kelly.wang13530
1

Have you tried AWK already?

ADD REPLYlink written 9 months ago by b.nota3.6k
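
For illustration, here is a minimal sketch of what an AWK approach could look like. It is not taken from any of the tools mentioned in this thread: `subset_format` is a hypothetical helper name, and the sketch assumes a tab-delimited, uncompressed VCF with the FORMAT column in column 9. It also drops the `##FORMAT` header lines of the removed fields so the header stays consistent with the records, and pads samples with truncated genotype fields using ".".

```shell
#!/bin/bash
# Sketch: keep only selected FORMAT subfields (default GT,DP,GQ) in a VCF.
# subset_format is a hypothetical helper name, not part of any tool.
subset_format() {
  awk -v keep="${1:-GT,DP,GQ}" '
    BEGIN {
      FS = OFS = "\t"
      n = split(keep, k, ",")
      for (i = 1; i <= n; i++) want[k[i]] = 1
    }
    # Drop ##FORMAT header lines for fields that are not kept.
    /^##FORMAT=<ID=/ {
      id = $0
      sub(/^##FORMAT=<ID=/, "", id)
      sub(/,.*/, "", id)
      if (id in want) print
      next
    }
    # Pass all other header lines through unchanged.
    /^#/ { print; next }
    # Records without sample columns are passed through unchanged.
    NF < 10 { print; next }
    {
      # Find the positions of the kept subfields in the FORMAT column.
      m = split($9, f, ":")
      fmt = ""
      nk = 0
      for (i = 1; i <= m; i++)
        if (f[i] in want) { idx[++nk] = i; fmt = fmt (nk > 1 ? ":" : "") f[i] }
      $9 = fmt
      # Rewrite each sample column using the same positions;
      # subfields missing from a truncated sample become ".".
      for (c = 10; c <= NF; c++) {
        split($c, v, ":")
        out = ""
        for (j = 1; j <= nk; j++)
          out = out (j > 1 ? ":" : "") (idx[j] in v ? v[idx[j]] : ".")
        $c = out
      }
      print
    }
  '
}

# Usage (file names are placeholders):
#   subset_format GT,DP,GQ < input.vcf > output.vcf
```

The kept subfields stay in their original FORMAT order (so GT remains first), regardless of the order given in the argument. This is only a sketch, so it is worth validating the output with the downstream tool before relying on it.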

Actually, I extracted these fields with Python. But when I tried to analyze the resulting file with rare-variant association tools such as pseq, vtools, and rvtest, something did not work properly.

I wonder if there are validated tools for this extraction, rather than Linux commands or Python.

ADD REPLYlink written 9 months ago by kelly.wang13530

Can you please provide a line having GT:AD:DP:FT:GQ:PL:PP from the vcf file?

ADD REPLYlink written 9 months ago by Vijay Lakhujani1.7k

Thanks, but I am looking for tools to handle this.

ADD REPLYlink written 9 months ago by kelly.wang13530
2
9 months ago by
Len Trigg1.1k
New Zealand
Len Trigg1.1k wrote:

Using RTG Tools:

rtg vcfsubset -i input.vcf.gz -o output.vcf.gz --keep-format GT,DP,GQ
ADD COMMENTlink written 9 months ago by Len Trigg1.1k
1
9 months ago by
nsmi844620
nsmi844620 wrote:

Have you come across SnpSift in your search for tools to do this?

http://snpeff.sourceforge.net/SnpSift.html

Section 10 of the SnpSift documentation linked above, "SnpSift Extract Fields", could potentially be useful.

ADD COMMENTlink written 9 months ago by nsmi844620
0
4 weeks ago by
erdiazval10
erdiazval10 wrote:

I would use SnpSift. Here is an example of how I use it to extract the position, ref allele, alt allele, allele depth (AD, from the genotype field), and functional annotation by gene ID.

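#!/bin/bash
# Run SnpSift extractFields on every VCF in the current directory.
# Note: the output is a tab-delimited table, not a VCF.
for i in *.vcf; do
  java -jar /data/software/snpEff/snpEff/SnpSift.jar \
    extractFields "$i" \
    POS REF ALT "GEN[*].AD" "ANN[*].GENEID" > "filt_${i}"
done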
ADD COMMENTlink written 4 weeks ago by erdiazval10