Question: When I remove an individual with vcftools --remove-indv, I lose all the information in column 8 (INFO)
2.2 years ago, claudia_io_p wrote:

When I remove individuals or filter the data, I lose the information in column 8 (INFO). How can I keep that information? Please help me!

My original file:

##fileformat=VCFv4.0
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=AD,Number=.,Type=Integer,Description="Allelic depths for the reference and alternate alleles in the order listed">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth (only filtered reads used for calling)">
##FORMAT=<ID=GQ,Number=1,Type=Float,Description="Genotype Quality">
##FORMAT=<ID=PL,Number=3,Type=Float,Description="Normalized, Phred-scaled likelihoods for AA,AB,BB genotypes where A=ref and B=alt; not applicable if site is not biallelic">
##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of Samples With Data">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=AF,Number=.,Type=Float,Description="Allele Frequency">
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  H1_P13:C615BACXX:3:250445260    ....
1   36078   S1_36078    T   A,C,G   20  PASS    NS=276;DP=2250;AF=0,01,0,01,0,01    GT:AD:DP:GQ:PL  0/0:6,0,0,0:6:98:0,18,216   0/0:6,0,0,0:6:98:0,18,216   0/0:2,0,0,0:2:79:0,6,72 0/0:11,0,0,0:11:99:0,33,255 0/0:7,0,0,0:7:99:0,21,252   0/0:5,0,0,0:5:96:0,15,180   0/0:14,0,0,0:14:99:0,42,255 0/1:7,1,0,0:8:93:12,0,228   0/0:12,0,1,0:13:99:0,36,255 0/0:12,0,0,0:12:99:0,36,255 0/0:8,0,0,0:8:99:0,24,255   0/1:6,1,0,0:7:96:15,0,195   ./. 0/0:17,0,0,0:17:99:0,51,255 ./. 0/0:10,0,0,0:10:99:0,30,255 0/0:5,0,0,0:5:96:0,15,180   0/0:7,0,0,0:7:99:0,21,252   0/0:3,0,0,0:3:88:0,9,108    0/0:1,0,0,0:1:66:0,3,36 0/0:7,0,0,0:7:99:0,21,252   0/0:7,0,0,0:7:99:0,21,252   0/0:6,0,0,0:6:98:0,18,216   0/0:7,0,0,0:7:99:0,21,252   0/0:11,0,0,0:11:99:0,33,255 0/0:7,0,0,0:7:99:0,21,252   0/0:2,0,0,0:2:79:0,6,72 0/0:1,0,0,0:1:66:0,3,36 0/0:4,0,0,0:4:94:0,12,144   0/0:5,0,0,0:5:96:0,15,180   0/1:9,1,0,0:10:79:6,0,255   0/0:4,0,0,0:4:94:0,12,144   0/0:9,0,0,0:9:99:0,27,255   0/0:12,0,0,0:12:99:0,36,255 0/0:10,0,0,0:10:99:0,30,255 0/0:6,0,0,0:6:98:0,18,216   0/0:1,0,0,0:1:66:0,3,36 0/0:13,0,0,0:13:99:0,39,255 0/0:7,0,0,0:7:99:0,21,252   0/0:8,0,0,0:8:99:0,24,255   0/0:3,0,0,0:3:88:0,9,108    0/0:12,0,0,0:12:99:0,36,255 0/0:4,0,0,0:4:94:0,12,144   0/0:8,0,0,0:8:99:0,24,255   0/0:3,0,0,0:3:88:0,9,108    0/3:2,0,0,1:3:99:27,0,63    0/0:13,0,0,0:13:99:0,39,255 0/0:3,0,0,0:3:88:0,9,108    0/0:7,0,0,0:7:99:0,21,252   0/0:5,0,0,0:5:96:0,15,180   0/0:1,0,0,0:1:66:0,3,36 0/0:4,0,0,0:4:94:0,12,144   ./. 0/0:3,0,0,0:3:88:0,9,108    ........

The modified file:

##fileformat=VCFv4.0
##Tassel=<ID=GenotypeTable,Version=5,Description="Reference allele is not known. The major allele was used as reference allele">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=AD,Number=.,Type=Integer,Description="Allelic depths for the reference and alternate alleles in the order listed">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth (only filtered reads used for calling)">
##FORMAT=<ID=GQ,Number=1,Type=Float,Description="Genotype Quality">
##FORMAT=<ID=PL,Number=.,Type=Float,Description="Normalized, Phred-scaled likelihoods for AA,AB,BB genotypes where A=ref and B=alt; not applicable if site is not biallelic">
##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of Samples With Data">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=AF,Number=.,Type=Float,Description="Allele Frequency">
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  H33A_P100:C615BACXX:1:250444958 .......
1   42139   S1_42139    C   A   .   PASS    .   GT  1/0 0/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/1 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 0/0 1/0 1/0 1/1 1/1 1/0 1/0 0/0 1/0 1/0 1/0 0/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 0/0 0/0 1/0 1/0 0/0 1/0 0/0 1/0 1/0 1/0 1/0 1/0 0/0 1/0 1/1 1/0 1/0 1/0 0/0 1/0 0/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 ......

Tags: tassel, snp, vcftools, vcf
2.2 years ago, RamRS (Houston, TX) wrote:

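Please read the manual. To keep column 8 in the recoded output, add --recode-INFO-all (or --recode-INFO <key> to keep a single key) next to --recode. Despite its name, --keep-INFO is a site filter: it keeps sites that carry a given INFO flag, and it does not preserve the INFO column.

Read "INFO FIELD FILTERING" under https://vcftools.github.io/man_latest.html#SITE%20FILTERING%20OPTIONS and the --recode options under "OUTPUT VCF FORMAT" in the same manual.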
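
For anyone landing here later, a minimal sketch of the call, wrapped in Python so the argument list is explicit. It assumes vcftools is on the PATH. The file name input.vcf and the output prefix filtered are hypothetical, and the sample ID is the first one from the header in the question. As far as I can tell from the manual, --recode-INFO-all is the option that carries column 8 over to the recoded file.

```python
import shlex
import subprocess


def build_vcftools_cmd(vcf_path, samples_to_remove, out_prefix):
    """Build a vcftools argument list that drops samples but keeps INFO."""
    cmd = ["vcftools", "--vcf", vcf_path]
    for sample in samples_to_remove:
        # --remove-indv may be given once per sample to drop
        cmd += ["--remove-indv", sample]
    # --recode writes a new VCF; --recode-INFO-all copies every INFO key as-is
    cmd += ["--recode", "--recode-INFO-all", "--out", out_prefix]
    return cmd


def run_vcftools(cmd):
    """Run the command; raises CalledProcessError if vcftools fails."""
    subprocess.run(cmd, check=True)


# Hypothetical file names; the sample ID comes from the question's header line.
cmd = build_vcftools_cmd(
    "input.vcf", ["H1_P13:C615BACXX:3:250445260"], "filtered"
)
print(shlex.join(cmd))
# run_vcftools(cmd)  # uncomment to execute; output goes to filtered.recode.vcf
```

The INFO values are copied unchanged, so fields such as NS, DP and AF still describe the full sample set, not the reduced one.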

claudia_io_p replied 2.2 years ago:

But that does not recalculate the INFO values; it only keeps them.

RamRS replied 2.2 years ago:

It might have helped if you'd mentioned that in the original post :) I'm not sure how INFO can be recalculated on the fly by such a simple tool; maybe a GATK walker (such as SelectVariants) might help.
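
If only NS, DP and AF need refreshing after samples are dropped, they can be recomputed from the remaining genotype columns. The sketch below is one possible way, not what vcftools or GATK does. It assumes NS is the number of samples with a called genotype, DP is the sum of the per-sample DP values, and AF is each ALT allele's count divided by all called alleles. The tool that wrote the original file may define these differently, and the function name recompute_info is made up for this example.

```python
def recompute_info(format_field, samples, n_alt):
    """Recompute NS, DP and AF from the sample columns of one VCF record.

    Assumed definitions (not taken from any tool's documentation):
    NS = samples with a called genotype, DP = sum of per-sample DP,
    AF = ALT allele count / total called alleles.
    """
    keys = format_field.split(":")
    gt_idx = keys.index("GT")
    dp_idx = keys.index("DP") if "DP" in keys else None

    ns = 0
    total_dp = 0
    counts = [0] * (n_alt + 1)  # index 0 is REF, 1..n_alt are the ALT alleles

    for sample in samples:
        fields = sample.split(":")
        gt = fields[gt_idx].replace("|", "/")
        alleles = [a for a in gt.split("/") if a != "."]
        if not alleles:
            continue  # missing genotype such as "./."
        ns += 1
        for allele in alleles:
            counts[int(allele)] += 1
        if dp_idx is not None and dp_idx < len(fields) and fields[dp_idx] != ".":
            total_dp += int(fields[dp_idx])

    called = sum(counts)
    afs = [counts[i] / called if called else 0.0 for i in range(1, n_alt + 1)]
    af_text = ",".join(f"{af:.3f}" for af in afs)
    return f"NS={ns};DP={total_dp};AF={af_text}"


# Four sample columns taken from the record in the question (ALT = A,C,G).
remaining = [
    "0/0:6,0,0,0:6:98:0,18,216",
    "0/1:7,1,0,0:8:93:12,0,228",
    "./.",
    "0/3:2,0,0,1:3:99:27,0,63",
]
print(recompute_info("GT:AD:DP:GQ:PL", remaining, n_alt=3))
# prints NS=3;DP=17;AF=0.167,0.000,0.167
```

The AF values are written with a decimal point because VCF uses the comma to separate one value per ALT allele.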