When I remove an individual with vcftools --remove-indv, I lose all the information in column 8
4.5 years ago

Whenever I remove individuals or filter the data, I lose the information in column 8 (INFO). How can I keep that information? Please help!

My original file:

##fileformat=VCFv4.0
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=AD,Number=.,Type=Integer,Description="Allelic depths for the reference and alternate alleles in the order listed">
##FORMAT=<ID=GQ,Number=1,Type=Float,Description="Genotype Quality">
##FORMAT=<ID=PL,Number=3,Type=Float,Description="Normalized, Phred-scaled likelihoods for AA,AB,BB genotypes where A=ref and B=alt; not applicable if site is not biallelic">
##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of Samples With Data">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=AF,Number=.,Type=Float,Description="Allele Frequency">
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  H1_P13:C615BACXX:3:250445260    ....
1   36078   S1_36078    T   A,C,G   20  PASS    NS=276;DP=2250;AF=0,01,0,01,0,01    GT:AD:DP:GQ:PL  0/0:6,0,0,0:6:98:0,18,216   0/0:6,0,0,0:6:98:0,18,216   0/0:2,0,0,0:2:79:0,6,72 0/0:11,0,0,0:11:99:0,33,255 0/0:7,0,0,0:7:99:0,21,252   0/0:5,0,0,0:5:96:0,15,180   0/0:14,0,0,0:14:99:0,42,255 0/1:7,1,0,0:8:93:12,0,228   0/0:12,0,1,0:13:99:0,36,255 0/0:12,0,0,0:12:99:0,36,255 0/0:8,0,0,0:8:99:0,24,255   0/1:6,1,0,0:7:96:15,0,195   ./. 0/0:17,0,0,0:17:99:0,51,255 ./. 0/0:10,0,0,0:10:99:0,30,255 0/0:5,0,0,0:5:96:0,15,180   0/0:7,0,0,0:7:99:0,21,252   0/0:3,0,0,0:3:88:0,9,108    0/0:1,0,0,0:1:66:0,3,36 0/0:7,0,0,0:7:99:0,21,252   0/0:7,0,0,0:7:99:0,21,252   0/0:6,0,0,0:6:98:0,18,216   0/0:7,0,0,0:7:99:0,21,252   0/0:11,0,0,0:11:99:0,33,255 0/0:7,0,0,0:7:99:0,21,252   0/0:2,0,0,0:2:79:0,6,72 0/0:1,0,0,0:1:66:0,3,36 0/0:4,0,0,0:4:94:0,12,144   0/0:5,0,0,0:5:96:0,15,180   0/1:9,1,0,0:10:79:6,0,255   0/0:4,0,0,0:4:94:0,12,144   0/0:9,0,0,0:9:99:0,27,255   0/0:12,0,0,0:12:99:0,36,255 0/0:10,0,0,0:10:99:0,30,255 0/0:6,0,0,0:6:98:0,18,216   0/0:1,0,0,0:1:66:0,3,36 0/0:13,0,0,0:13:99:0,39,255 0/0:7,0,0,0:7:99:0,21,252   0/0:8,0,0,0:8:99:0,24,255   0/0:3,0,0,0:3:88:0,9,108    0/0:12,0,0,0:12:99:0,36,255 0/0:4,0,0,0:4:94:0,12,144   0/0:8,0,0,0:8:99:0,24,255   0/0:3,0,0,0:3:88:0,9,108    0/3:2,0,0,1:3:99:27,0,63    0/0:13,0,0,0:13:99:0,39,255 0/0:3,0,0,0:3:88:0,9,108    0/0:7,0,0,0:7:99:0,21,252   0/0:5,0,0,0:5:96:0,15,180   0/0:1,0,0,0:1:66:0,3,36 0/0:4,0,0,0:4:94:0,12,144   ./. 0/0:3,0,0,0:3:88:0,9,108    ........



Modified file:

##fileformat=VCFv4.0
##Tassel=<ID=GenotypeTable,Version=5,Description="Reference allele is not known. The major allele was used as reference allele">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=AD,Number=.,Type=Integer,Description="Allelic depths for the reference and alternate alleles in the order listed">
##FORMAT=<ID=GQ,Number=1,Type=Float,Description="Genotype Quality">
##FORMAT=<ID=PL,Number=.,Type=Float,Description="Normalized, Phred-scaled likelihoods for AA,AB,BB genotypes where A=ref and B=alt; not applicable if site is not biallelic">
##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of Samples With Data">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=AF,Number=.,Type=Float,Description="Allele Frequency">
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  H33A_P100:C615BACXX:1:250444958 .......
1   42139   S1_42139    C   A   .   PASS    .   GT  1/0 0/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/1 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 0/0 1/0 1/0 1/1 1/1 1/0 1/0 0/0 1/0 1/0 1/0 0/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 0/0 0/0 1/0 1/0 0/0 1/0 0/0 1/0 1/0 1/0 1/0 1/0 0/0 1/0 1/1 1/0 1/0 1/0 0/0 1/0 0/0 1/0 1/0 1/0 1/0 1/0 1/0 1/0 ......


4.5 years ago
Ram 32k

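Please read the manual. When vcftools recodes a VCF it drops the INFO fields by default, because removing individuals can invalidate values such as the total depth. To carry the original values through, add --recode-INFO-all (or --recode-INFO <key> for a single key) alongside --recode. Note that --keep-INFO, despite being literally called "keep INFO", only filters sites by the presence of an INFO flag :)

Compare the --recode options with "INFO FIELD FILTERING" under https://vcftools.github.io/man_latest.html#SITE%20FILTERING%20OPTIONS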
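
For reference, here is a minimal sketch of assembling such a vcftools call from Python. It relies on the documented --remove-indv, --recode, --recode-INFO-all and --out options; the helper name `build_remove_indv_cmd` and the file names are made up for illustration and do not come from the thread.

```python
import shlex


def build_remove_indv_cmd(vcf_path, samples, out_prefix):
    """Build a vcftools argument list that drops samples but copies INFO through."""
    cmd = ["vcftools", "--vcf", vcf_path]
    for sample in samples:
        # --remove-indv may be repeated, once per individual to drop
        cmd += ["--remove-indv", sample]
    # --recode writes a new VCF; --recode-INFO-all keeps every original INFO key.
    # The values are copied unchanged, not recalculated.
    cmd += ["--recode", "--recode-INFO-all", "--out", out_prefix]
    return cmd


# "input.vcf" and "filtered" are placeholder names (assumptions).
cmd = build_remove_indv_cmd(
    "input.vcf", ["H1_P13:C615BACXX:3:250445260"], "filtered"
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` on a machine where vcftools is installed.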

But that does not recalculate the INFO values; it only keeps the original ones.

Might have helped if you'd mentioned that in the original post :) I'm not sure how INFO can be recalculated on the fly by such a simple tool - maybe a GATK Walker (such as SelectVariants) might help.
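
If no ready-made tool is at hand, the three INFO keys declared in the header (NS, DP, AF) can in principle be recomputed from the per-sample columns once individuals are dropped. The following is only a rough sketch under stated assumptions: NS counts samples with a non-missing GT, DP sums the per-sample DP values, and AF is the share of called alleles carrying each ALT allele. The tool that wrote the original file may define these differently. The function name `recalc_info` and the toy record (four samples pieced together from genotype entries in the question, with a made-up INFO value) are illustrative only.

```python
def recalc_info(fields, drop_idx):
    """Drop samples from one tab-split VCF data line and rebuild NS, DP, AF.

    fields:   list of column values (CHROM ... FORMAT, then samples)
    drop_idx: set of 0-based sample indices to remove
    Other INFO keys are discarded in this sketch.
    """
    fmt_keys = fields[8].split(":")
    samples = [s for i, s in enumerate(fields[9:]) if i not in drop_idx]
    n_alt = len(fields[4].split(","))
    ns, dp_total = 0, 0
    allele_counts = [0] * (n_alt + 1)  # index 0 = REF, 1.. = ALT alleles
    for s in samples:
        call = dict(zip(fmt_keys, s.split(":")))
        alleles = call["GT"].replace("|", "/").split("/")
        if all(a == "." for a in alleles):
            continue  # missing genotype contributes nothing
        ns += 1
        if call.get("DP", ".") != ".":
            dp_total += int(call["DP"])
        for a in alleles:
            if a != ".":
                allele_counts[int(a)] += 1
    total = sum(allele_counts)
    af = [c / total if total else 0.0 for c in allele_counts[1:]]
    info = "NS=%d;DP=%d;AF=%s" % (ns, dp_total, ",".join("%.3g" % x for x in af))
    return fields[:7] + [info, fields[8]] + samples


# Toy record: INFO value and sample selection are assumptions for illustration.
line = "\t".join([
    "1", "36078", "S1_36078", "T", "A,C,G", "20", "PASS", "NS=3;DP=17",
    "GT:AD:DP:GQ:PL",
    "0/0:6,0,0,0:6:98:0,18,216",
    "0/1:7,1,0,0:8:93:12,0,228",
    "./.",
    "0/3:2,0,0,1:3:99:27,0,63",
])
out = recalc_info(line.split("\t"), drop_idx={0})
print(out[7])  # NS=2;DP=11;AF=0.25,0,0.25
```

For real data it is safer to rely on a maintained tool and to check its definitions of each key against the header descriptions.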