Question: REF, ALT not recoded after removing individual sample in VCFtools and VCFlib
5.2 years ago by
United States
suzanne.mcgaugh60 wrote:

Hi, 

I have a VCF file and realized that one individual is an outlier that often differs from the others. I'd like to remove this individual and have the ALT and REF columns follow suit. For example, if only the removed individual carried the ALT allele, the ALT column should then contain "." instead of an allele that is no longer present in the remaining samples. The same should apply to indels.

I have tried this with VCFtools and VCFlib, but neither recodes the ALT and REF columns. Does anyone know of a tool that can do this without having to remake the entire VCF file?
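
For illustration, the recoding asked for here can be sketched in plain Python. This is only a rough sketch under stated assumptions: `drop_sample` is a hypothetical helper, not a function of VCFtools, VCFlib or any other tool. It works on one tab-separated header line and one data line, keeps only the GT subfield, and leaves REF and INFO untouched, so per-allele fields such as AD or PL and counts such as AC/AN would still need the same re-indexing.

```python
import re

def drop_sample(header_line, data_line, sample):
    """Remove one sample column, then recode ALT and GT for the remaining samples.

    Hypothetical helper: only GT is kept; REF, INFO, AD and PL are not recoded.
    """
    names = header_line.rstrip("\n").split("\t")
    fields = data_line.rstrip("\n").split("\t")
    idx = names.index(sample)              # column of the sample to remove
    del names[idx], fields[idx]

    alts = [] if fields[4] == "." else fields[4].split(",")
    gt_pos = fields[8].split(":").index("GT")
    gts = [s.split(":")[gt_pos] for s in fields[9:]]

    # 1-based ALT indices still carried by at least one remaining sample
    used = sorted({int(a) for gt in gts for a in re.split(r"[/|]", gt)
                   if a not in (".", "0")})
    renumber = {old: new for new, old in enumerate(used, start=1)}

    def recode(gt):
        # Rewrite allele indices in place, leaving the separators untouched
        return re.sub(r"\d+", lambda m: str(renumber.get(int(m.group()), 0)), gt)

    fields[4] = ",".join(alts[i - 1] for i in used) or "."
    fields[8] = "GT"
    fields[9:] = [recode(gt) for gt in gts]
    return "\t".join(names), "\t".join(fields)
```

With ALT `G,T` and genotypes `0/0`, `0/1`, `0/2`, removing the middle sample leaves ALT `T` and genotypes `0/0`, `0/1`; if the removed sample was the only ALT carrier, ALT becomes ".".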

Thank you,

Suzanne

alignment next-gen genome • 1.8k views
ADD COMMENTlink modified 5.2 years ago • written 5.2 years ago by suzanne.mcgaugh60

Thank you for your help. I am using your software and it is getting me most of the way there. Thanks! 

Would you please look at these lines:

I think the REF allele and the INFO/FORMAT fields at positions 173 and 175 no longer need to indicate a deletion, because those lines no longer exist in the dataset. Is this intentional?

 

KB871578.1    173    .    GC    .    44.11    PASS    AC=1;AF=0.014;AN=74;BaseQRankSum=-7.360e-01;ClippingRankSum=-7.360e-01;DP=179;FS=0.000;GQ_MEAN=11.57;GQ_STDDEV=12.08;InbreedingCoeff=-0.0871;MLEAC=1;MLEAF=0.014;MQ=60.00;MQ0=0;MQRankSum=-7.360e-01;NCC=9;QD=14.70;ReadPosRankSum=0.736;SOR=1.179    GT:AD:DP:GQ:PL    0/0:2,0:2:3:0,3,45    0/0:4,0:4:12:0,12,165    0/0:2,0:2:3:0,3,45    0/0:3,0:3:6:0,6,90    0/0:4,0:4:9:0,9,135    0/0:4,0:4:12:0,12,173    0/0:8,0:8:15:0,15,225    0/0:2,0:2:3:0,3,45    0/0:1,0:1:3:0,3,45    ./.    ./.    ./.    ./.    ./.    ./.    ./.    ./.    ./.    0/0:6,0:6:12:0,12,180    0/0:9,0:9:18:0,18,270    0/0:5,0:5:12:0,12,180    0/0:5,0:5:3:0,3,45    0/0:1,0:1:3:0,3,36    0/0:4,0:4:12:0,12,176    0/0:2,0:2:6:0,6,82    0/0:6,0:6:15:0,15,225    0/0:2,0:2:6:0,6,85    0/0:10,0:10:18:0,18,270    0/0:4,0:4:12:0,12,173    0/0:5,0:5:15:0,15,215    0/0:9,0:9:21:0,21,315    0/0:4,0:4:12:0,12,167    0/0:4,0:4:9:0,9,135    0/0:3,0:3:6:0,6,90    0/0:5,0:5:15:0,15,221    0/0:2,0:2:6:0,6,90    0/0:3,0:3:9:0,9,128    0/0:5,0:5:15:0,15,216    0/0:3,0:3:9:0,9,132    0/0:4,0:4:9:0,9,135    0/0:4,0:4:12:0,12,169    0/0:3,0:3:9:0,9,122    0/0:5,0:5:12:0,12,180    0/0:3,0:3:6:0,6,90
KB871578.1    174    .    C    .    .    PASS    AN=90;DP=173;NCC=1    GT:AD:DP    0/0:3:3    0/0:4:4    0/0:2:2    0/0:4:4    0/0:4:4    0/0:4:4    0/0:8:8    0/0:2:2    0/0:1:1    ./.    0/0:0:1    0/0:2:5    0/0:2:5    0/0:1:1    0/0:1:2    0/0:0:2    0/0:0:2    0/0:0:2    0/0:6:6    0/0:9:9    0/0:5:5    0/0:6:6    0/0:1:1    0/0:4:4    0/0:2:2    0/0:6:6    0/0:2:2    0/0:10:10    0/0:4:4    0/0:5:5    0/0:9:9    0/0:4:4    0/0:4:4    0/0:3:3    0/0:5:5    0/0:2:2    0/0:3:3    0/0:5:5    0/0:3:3    0/0:4:4    0/0:4:4    0/0:3:3    0/0:5:5    0/0:3:3
KB871578.1    175    .    TGCTGC    .    43.94    PASS    AC=1;AF=0.013;AN=76;BaseQRankSum=-7.360e-01;ClippingRankSum=0.736;DP=181;FS=0.000;GQ_MEAN=11.50;GQ_STDDEV=11.94;InbreedingCoeff=0.0019;MLEAC=1;MLEAF=0.013;MQ=60.00;MQ0=0;MQRankSum=0.736;NCC=8;QD=8.79;ReadPosRankSum=0.736;SOR=1.179    GT:AD:DP:GQ:PL    0/0:3,0:3:6:0,6,90    0/0:4,0:4:12:0,12,161    0/0:2,0:2:3:0,3,45    0/0:4,0:4:9:0,9,135    0/0:4,0:4:9:0,9,135    0/0:4,0:4:12:0,12,168    0/0:8,0:8:15:0,15,225    0/0:2,0:2:3:0,3,45    0/0:1,0:1:3:0,3,45    ./.    ./.    ./.    ./.    ./.    0/0:1,1:2:0:0,0,1    ./.    ./.    ./.    0/0:6,0:6:12:0,12,180    0/0:9,0:9:18:0,18,270    0/0:5,0:5:12:0,12,180    0/0:6,0:6:6:0,6,90    0/0:1,0:1:3:0,3,40    0/0:4,0:4:12:0,12,172    0/0:2,0:2:6:0,6,82    0/0:6,0:6:15:0,15,225    0/0:2,0:2:6:0,6,85    0/0:10,0:10:18:0,18,270    0/0:4,0:4:12:0,12,171    0/0:5,0:5:15:0,15,216    0/0:9,0:9:21:0,21,315    0/0:4,0:4:12:0,12,177    0/0:4,0:4:9:0,9,135    0/0:3,0:3:6:0,6,90    0/0:5,0:5:15:0,15,211    0/0:2,0:2:6:0,6,90    0/0:3,0:3:9:0,9,134    0/0:5,0:5:15:0,15,214    0/0:3,0:3:9:0,9,130    0/0:4,0:4:9:0,9,135    0/0:4,0:4:12:0,12,173    0/0:3,0:3:9:0,9,130    0/0:5,0:5:12:0,12,180    0/0:3,0:3:6:0,6,90
KB871578.1    176    .    G    .    .    PASS    AN=90;DP=170;NCC=1    GT:AD:DP    0/0:3:3    0/0:4:4    0/0:2:2    0/0:4:4    0/0:4:4    0/0:4:4    0/0:7:7    0/0:2:2    0/0:1:1    ./.    0/0:0:1    0/0:2:5    0/0:2:5    0/0:1:1    0/0:0:1    0/0:0:2    0/0:0:2    0/0:0:2    0/0:6:6    0/0:8:8    0/0:5:5    0/0:6:6    0/0:1:1    0/0:4:4    0/0:2:2    0/0:6:6    0/0:2:2    0/0:9:9    0/0:4:4    0/0:5:5    0/0:9:9    0/0:4:4    0/0:4:4    0/0:4:4    0/0:5:5    0/0:2:2    0/0:3:3    0/0:5:5    0/0:3:3    0/0:4:4    0/0:4:4    0/0:3:3    0/0:5:5    0/0:3:3
KB871578.1    177    .    C    .    .    PASS    AN=88;DP=169;NCC=2    GT:AD:DP    0/0:3:3    0/0:4:4    0/0:2:2    0/0:4:4    0/0:4:4    0/0:4:4    0/0:7:7    0/0:2:2    0/0:1:1    ./.    0/0:0:1    0/0:0:3    0/0:0:4    ./.    0/0:0:1    0/0:0:2    0/0:0:2    0/0:0:2    0/0:6:6    0/0:8:8    0/0:5:5    0/0:6:6    0/0:1:1    0/0:4:4    0/0:2:2    0/0:6:6    0/0:2:2    0/0:9:9    0/0:4:4    0/0:6:6    0/0:10:10    0/0:4:4    0/0:5:5    0/0:4:4    0/0:5:5    0/0:2:2    0/0:3:3    0/0:5:5    0/0:3:3    0/0:4:4    0/0:4:4    0/0:3:3    0/0:5:5    0/0:3:3
KB871578.1    178    .    T    .    .    PASS    AN=88;DP=169;NCC=2    GT:AD:DP    0/0:3:3    0/0:5:5    0/0:2:2    0/0:4:4    0/0:4:4    0/0:5:5    0/0:7:7    0/0:2:2    0/0:1:1    ./.    0/0:1:1    0/0:3:3    0/0:3:4    ./.    0/0:1:1    0/0:2:2    0/0:2:2    0/0:2:2    0/0:5:5    0/0:8:8    0/0:5:5    0/0:6:6    0/0:1:1    0/0:4:4    0/0:2:2    0/0:6:6    0/0:2:2    0/0:8:8    0/0:4:4    0/0:6:6    0/0:10:10    0/0:4:4    0/0:5:5    0/0:4:4    0/0:5:5    0/0:2:2    0/0:3:3    0/0:5:5    0/0:3:3    0/0:4:4    0/0:4:4    0/0:3:3    0/0:5:5    0/0:3:3
KB871578.1    179    .    G    .    .    PASS    AN=88;DP=168;NCC=2    GT:AD:DP    0/0:2:2    0/0:5:5    0/0:2:2    0/0:4:4    0/0:4:4    0/0:5:5    0/0:7:7    0/0:2:2    0/0:1:1    ./.    0/0:1:1    0/0:3:3    0/0:4:4    ./.    0/0:1:1    0/0:2:2    0/0:2:2    0/0:2:2    0/0:5:5    0/0:7:7    0/0:6:6    0/0:6:6    0/0:1:1    0/0:4:4    0/0:2:2    0/0:6:6    0/0:2:2    0/0:8:8    0/0:4:4    0/0:6:6    0/0:10:10    0/0:4:4    0/0:5:5    0/0:4:4    0/0:5:5    0/0:2:2    0/0:3:3    0/0:5:5    0/0:3:3    0/0:4:4    0/0:4:4    0/0:3:3    0/0:5:5    0/0:3:3
KB871578.1    180    .    C    .    .    PASS    AN=88;DP=170;NCC=2    GT:AD:DP    0/0:2:2    0/0:5:5    0/0:2:2    0/0:4:4    0/0:4:4    0/0:5:5    0/0:7:7    0/0:2:2    0/0:1:1    ./.    0/0:1:1    0/0:3:3    0/0:3:4    ./.    0/0:1:1    0/0:2:2    0/0:3:3    0/0:4:4    0/0:5:5    0/0:7:7    0/0:6:6    0/0:5:5    0/0:1:1    0/0:4:4    0/0:2:2    0/0:6:6    0/0:2:2    0/0:8:8    0/0:4:4    0/0:6:6    0/0:10:10    0/0:4:4    0/0:5:5    0/0:4:4    0/0:5:5    0/0:2:2    0/0:3:3    0/0:5:5    0/0:3:3    0/0:4:4    0/0:4:4    0/0:3:3    0/0:5:5    0/0:3:3

 

 

Thanks very much in advance!

ADD REPLYlink written 5.2 years ago by suzanne.mcgaugh60

Yes, you can remove those lines with the '-r' option (remove the variant if there is no called genotype on the line).
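
As a rough illustration of what "no called genotype on the line" means, here is a small Python sketch. It is my reading of the description above, not the jvarkit implementation, and `has_called_genotype` is a hypothetical name. Note that a 0/0 (hom-ref) genotype counts as called; only "./." style entries are uncalled.

```python
def has_called_genotype(data_line):
    """Return True if at least one sample on a VCF data line has a called allele."""
    fields = data_line.rstrip("\n").split("\t")
    gt_pos = fields[8].split(":").index("GT")   # position of GT within FORMAT
    for sample in fields[9:]:
        gt = sample.split(":")[gt_pos]
        # Treat phased (|) and unphased (/) genotypes the same way
        if any(allele != "." for allele in gt.replace("|", "/").split("/")):
            return True
    return False
```

A record would be dropped under this rule only when the function returns False for it.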

ADD REPLYlink written 5.2 years ago by Pierre Lindenbaum129k

Thank you for your quick response, here are my commands:

java -jar /panfs/roc/groups/14/mcgaughs/mcgaughs/tools/jvarkit/dist-1.128/vcfcutsamples.jar -f /panfs/roc/scratch/mcgaugh/VCFs/Not_AA.txt -r /panfs/roc/scratch/mcgaugh/VCFs/PASS_SNP_invariant_INDELTEST2.vcf>/panfs/roc/scratch/mcgaugh/VCFs/TEST_PL3.vcf

I also tried:

java -jar /panfs/roc/groups/14/mcgaughs/mcgaughs/tools/jvarkit/dist-1.128/vcfcutsamples.jar -r -f /panfs/roc/scratch/mcgaugh/VCFs/Not_AA.txt /panfs/roc/scratch/mcgaugh/VCFs/PASS_SNP_invariant_INDELTEST2.vcf>/panfs/roc/scratch/mcgaugh/VCFs/TEST_PL4.vcf

Both give:

173    .    GC    .    44.11    PASS    AC=1;AF=0.014;AN=74;BaseQRankSum=-7.360e-01;ClippingRankSum=-7.360e-01;DP=179;FS=0.000;GQ_MEAN=11.57;GQ_STDDEV=12.08;InbreedingCoeff=-0.0871;MLEAC=1;MLEAF=0.014;MQ=60.00;MQ0=0;MQRankSum=-7.360e-01;NCC=9;QD=14.70;ReadPosRankSum=0.736;SOR=1.179    GT:AD:DP:GQ:PL    0/0:2,0:2:3:0,3,45    0/0:4,0:4:12:0,12,165    0/0:2,0:2:3:0,3,45    0/0:3,0:3:6:0,6,90    0/0:4,0:4:9:0,9,135    0/0:4,0:4:12:0,12,173    0/0:8,0:8:15:0,15,225    0/0:2,0:2:3:0,3,45    0/0:1,0:1:3:0,3,45    ./.    ./.    ./.    ./.    ./.    ./.    ./.    ./.    ./.    0/0:6,0:6:12:0,12,180    0/0:9,0:9:18:0,18,270    0/0:5,0:5:12:0,12,180    0/0:5,0:5:3:0,3,45    0/0:1,0:1:3:0,3,36    0/0:4,0:4:12:0,12,176    0/0:2,0:2:6:0,6,82    0/0:6,0:6:15:0,15,225    0/0:2,0:2:6:0,6,85    0/0:10,0:10:18:0,18,270    0/0:4,0:4:12:0,12,173    0/0:5,0:5:15:0,15,215    0/0:9,0:9:21:0,21,315    0/0:4,0:4:12:0,12,167    0/0:4,0:4:9:0,9,135    0/0:3,0:3:6:0,6,90    0/0:5,0:5:15:0,15,221    0/0:2,0:2:6:0,6,90    0/0:3,0:3:9:0,9,128    0/0:5,0:5:15:0,15,216    0/0:3,0:3:9:0,9,132    0/0:4,0:4:9:0,9,135    0/0:4,0:4:12:0,12,169    0/0:3,0:3:9:0,9,122    0/0:5,0:5:12:0,12,180    0/0:3,0:3:6:0,6,90
174    .    C    .    .    PASS    AN=90;DP=173;NCC=1    GT:AD:DP    0/0:3:3    0/0:4:4    0/0:2:2    0/0:4:4    0/0:4:4    0/0:4:4    0/0:8:8    0/0:2:2    0/0:1:1    ./.    0/0:0:1    0/0:2:5    0/0:2:5    0/0:1:1    0/0:1:2    0/0:0:2    0/0:0:2    0/0:0:2    0/0:6:6    0/0:9:9    0/0:5:5    0/0:6:6    0/0:1:1    0/0:4:4    0/0:2:2    0/0:6:6    0/0:2:2    0/0:10:10    0/0:4:4    0/0:5:5    0/0:9:9    0/0:4:4    0/0:4:4    0/0:3:3    0/0:5:5    0/0:2:2    0/0:3:3    0/0:5:5    0/0:3:3    0/0:4:4    0/0:4:4    0/0:3:3    0/0:5:5    0/0:3:3
175    .    TGCTGC    .    43.94    PASS    AC=1;AF=0.013;AN=76;BaseQRankSum=-7.360e-01;ClippingRankSum=0.736;DP=181;FS=0.000;GQ_MEAN=11.50;GQ_STDDEV=11.94;InbreedingCoeff=0.0019;MLEAC=1;MLEAF=0.013;MQ=60.00;MQ0=0;MQRankSum=0.736;NCC=8;QD=8.79;ReadPosRankSum=0.736;SOR=1.179    GT:AD:DP:GQ:PL    0/0:3,0:3:6:0,6,90    0/0:4,0:4:12:0,12,161    0/0:2,0:2:3:0,3,45    0/0:4,0:4:9:0,9,135    0/0:4,0:4:9:0,9,135    0/0:4,0:4:12:0,12,168    0/0:8,0:8:15:0,15,225    0/0:2,0:2:3:0,3,45    0/0:1,0:1:3:0,3,45    ./.    ./.    ./.    ./.    ./.    0/0:1,1:2:0:0,0,1    ./.    ./.    ./.    0/0:6,0:6:12:0,12,180    0/0:9,0:9:18:0,18,270    0/0:5,0:5:12:0,12,180    0/0:6,0:6:6:0,6,90    0/0:1,0:1:3:0,3,40    0/0:4,0:4:12:0,12,172    0/0:2,0:2:6:0,6,82    0/0:6,0:6:15:0,15,225    0/0:2,0:2:6:0,6,85    0/0:10,0:10:18:0,18,270    0/0:4,0:4:12:0,12,171    0/0:5,0:5:15:0,15,216    0/0:9,0:9:21:0,21,315    0/0:4,0:4:12:0,12,177    0/0:4,0:4:9:0,9,135    0/0:3,0:3:6:0,6,90    0/0:5,0:5:15:0,15,211    0/0:2,0:2:6:0,6,90    0/0:3,0:3:9:0,9,134    0/0:5,0:5:15:0,15,214    0/0:3,0:3:9:0,9,130    0/0:4,0:4:9:0,9,135    0/0:4,0:4:12:0,12,173    0/0:3,0:3:9:0,9,130    0/0:5,0:5:12:0,12,180    0/0:3,0:3:6:0,6,90
176    .    G    .    .    PASS    AN=90;DP=170;NCC=1    GT:AD:DP    0/0:3:3    0/0:4:4    0/0:2:2    0/0:4:4    0/0:4:4    0/0:4:4    0/0:7:7    0/0:2:2    0/0:1:1    ./.    0/0:0:1    0/0:2:5    0/0:2:5    0/0:1:1    0/0:0:1    0/0:0:2    0/0:0:2    0/0:0:2    0/0:6:6    0/0:8:8    0/0:5:5    0/0:6:6    0/0:1:1    0/0:4:4    0/0:2:2    0/0:6:6    0/0:2:2    0/0:9:9    0/0:4:4    0/0:5:5    0/0:9:9    0/0:4:4    0/0:4:4    0/0:4:4    0/0:5:5    0/0:2:2    0/0:3:3    0/0:5:5    0/0:3:3    0/0:4:4    0/0:4:4    0/0:3:3    0/0:5:5    0/0:3:3
177    .    C    .    .    PASS    AN=88;DP=169;NCC=2    GT:AD:DP    0/0:3:3    0/0:4:4    0/0:2:2    0/0:4:4    0/0:4:4    0/0:4:4    0/0:7:7    0/0:2:2    0/0:1:1    ./.    0/0:0:1    0/0:0:3    0/0:0:4    ./.    0/0:0:1    0/0:0:2    0/0:0:2    0/0:0:2    0/0:6:6    0/0:8:8    0/0:5:5    0/0:6:6    0/0:1:1    0/0:4:4    0/0:2:2    0/0:6:6    0/0:2:2    0/0:9:9    0/0:4:4    0/0:6:6    0/0:10:10    0/0:4:4    0/0:5:5    0/0:4:4    0/0:5:5    0/0:2:2    0/0:3:3    0/0:5:5    0/0:3:3    0/0:4:4    0/0:4:4    0/0:3:3    0/0:5:5    0/0:3:3
178    .    T    .    .    PASS    AN=88;DP=169;NCC=2    GT:AD:DP    0/0:3:3    0/0:5:5    0/0:2:2    0/0:4:4    0/0:4:4    0/0:5:5    0/0:7:7    0/0:2:2    0/0:1:1    ./.    0/0:1:1    0/0:3:3    0/0:3:4    ./.    0/0:1:1    0/0:2:2    0/0:2:2    0/0:2:2    0/0:5:5    0/0:8:8    0/0:5:5    0/0:6:6    0/0:1:1    0/0:4:4    0/0:2:2    0/0:6:6    0/0:2:2    0/0:8:8    0/0:4:4    0/0:6:6    0/0:10:10    0/0:4:4    0/0:5:5    0/0:4:4    0/0:5:5    0/0:2:2    0/0:3:3    0/0:5:5    0/0:3:3    0/0:4:4    0/0:4:4    0/0:3:3    0/0:5:5    0/0:3:3
179    .    G    .    .    PASS    AN=88;DP=168;NCC=2    GT:AD:DP    0/0:2:2    0/0:5:5    0/0:2:2    0/0:4:4    0/0:4:4    0/0:5:5    0/0:7:7    0/0:2:2    0/0:1:1    ./.    0/0:1:1    0/0:3:3    0/0:4:4    ./.    0/0:1:1    0/0:2:2    0/0:2:2    0/0:2:2    0/0:5:5    0/0:7:7    0/0:6:6    0/0:6:6    0/0:1:1    0/0:4:4    0/0:2:2    0/0:6:6    0/0:2:2    0/0:8:8    0/0:4:4    0/0:6:6    0/0:10:10    0/0:4:4    0/0:5:5    0/0:4:4    0/0:5:5    0/0:2:2    0/0:3:3    0/0:5:5    0/0:3:3    0/0:4:4    0/0:4:4    0/0:3:3    0/0:5:5    0/0:3:3
180    .    C    .    .    PASS    AN=88;DP=170;NCC=2    GT:AD:DP    0/0:2:2    0/0:5:5    0/0:2:2    0/0:4:4    0/0:4:4    0/0:5:5    0/0:7:7    0/0:2:2    0/0:1:1    ./.    0/0:1:1    0/0:3:3    0/0:3:4    ./.    0/0:1:1    0/0:2:2    0/0:3:3    0/0:4:4    0/0:5:5    0/0:7:7    0/0:6:6    0/0:5:5    0/0:1:1    0/0:4:4    0/0:2:2    0/0:6:6    0/0:2:2    0/0:8:8    0/0:4:4    0/0:6:6    0/0:10:10    0/0:4:4    0/0:5:5    0/0:4:4    0/0:5:5    0/0:2:2    0/0:3:3    0/0:5:5    0/0:3:3    0/0:4:4    0/0:4:4    0/0:3:3    0/0:5:5    0/0:3:3

 

I apologize in advance if I am doing something incorrectly on the command line. If you could please advise, I'd very much appreciate it.

 

ADD REPLYlink written 5.2 years ago by suzanne.mcgaugh60

My bad: all your variants are 0/0 (homref), not 'uncalled'. You could quickly remove those variants by piping the VCF to https://github.com/lindenb/jvarkit/wiki/VCFFilterJS

java -jar dist/vcffilterjs.jar -e 'variant.getAlternateAlleles().size()>0'
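
For readers without the Java tools at hand, the effect of that expression can be approximated on plain VCF text. The Python sketch below is an assumption-laden stand-in, not the VCFFilterJS code: `keep_variant_sites` is a hypothetical name, and it simply treats "." in the ALT column as "no alternate alleles". It also makes clear that such a filter discards every invariant site.

```python
def keep_variant_sites(lines):
    """Yield header lines and only those data lines whose ALT column is not '.'."""
    for line in lines:
        if line.startswith("#"):
            yield line                       # keep all meta and header lines
        elif line.split("\t")[4] != ".":
            yield line                       # keep records with an ALT allele
```

Usage would be along the lines of `sys.stdout.writelines(keep_variant_sites(sys.stdin))` in a small script placed in the pipe.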
ADD REPLYlink written 5.2 years ago by Pierre Lindenbaum129k

Thank you for your response. Things aren't quite working. I tried both with the -e option and without; please see below.

But just to clarify, I would like to keep all sites in the file (i.e. not remove the 0/0 sites). I simply want to recode the REF, ALT, and INFO appropriately to reflect the data currently in the "new" VCF once I remove the rogue individual.

 

mcgaughs@labq01 [/panfs/roc/groups/14/mcgaughs/mcgaughs/tools/jvarkit] % java -jar /panfs/roc/groups/14/mcgaughs/mcgaughs/tools/jvarkit/dist-1.128/vcffilterjs.jar -f /panfs/roc/scratch/mcgaugh/VCFs/PASS_SNP_invariant_INDELTEST2.vcf -e 'variant.getAlternateAlleles().size()>0'

[INFO/VCFFilterJS] 2015-05-13 20:33:25 "Starting JOB at Wed May 13 20:33:25 CDT 2015 com.github.lindenb.jvarkit.tools.vcffilterjs.VCFFilterJS version=840f289630f04c24db877d06f90404ff7c2b9639  built=2015-05-13:20-05-18"

[INFO/VCFFilterJS] 2015-05-13 20:33:25 "Command Line args : -f /panfs/roc/scratch/mcgaugh/VCFs/PASS_SNP_invariant_INDELTEST2.vcf -e variant.getAlternateAlleles().size()>0"

[INFO/VCFFilterJS] 2015-05-13 20:33:25 "Executing as mcgaughs@labq01.msi.umn.edu on Linux 2.6.32-504.16.2.el6.x86_64 amd64; OpenJDK 64-Bit Server VM 1.7.0_79-mockbuild_2015_04_15_00_02-b00"

[SEVERE/VCFFilterJS] 2015-05-13 20:33:25 "both javascript file/expr are set"

[SEVERE/VCFFilterJS] 2015-05-13 20:33:25 "Initialization of VCFFilterJS failed."

[INFO/VCFFilterJS] 2015-05-13 20:33:25 "End JOB status=-1 [Wed May 13 20:33:25 CDT 2015] com.github.lindenb.jvarkit.tools.vcffilterjs.VCFFilterJS done. Elapsed time: 0.00 minutes."

[SEVERE/VCFFilterJS] 2015-05-13 20:33:25 "##### ERROR: return status = -1################"

mcgaughs@labq01 [/panfs/roc/groups/14/mcgaughs/mcgaughs/tools/jvarkit] % java -jar /panfs/roc/groups/14/mcgaughs/mcgaughs/tools/jvarkit/dist-1.128/vcffilterjs.jar -f /panfs/roc/scratch/mcgaugh/VCFs/PASS_SNP_invariant_INDELTEST2.vcf 

[INFO/VCFFilterJS] 2015-05-13 20:33:38 "Starting JOB at Wed May 13 20:33:38 CDT 2015 com.github.lindenb.jvarkit.tools.vcffilterjs.VCFFilterJS version=840f289630f04c24db877d06f90404ff7c2b9639  built=2015-05-13:20-05-18"

[INFO/VCFFilterJS] 2015-05-13 20:33:38 "Command Line args : -f /panfs/roc/scratch/mcgaugh/VCFs/PASS_SNP_invariant_INDELTEST2.vcf"

[INFO/VCFFilterJS] 2015-05-13 20:33:39 "Executing as mcgaughs@labq01.msi.umn.edu on Linux 2.6.32-504.16.2.el6.x86_64 amd64; OpenJDK 64-Bit Server VM 1.7.0_79-mockbuild_2015_04_15_00_02-b00"

[INFO/VCFFilterJS] 2015-05-13 20:33:39 "Compiling /panfs/roc/scratch/mcgaugh/VCFs/PASS_SNP_invariant_INDELTEST2.vcf"

[SEVERE/VCFFilterJS] 2015-05-13 20:33:39 "sun.org.mozilla.javascript.EvaluatorException: illegal character (<Unknown Source>#1)"

javax.script.ScriptException: sun.org.mozilla.javascript.EvaluatorException: illegal character (<Unknown Source>#1)

    at com.sun.script.javascript.RhinoScriptEngine.compile(RhinoScriptEngine.java:392)

    at com.github.lindenb.jvarkit.tools.vcffilterjs.AbstractVcfJavascript.initializeKnime(AbstractVcfJavascript.java:142)

    at com.github.lindenb.jvarkit.knime.AbstractKnimeApplication.mainWork(AbstractKnimeApplication.java:86)

    at com.github.lindenb.jvarkit.tools.vcffilterjs.VCFFilterJS.doWork(VCFFilterJS.java:152)

    at com.github.lindenb.jvarkit.util.AbstractCommandLineProgram.instanceMain(AbstractCommandLineProgram.java:496)

    at com.github.lindenb.jvarkit.util.AbstractCommandLineProgram.instanceMainWithExit(AbstractCommandLineProgram.java:510)

    at com.github.lindenb.jvarkit.tools.vcffilterjs.VCFFilterJS.main(VCFFilterJS.java:159)

Caused by: sun.org.mozilla.javascript.EvaluatorException: illegal character (<Unknown Source>#1)

    at sun.org.mozilla.javascript.DefaultErrorReporter.runtimeError(DefaultErrorReporter.java:109)

    at sun.org.mozilla.javascript.DefaultErrorReporter.error(DefaultErrorReporter.java:96)

    at sun.org.mozilla.javascript.Parser.addError(Parser.java:146)

    at sun.org.mozilla.javascript.TokenStream.getToken(TokenStream.java:825)

    at sun.org.mozilla.javascript.Parser.peekToken(Parser.java:172)

    at sun.org.mozilla.javascript.Parser.parse(Parser.java:384)

    at sun.org.mozilla.javascript.Parser.parse(Parser.java:359)

    at sun.org.mozilla.javascript.Context.compileImpl(Context.java:2370)

    at sun.org.mozilla.javascript.Context.compileReader(Context.java:1321)

    at sun.org.mozilla.javascript.Context.compileReader(Context.java:1293)

    at com.sun.script.javascript.RhinoScriptEngine.compile(RhinoScriptEngine.java:388)

    ... 6 more

[SEVERE/VCFFilterJS] 2015-05-13 20:33:39 "Initialization of VCFFilterJS failed."

[INFO/VCFFilterJS] 2015-05-13 20:33:39 "End JOB status=-1 [Wed May 13 20:33:39 CDT 2015] com.github.lindenb.jvarkit.tools.vcffilterjs.VCFFilterJS done. Elapsed time: 0.00 minutes."

[SEVERE/VCFFilterJS] 2015-05-13 20:33:39 "##### ERROR: return status = -1################"

ADD REPLYlink written 5.2 years ago by suzanne.mcgaugh60

you're not using those tools the correct way. you should do something like:

cat your.vcf | vcfcutsamples [options] | vcffilterjs [options] > output.vcf

Furthermore, if all your sites are 0/0 or undefined (./.) there will be no ALT alleles.

 

ADD REPLYlink written 5.2 years ago by Pierre Lindenbaum129k

Thank you for your response. I have your code working as above. Unfortunately, this is not the solution I wanted, as it removes the invariant sites.

I still want all invariant sites included in my file. VCFtools and VCFlib leave a signature of the removed sample in the ALT and REF columns: even if none of the remaining individuals has the alternative allele and all of them are 0/0 or ./., these tools may still report something other than '.' in the ALT column. I simply want to remove one individual from a VCF file and have the ALT, REF, and INFO recoded as if I had made the original VCF file without that individual.

vcfcutsamples.jar almost does this, but it leaves the original deletions relative to the reference, from the individual that was removed, in the REF column (as shown in my examples above, everyone is 0/0 or ./. but the REF allele still shows the deletion from the removed sample). I can work with that and just write some code to deal with it downstream.
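
One possible shape for that downstream code, sketched in Python. Everything here is an assumption rather than an established tool: `recode_invariant` is a hypothetical helper that, for records whose ALT is ".", trims a multi-base REF to its first base, recomputes AN from the remaining genotypes, and drops the now meaningless AC/AF/MLEAC/MLEAF keys. Other annotations (QD, rank-sum tests, and so on) and per-sample AD/PL values are left as they are, even though they were computed with the removed sample present.

```python
import re

STALE_KEYS = {"AC", "AF", "MLEAC", "MLEAF", "AN"}

def recode_invariant(data_line):
    """Clean up a record that has no ALT allele left after sample removal."""
    line = data_line.rstrip("\n")
    fields = line.split("\t")
    if fields[4] != ".":
        return line                          # still variant: leave untouched

    fields[3] = fields[3][0]                 # keep only the first REF base

    # AN = number of called alleles among the remaining samples
    gt_pos = fields[8].split(":").index("GT")
    an = sum(allele != "."
             for sample in fields[9:]
             for allele in re.split(r"[/|]", sample.split(":")[gt_pos]))

    # Drop allele-count keys tied to the removed ALT, then prepend the new AN
    kept = [kv for kv in fields[7].split(";")
            if kv != "." and kv.split("=")[0] not in STALE_KEYS]
    fields[7] = ";".join([f"AN={an}"] + kept)
    return "\t".join(fields)
```

Applied to a record like position 173 above, this would turn REF `GC` into `G` and replace `AC=1;AF=0.014;AN=...` with a freshly counted `AN`.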

I'd be curious to know if there is actually software out there that recodes everything properly, though, because it would be easier overall.

Thank you for your help. 

 

 

ADD REPLYlink modified 5.2 years ago • written 5.2 years ago by suzanne.mcgaugh60
5.2 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum129k wrote:

My tool https://github.com/lindenb/jvarkit/wiki/VcfCutSamples removes the unused ALT alleles:

$ gunzip -c input.vcf.gz  |\
java -jar dist/vcffilterjs.jar -e 'variant.getAlternateAlleles().size()>2'  |\
java -jar dist/vcfcutsamples.jar -S B00GG81  | grep -v '#' | cut -f 4,5,10 | head


CTTTTT    CT    0/1:0,1,2,0,0:7:0:132,47,96,0,20,15,82,45,0,73,52,35,0,51,43
TGGG    TGG,T    2/1:0,1,1,1:9:5:49,33,55,5,30,65,38,27,0,34
CAAAAAAAAAAA    C,CAAAAAAAAA    1/2:0,0,5,1,0:10:22:630,645,689,32,42,177,516,517,0,499,586,588,22,515,570
CGTGTGT    .    0/0:2,0,0,0:4:6:0,6,77,6,83,166,6,80,90,84
TTGTG    TTGTGTGTG    1/1:0,0,0,0,1:3:3:46,32,29,52,35,128,52,35,106,101,3,3,6,6,0
TG    TGGGGG    1/1:0,0,0,1,0:2:3:46,17,14,49,17,57,3,3,3,0,13,12,13,3,10
GTCTC    GTC    1/1:0,0,3,0:5:9:128,132,138,9,9,0,137,147,9,188
CAAA    CA    0/1:3,1,0,0:14:14:14,0,141,35,165,294,16,79,110,84
AGCCGCCGCC    .    0/0:46,0,0,0:92:99:0,188,6588,138,4981,4841,138,3895,3831,3706
CAAA    CAA    0/1:7,3,6,2,18:80:71:350,366,682,275,439,1309,453,695,1472,2289,0,71,394,532,376

 

ADD COMMENTlink modified 5.2 years ago • written 5.2 years ago by Pierre Lindenbaum129k