Hi,
I need help with awk. My VCF file has 4 samples, so fields $10, $11, $12 and $13 hold the genotype for each row. I want to remove every row in which at least one sample shows the genotype ./. and print the remaining rows to another VCF file. Can this be done? I am not very familiar with awk substr. Below is an example of my VCF file; it does not have a header.
chr3    75787186    rs150410646    C    T    53.89    .    AC=4;AF=0.500;AN=8;BaseQRankSum=-4.341;DB;DP=424;Dels=0.00;FS=0.000;HaplotypeScore=2.2684;MLEAC=4;MLEAF=0.500;MQ=6.41;MQ0=371;MQRankSum=-3.553;QD=0.13;ReadPosRankSum=-1.007    GT:AD:DP:GQ:PL    0/1:63,21:80:48:48,0,127    0/1:25,5:29:21:21,0,64    0/1:142,41:174:10:10,0,94    0/1:95,31:120:6:6,0,120
chr3    75787576    rs141348932    A    G    61.87    .    AC=2;AF=1.00;AN=2;DB;DP=195;Dels=0.00;FS=0.000;HaplotypeScore=0.0000;MLEAC=2;MLEAF=1.00;MQ=4.17;MQ0=189;QD=0.69    GT:AD:DP:GQ:PL    ./.    ./.    1/1:68,22:86:9:87,9,0    ./.
chr3    75787583    rs144348996    A    G    100.62    .    AC=2;AF=1.00;AN=2;DB;DP=203;Dels=0.00;FS=0.000;HaplotypeScore=0.0000;MLEAC=2;MLEAF=1.00;MQ=4.33;MQ0=197;QD=1.12    GT:AD:DP:GQ:PL    ./.    ./.    1/1:65,25:86:12:126,12,0    ./.
chr3    75787584    rs151027881    C    A    93.62    .    AC=2;AF=1.00;AN=2;DB;DP=203;Dels=0.00;FS=0.000;HaplotypeScore=0.0000;MLEAC=2;MLEAF=1.00;MQ=4.33;MQ0=197;QD=1.04    GT:AD:DP:GQ:PL    ./.    ./.    1/1:64,26:86:12:119,12,0    ./.
chr3    75787620    rs145606249    T    C    153.42    .    AC=2;AF=1.00;AN=2;DB;DP=224;Dels=0.00;FS=0.000;HaplotypeScore=0.0000;MLEAC=2;MLEAF=1.00;MQ=4.38;MQ0=217;QD=1.70    GT:AD:DP:GQ:PL    ./.    ./.    1/1:52,38:86:18:179,18,0    ./.
chr3    75787728    rs111389701    C    T    643.34    .    AC=8;AF=1.00;AN=8;DB;DP=186;Dels=0.00;FS=0.000;HaplotypeScore=0.0000;MLEAC=8;MLEAF=1.00;MQ=10.21;MQ0=140;QD=3.46    GT:AD:DP:GQ:PL    1/1:0,32:32:3:28,3,0    1/1:0,23:23:9:82,9,0    1/1:0,82:82:51:503,51,0    1/1:0,49:49:6:55,6,0
In other words, if any of columns $10, $11, $12 or $13 is ./. (no genotype), I want to eliminate that row. Sorry for the formatting; I was not able to get it to display correctly. Any suggestions?
Why don't you use vcftools with the --phase option?
Thanks a lot,
I have figured it out with the below command
grep -vw with appropriate escaping would do it as well.
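For anyone landing here later, one possible awk approach is sketched below. It is not necessarily the command the original poster ended up using. It assumes whitespace-separated columns with the four samples in fields 10 to 13, as in the example rows, and that GT is the first colon-separated subfield of each sample column. The function name filter_missing_gt and the file names are placeholders made up for illustration.

```shell
# Hypothetical helper: drop records where any of the four sample
# columns (fields 10-13) carries the missing genotype "./.".
filter_missing_gt() {
  awk '
    /^#/ { print; next }             # pass header lines through, if present
    {
      for (i = 10; i <= 13; i++) {
        split($i, parts, ":")        # GT is assumed to be the first subfield
        if (parts[1] == "./.") next  # skip the row at the first missing genotype
      }
      print                          # all four samples have a genotype call
    }
  ' "$@"
}

# Usage (placeholder file names):
# filter_missing_gt input.vcf > filtered.vcf
```

Comparing the GT subfield rather than pattern-matching the whole line keeps a stray ./. elsewhere in the record from removing a row. No substr is needed for this.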