10.1 years ago
slees.nt
I have a VCF file containing two individuals, and I used vcftools to keep only one of them. However, the number of rows in the VCF file did not change, and many rows now show a "0/0" genotype. Can I use vcftools to remove these rows? Below is an example of such a row:
chr10 12999368 . T A 4.23 . . GT:PL:DP:GQ 0/0:0,12,138:4:16
It's likely that a simple
grep -vw "0/0" foo.vcf > foo.filtered.vcf
would work, but you'd have to show a few lines to make sure. If nothing else, you can do this with awk easily enough.
Yes, awk should work. However, vcftools seems to have genotype filtering options such as "--remove-filtered-geno <string>", and I couldn't get that to work. Does anyone know whether vcftools can do the job?
You could remove the example line with --remove-filtered-geno, but you'd also remove every other line with a missing value (.), which is probably not what you want. Just use Pierre's awk solution; it gets the job done, and that's what matters.
Thanks, I agree. But I am still curious how to make "--remove-filtered-geno" work. I tried "--remove-filtered-geno 0/0", but nothing was filtered out at all.
When you use --remove-filtered-geno, it looks at the 7th column and removes things that match. 0/0 won't be in that column, so nothing would be changed. You might be able to use --remove-filtered-geno ., but that's unlikely to be a good solution.
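For anyone landing here later, the awk approach mentioned above could look roughly like the sketch below. It is not necessarily the solution referred to in this thread, just one way to do it. It assumes a single-sample, tab-delimited VCF in which GT is the first FORMAT subfield (as in the example row). The file names demo.vcf and demo.filtered.vcf and the second data row are made up for illustration.

```shell
# Build a tiny single-sample VCF for demonstration. The first data row is
# the example from the question; the second row is invented for contrast.
printf '##fileformat=VCFv4.1\n' > demo.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1\n' >> demo.vcf
printf 'chr10\t12999368\t.\tT\tA\t4.23\t.\t.\tGT:PL:DP:GQ\t0/0:0,12,138:4:16\n' >> demo.vcf
printf 'chr10\t13000000\t.\tG\tC\t50\t.\t.\tGT:PL:DP:GQ\t0/1:60,0,70:10:60\n' >> demo.vcf

# Keep header lines (starting with "#") and every record whose genotype,
# taken as the first colon-separated subfield of column 10, is not
# homozygous reference (0/0 unphased or 0|0 phased).
awk -F'\t' '/^#/ { print; next }
            { split($10, f, ":"); if (f[1] != "0/0" && f[1] != "0|0") print }' \
    demo.vcf > demo.filtered.vcf
```

Unlike a plain grep on "0/0", this only looks at the genotype subfield of the sample column, so it won't drop a row because the string happens to appear somewhere else. For a multi-sample file you would need to decide whether to drop a row when any sample or all samples are 0/0, which this sketch does not handle.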