Soft filtering of SNPs in a list
4.0 years ago
beanavarro85 ▴ 10

Hi all!

I am looking for a way to filter a list of SNPs from a VCF file.

I know that GATK SelectVariants or VCFtools can exclude a list of SNPs from a VCF. What I want, however, is to soft-filter them, i.e. keep the records and flag them in the FILTER column of the VCF.

Any ideas? Thanks!

SNP • 2.0k views

A small PyVCF snippet should help. I am not sure whether GATK VariantFiltration just adds the "filter_name" or also excludes the variants.
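For illustration, here is a minimal sketch of that idea in plain Python (standard library only, not PyVCF). It assumes an uncompressed VCF and a text file with one ID per line; the function names and the output file name are made up for this example. Records whose ID is in the list get the filter name written to the FILTER column, all other records are left untouched.

```python
def load_ids(path):
    """Read one variant ID per line, ignoring blank lines."""
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}


def soft_filter(lines, ids, filter_name,
                description="Listed in user-supplied ID file"):
    """Yield VCF lines, flagging records whose ID is in `ids`."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#CHROM"):
            # Declare the new filter in the header, right before the column line.
            yield f'##FILTER=<ID={filter_name},Description="{description}">'
            yield line
            continue
        if line.startswith("#") or not line:
            yield line
            continue
        fields = line.split("\t")
        # Column 3 is ID; it may hold several IDs separated by ';'.
        if set(fields[2].split(";")) & ids:
            # Column 7 is FILTER: replace '.' or 'PASS', otherwise append.
            if fields[6] in (".", "PASS"):
                fields[6] = filter_name
            else:
                fields[6] = fields[6] + ";" + filter_name
        yield "\t".join(fields)


def filter_file(vcf_path, ids_path, out_path, filter_name):
    """Hypothetical convenience wrapper: read, flag, and write a VCF."""
    ids = load_ids(ids_path)
    with open(vcf_path) as src, open(out_path, "w") as dst:
        for line in soft_filter(src, ids, filter_name):
            dst.write(line + "\n")
```

Something like `filter_file("input.vcf", "rsid.txt", "flagged.vcf", "MyFilter")` would then write the flagged copy. For bgzipped or very large files a dedicated tool is probably the better choice.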


I believe VariantFiltration adds the "filter_name", but I am unsure how to provide a SNP list as a filter expression. I have never used PyVCF; I'll check that out.

4.0 years ago

GATK VariantFiltration https://gatk.broadinstitute.org/hc/en-us/articles/360037434691-VariantFiltration

I refactored the program I wrote for: How to get 1000 Genomes data in bulk?

It now takes a new option, --filter. See http://lindenb.github.io/jvarkit/Biostar332826.html

e.g.:

java -jar dist/biostar332826.jar --filter "MYFILTERNAME" -r ids.txt sites.vcf.gz 
4.0 years ago

Hello,

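bcftools filter with the -s argument is what you are looking for. Records whose ID is listed in rsid.txt match the -e expression and, instead of being removed, get the name given to -s written to their FILTER column: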

bcftools filter -e 'ID=@rsid.txt' -s 'MyFilter' input.vcf
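To check the result, a quick tally of the FILTER column can help. The sketch below is plain Python and assumes the bcftools output was saved as an uncompressed VCF; the function name is made up for this example.

```python
from collections import Counter


def count_filters(lines):
    """Count how often each FILTER value occurs across the VCF records."""
    counts = Counter()
    for line in lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 7 is FILTER; several filters are separated by ';'.
        for name in fields[6].split(";"):
            counts[name] += 1
    return counts
```

Passing an open file handle of the filtered VCF to `count_filters` returns a Counter, and the entry for 'MyFilter' should equal the number of records in the file whose ID appears in rsid.txt.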

fin swimmer
