Filter SNPs for no more than 3 mismatches per 10 bp window
3.8 years ago

Hi

My Java version is 1.8, my GATK version is 3.7.0, my OS is Ubuntu 14.04.2 LTS, and my processor is an Intel Core i5-4440 CPU @ 3.10GHz × 4.

I have successfully created a VCF file using UnifiedGenotyper (GATK). This VCF file was obtained from a merged BAM, which in turn was obtained from 10 different BAM files (10 different samples of the same organism).

I now want to get rid of my false positives. For this, I want to filter the SNPs so that there are no more than 3 mismatches per 10 bp window.

I am aware that GATK's VariantFiltration can do this, but due to some unidentified error GATK reports an impossibly high estimated run time (1000 weeks!).

Is there an alternative way to do this? Or has anyone faced the same problem with GATK before?

Any help will be highly valued.

SNP filter mismatch gatk • 1.2k views
3.8 years ago

I wrote this some time ago: http://lindenb.github.io/jvarkit/VariantsInWindow. It should be fast, but I've not used it much, so use it with care.
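For readers who just want to see the idea, here is a minimal Python sketch of the windowed filter described in the question. It is not the jvarkit or GATK implementation. The function names `flag_clustered` and `filter_vcf_lines` are hypothetical, and it assumes an uncompressed VCF read as text lines, that every record counts as one mismatch, and that a "10 bp window" means any span of 10 consecutive positions on the same contig.

```python
from collections import defaultdict


def flag_clustered(variants, window=10, max_per_window=3):
    """Return the set of (chrom, pos) lying in some `window`-bp span
    that holds more than `max_per_window` variants.

    Hypothetical helper for illustration only.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in variants:
        by_chrom[chrom].append(pos)

    flagged = set()
    for chrom, positions in by_chrom.items():
        positions = sorted(set(positions))
        end = 0
        for start in range(len(positions)):
            end = max(end, start)
            # Extend the window that begins at positions[start] as far as
            # it can go while staying shorter than `window` bp.
            while (end + 1 < len(positions)
                   and positions[end + 1] - positions[start] < window):
                end += 1
            if end - start + 1 > max_per_window:
                for k in range(start, end + 1):
                    flagged.add((chrom, positions[k]))
    return flagged


def filter_vcf_lines(lines, window=10, max_per_window=3):
    """Keep header lines and records that are not in a dense cluster."""
    lines = [ln.rstrip("\n") for ln in lines]

    def key(ln):
        chrom, pos = ln.split("\t")[:2]
        return chrom, int(pos)

    records = [ln for ln in lines if ln and not ln.startswith("#")]
    flagged = flag_clustered((key(ln) for ln in records),
                             window, max_per_window)
    return [ln for ln in lines
            if ln.startswith("#") or (ln and key(ln) not in flagged)]
```

Usage would be along the lines of `filter_vcf_lines(open("input.vcf"))`, where `input.vcf` is a placeholder name. The sketch drops every record in an over-dense window rather than tagging it in the FILTER column, which is a design choice made here for brevity.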


Thanks a lot Pierre for your prompt reply.

When I try to run make vcfwindowvariants, it gives the error: make: *** No rule to make target 'vcfwindowvariants'. Stop.

In the tools directory (/jvarkit/src/main/java/com/github/lindenb/jvarkit/tools) I cannot locate vcfwindowvariants.

Please help.


You're right! Sorry, the documentation was wrong. I've updated it: https://github.com/lindenb/jvarkit/commit/d93b31743b4eaa7808ca8f02e89174c288c574cd

make variantsinwindow
