How to select lines (filter) in vcf files that have same genotypes for all the samples?
4.0 years ago
kirannbishwa01 ★ 1.3k

Is there a tool or method for filtering a VCF file in the following manner?

I want to run through a multi-sample VCF file and select the lines/sites that have the same genotype in all the samples.

Thanks,

vcf pyvcf genotype genome sequence • 1.5k views

Hello kirannbishwa01!

It appears that your post has been cross-posted to another site: http://gatkforums.broadinstitute.org/gatk/discussion/comment/39095

This is typically not recommended as it runs the risk of annoying people in both communities.


Thanks for the update. I understand the problem. I reposted the question in this forum because I had not been getting a solution (sometimes no answer, sometimes not the right one), and it is hard to wait a couple of days for an answer when you need to move on with your data analyses.

I would have deleted the question on the GATK forum, but unlike on Biostars that is not possible there: once it's there, it's there unless an admin deletes it. I hope you understand that it was not intentional.

4.0 years ago

Using vcffilterjs http://lindenb.github.io/jvarkit/VCFFilterJS.html

java -jar dist/vcffilterjs.jar -e 'function accept(vc) {for(var i=1;i<vc.getNSamples();i++) if(!vc.getGenotype(0).sameGenotype(vc.getGenotype(i))) return false; return true;}accept(variant); ' input.vcf
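For readers who want to see the logic of that filter outside of jvarkit, here is a minimal sketch in plain Python. It is not the tool's implementation. The function names `gt_field` and `same_genotype_lines` are made up for illustration, and the sketch assumes an uncompressed, tab-delimited VCF in which every record has a FORMAT column containing GT. It compares GT strings literally, so 0/1 and 1/0 (or 0|1) count as different, which may not match how the tool compares genotypes.

```python
def gt_field(sample_column, format_column):
    """Return the GT value of one sample column, using the FORMAT column as the key list."""
    keys = format_column.split(":")
    values = sample_column.split(":")
    return values[keys.index("GT")]


def same_genotype_lines(vcf_lines):
    """Yield header lines, plus records where all samples share one GT string."""
    for line in vcf_lines:
        if line.startswith("#"):
            # Keep meta-information and the column header untouched.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        fmt, samples = fields[8], fields[9:]
        # A set collapses identical GT strings; one element means all samples agree.
        genotypes = {gt_field(sample, fmt) for sample in samples}
        if len(genotypes) == 1:
            yield line


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python same_gt.py < input.vcf > same_gt.vcf
    sys.stdout.writelines(same_genotype_lines(sys.stdin))
```

Reading from standard input keeps the sketch independent of any VCF library; a real pipeline would more likely use a dedicated parser.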


Hi @Pierre: Thanks for the answer. By the way, you suggested this tool yesterday in another question. I could only find the Java source file, not the jar file, and eventually had to give up. Can you please provide a link to the jar file?

Thanks much,


$ git clone "https://github.com/lindenb/jvarkit.git"
$ cd jvarkit
$ make vcffilterjs

The *.jar libraries are not included in the main jar file, so you shouldn’t move them (https://github.com/lindenb/jvarkit/issues/15#issuecomment-140099011 ). The required libraries will be downloaded and installed in the dist directory.


@Pierre: Thanks, it worked. I tried to read your script to see how I could modify it, but couldn't. I want to select the lines that have the same GT, but relax the filter a little when a GT is not called (./.). How would I do it? Say, I would accept 1/1, 1/1, 1/1, 1/1 for 4 samples when 1 other sample is ./. (no call).

Thanks,
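One way to think about the relaxed rule is sketched below in Python. This is an illustration only, not something posted in the thread or taken from jvarkit. The function `concordant_ignoring_no_calls`, the `max_missing` parameter and the `NO_CALLS` set are hypothetical names. The sketch assumes the GT strings have already been extracted from each sample column, and it compares them literally.

```python
# GT strings treated as "no call"; an assumption for diploid and haploid sites.
NO_CALLS = {"./.", ".|.", "."}


def concordant_ignoring_no_calls(genotypes, max_missing=1):
    """Return True if all called genotypes are identical and at most
    max_missing samples are no-calls."""
    called = [gt for gt in genotypes if gt not in NO_CALLS]
    missing = len(genotypes) - len(called)
    if missing > max_missing or not called:
        # Too many no-calls, or nothing was called at all.
        return False
    return len(set(called)) == 1


# The example from the question: four 1/1 samples and one no-call are accepted.
print(concordant_ignoring_no_calls(["1/1", "1/1", "1/1", "1/1", "./."]))
```

Setting `max_missing=0` gives back the strict behaviour, and raising it tolerates more uncalled samples.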