Question: How To Select A Private SNP With GATK From A Multisample VCF File
William (Europe) wrote 7.9 years ago:

I have now run into the situation a couple of times where I need to extract a set of private SNPs from a multisample VCF file.

It is possible with vcf-contrast from VCFtools:

vcf-contrast +sample1 -sample2 -sample3 -n input.vcf > private_sample1.vcf

vcf-contrast -sample1 +sample2 -sample3 -n input.vcf > private_sample2.vcf

vcf-contrast -sample1 -sample2 +sample3 -n input.vcf > private_sample3.vcf
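
To make explicit what "private" means in the commands above, here is a minimal Python sketch. It is not how vcf-contrast is implemented. It assumes an uncompressed VCF with GT as the first FORMAT field, and it treats a site as private to a sample when that sample carries at least one non-reference allele while every other sample is called homozygous reference. The function names and the toy records are made up for illustration.

```python
def carries_alt(gt_field):
    """True if the genotype has at least one called non-reference allele."""
    gt = gt_field.split(":")[0].replace("|", "/")
    return any(a not in ("0", ".") for a in gt.split("/"))


def is_hom_ref(gt_field):
    """True if every allele is called and equals the reference."""
    gt = gt_field.split(":")[0].replace("|", "/")
    return all(a == "0" for a in gt.split("/"))


def private_sites(vcf_lines, target):
    """Yield (CHROM, POS) of sites where only `target` carries an alt allele."""
    samples = []
    for line in vcf_lines:
        if line.startswith("##"):
            continue  # skip meta-information lines
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        genotypes = dict(zip(samples, fields[9:]))
        others = [g for s, g in genotypes.items() if s != target]
        if carries_alt(genotypes[target]) and all(is_hom_ref(g) for g in others):
            yield fields[0], int(fields[1])


# Hypothetical toy data, not taken from any real call set.
toy_vcf = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1\tsample2\tsample3",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t0/0\t0/0",  # private to sample1
    "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT\t0/1\t0/1\t0/0",  # shared with sample2
    "chr1\t300\t.\tG\tA\t50\tPASS\t.\tGT\t1/1\t0/0\t0/0",  # private, homozygous
    "chr1\t400\t.\tT\tC\t50\tPASS\t.\tGT\t0/1\t./.\t0/0",  # sample2 not called
]

sample1_private = list(private_sites(toy_vcf, "sample1"))
print(sample1_private)
```

Whether a no-call in a background sample should disqualify a site is a design choice; in this sketch it does, because a missing genotype cannot confirm that the other sample lacks the allele.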

Surely this must be possible with GATK. Does anyone know how to do this with GATK? I ask because I would like to stay with one tool package.

Maybe it is somewhere in SelectVariants? The --discordance option looked promising, but the documentation says something about the samples having to be the same. Or is it possible to write another variant walker or a JEXL expression?

http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_variantutils_SelectVariants.html#--concordance

http://gatkforums.broadinstitute.org/discussion/1255/what-are-jexl-expressions-and-how-can-i-use-them-with-the-gatk

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote 7.9 years ago:

I think you'll have to use SelectVariants with a JEXL expression: see http://gatkforums.broadinstitute.org/discussion/1255/what-are-jexl-expressions-and-how-can-i-use-them-with-the-gatk

Something like (not tested):

java -Xmx4g -jar GenomeAnalysisTK.jar -T SelectVariants -R seq.fasta --variant my.vcf -select '
 vc.getGenotype("sample1")!=null &&
 vc.getGenotype("sample2")==null &&
 vc.getGenotype("sample3")==null
'
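
Since every sample in a multisample VCF normally has a genotype entry at each site (possibly a no-call), a null check may not separate carriers from non-carriers. One possible variant, sketched below, tests the genotype state instead, using the `isCalled()` and `isHomRef()` genotype methods that are accessible from JEXL. The Python helper `private_jexl` is hypothetical: it only builds the expression string to pass to `-select`, and the resulting expression has not been run against GATK here.

```python
def private_jexl(target, background):
    """Build a JEXL expression: target is called and not hom-ref,
    every background sample is hom-ref."""
    clauses = [
        'vc.getGenotype("%s").isCalled()' % target,
        '!vc.getGenotype("%s").isHomRef()' % target,
    ]
    clauses += ['vc.getGenotype("%s").isHomRef()' % s for s in background]
    return " && ".join(clauses)


# Sample names follow the question; swap in the names from your VCF header.
expr = private_jexl("sample1", ["sample2", "sample3"])
print(expr)
```

Because the expression contains double quotes and `!`, it is safest to wrap it in single quotes on the command line, as in the command above.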
William (Europe) wrote 7.9 years ago:

I got the answer on the GATK forum:

With the SelectVariants walker you have to restrict to biallelic SNPs and then select, with a JEXL expression, all the SNPs where AC==1. AC is the number of times the alternate allele is observed across all the samples.

java -jar GenomeAnalysisTK-2.3-9-ge5ebf34/GenomeAnalysisTK.jar -T SelectVariants -R reference.fa -V input.vcf -o biAllelicPrivate.vcf --restrictAllelesTo BIALLELIC -select "AC==1"
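
One point worth checking with this approach: AC counts alleles, not samples, so `AC==1` selects sites where exactly one sample is heterozygous. A SNP that is homozygous in a single diploid sample has AC=2 and would be skipped, and the output mixes private SNPs from all samples rather than from one chosen sample. The Python sketch below illustrates this on made-up diploid genotypes; `allele_count` and `carriers` are illustrative helpers, not GATK functions.

```python
def allele_count(genotypes):
    """Number of alt (allele '1') calls across all samples,
    which is what AC reports at a biallelic site."""
    return sum(gt.replace("|", "/").split("/").count("1")
               for gt in genotypes.values())


def carriers(genotypes):
    """Names of samples with at least one alt allele."""
    return [s for s, gt in genotypes.items()
            if "1" in gt.replace("|", "/").split("/")]


# Hypothetical genotypes at three biallelic sites.
sites = {
    "het_private": {"sample1": "0/1", "sample2": "0/0", "sample3": "0/0"},
    "hom_private": {"sample1": "1/1", "sample2": "0/0", "sample3": "0/0"},
    "shared":      {"sample1": "0/1", "sample2": "0/1", "sample3": "0/0"},
}

for name, gts in sites.items():
    ac = allele_count(gts)
    private = len(carriers(gts)) == 1
    print(name, "AC=%d" % ac, "private=%s" % private,
          "kept_by_AC==1=%s" % (ac == 1))
```

In this toy data the homozygous private site and the shared site both have AC=2, so no AC threshold alone can tell them apart; a per-genotype test is needed if homozygous private SNPs matter.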
jigarnt (Canada) wrote 5.2 years ago:

Hi,

I am trying to do the same analysis and I would like to know the significance of biallelic SNPs.


Please ask this as a new question.

Reply written 5.2 years ago by Pierre Lindenbaum

Hi Pierre,

I have posted a question in the forum about this. Could you have a look at it?

Reply written 5.2 years ago by jigarnt