How to subset a GVCF using GATK SelectVariants
22 months ago
ttom ▴ 220

Hi All,

I am trying to subset a multi-sample GVCF to a GVCF with a smaller number of samples, and I am not getting the expected results.

First command used:

gatk SelectVariants --java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true' -R Homo_sapiens_assembly38.fasta --variant combined.g.vcf --sample-name subset_samples.txt -O subset_combined.g.vcf

The error received was the following, even though all the samples listed in the file subset_samples.txt are present in the input VCF:

   A USER ERROR has occurred: Bad input: Samples entered on command line (through -sf or -sn) that are not present in the VCF
   A list of these samples: subset_samples.txt

    To ignore these samples, run with --allow-nonoverlapping-command-line-samples

Second command used:

gatk SelectVariants --java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true' -R Homo_sapiens_assembly38.fasta --variant combined.g.vcf --sample-name subset_samples.txt -O subset_combined.g.vcf --allow-nonoverlapping-command-line-samples

And the issue:

The output VCF still has all sample names.

I am not sure what I am missing in the commands to get the right output.

GATK SelectVariants • 1.4k views

Are you sure it's safe to reduce the number of samples in a g.vcf file (!= vcf)?

22 months ago

https://gatk.broadinstitute.org/hc/en-us/articles/360037055952-SelectVariants#--sample-name

--sample-name / -sn This argument can be specified multiple times in order to provide multiple sample names, or to specify the name of one or more files containing sample names. File names must use the extension ".args", and the expected file format is simply plain text with one sample name per line. Note that sample exclusion takes precedence over inclusion, so that if a sample is in both lists it will be excluded.

Did you try subset_samples.args instead of subset_samples.txt?
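
A minimal sketch of that change, assuming the same file names as the first command in the question. The two sample names written to subset_samples.txt are hypothetical placeholders; use the names from your own GVCF header. As the error message in the question shows, a value that does not end in ".args" is treated as a literal sample name rather than as a list file.

```shell
# Hypothetical sample names, for illustration only.
printf '%s\n' sampleA sampleB > subset_samples.txt

# Give the list file the ".args" extension that --sample-name expects
# for files (plain text, one sample name per line).
cp subset_samples.txt subset_samples.args

# Same command as in the question, with only the list file name changed.
# The guard skips the call when gatk is not on PATH.
if command -v gatk >/dev/null 2>&1; then
  gatk SelectVariants \
    -R Homo_sapiens_assembly38.fasta \
    --variant combined.g.vcf \
    --sample-name subset_samples.args \
    -O subset_combined.g.vcf
fi
```

With the correct extension, --allow-nonoverlapping-command-line-samples should not be needed, provided every listed sample really is in the input.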


Yes, the wrong file extension was the problem. It works with subset_samples.args. Thank you!
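
For anyone who wants to confirm that the output contains only the requested samples, here is a small sketch that reads the sample names from the header. It relies only on the VCF layout, where the first nine columns of the #CHROM line are fixed and the sample names start at column 10. The function name list_vcf_samples is made up for this example, and it assumes an uncompressed file.

```shell
# Print the sample names of an uncompressed VCF/GVCF, one per line.
# Columns 1-9 of the #CHROM header line are fixed; samples start at column 10.
list_vcf_samples() {
  grep -m1 '^#CHROM' "$1" | cut -f10- | tr '\t' '\n'
}

# Usage with the file names from this thread:
#   list_vcf_samples subset_combined.g.vcf | sort > got.txt
#   sort subset_samples.args | diff - got.txt && echo "sample sets match"
```

For a bgzipped file, decompress to a stream first (for example with zcat) and read the header from that instead.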
