vcfeval Error: No sample name provided but calls is a multi-sample VCF
5.8 years ago
simo017 ▴ 10

Hi, I am trying to compare a group of VCF files all together against the GIAB VCF file. When I compared them against the GIAB VCF one by one, everything went fine, but when I merged them and then compared the merged file against the GIAB VCF, I got an error that says: No sample name provided but calls is a multi-sample VCF.

I found a similar post here but with no clue:

Here are my commands. To merge them:

 ./rtg vcfmerge -o merged_extra.vcf.gz full_variant_table_3.vcf.gz full_variant_table_4.vcf.gz full_variant_table_5.vcf.gz full_variant_table_6.vcf.gz full_variant_table_7.vcf.gz full_variant_table_8.vcf.gz

For the comparison:

./rtg vcfeval -t /home/variants/1000g_v37_phase2.sdf -b /home/rtg-tools-3.8.4/GIAB_MY_region.vcf.gz -c /home/variants/vcf_files/merged_extra.vcf.gz -o /home/variants/vcf_result

This gives: Error: No sample name provided but calls is a multi-sample VCF.

I tried the --sample=<calls_samplename> parameter, but I still get the same error:

./rtg vcfeval -t /home/variants/1000g_v37_phase2.sdf -b /home/rtg-tools-3.8.4/GIAB_MY_region.vcf.gz -c /home/variants/vcf_files/merged_extra.vcf.gz -o /home/variants/vcf_result --sample GIAB_MY_region,merged_extra

Any help is much appreciated.

vcfeval vcfmerge vcf genome GIAB • 3.6k views
5.8 years ago
geocarvalho ▴ 350

Does the sample label in your VCF header have the same name as the file? For example:

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT GIAB_MY_region

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT merged_extra

Or is it something else? The --sample argument expects the label present in the VCF header, not the file name.
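In case it helps, here is a minimal sketch of how to check which labels a VCF actually carries. It builds a small toy file (made up for illustration, not your data) and prints everything from column 10 of the #CHROM line onward, which is where the sample labels are. For your case you would point the last pipeline at merged_extra.vcf.gz instead.

```shell
# Toy two-sample VCF, made up for illustration (fields are tab-separated).
tmpdir=$(mktemp -d)
printf '##fileformat=VCFv4.2\n' > "$tmpdir/toy.vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSample_1\tSample_2\n' >> "$tmpdir/toy.vcf"
printf '1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t1/1\n' >> "$tmpdir/toy.vcf"
gzip "$tmpdir/toy.vcf"

# Columns 10 onward of the #CHROM line are the sample labels, one per sample.
samples=$(gzip -dc "$tmpdir/toy.vcf.gz" | grep -m1 '^#CHROM' | cut -f10- | tr '\t' '\n')
echo "$samples"
```

Whatever this prints is what goes after the comma in --sample (the part before the comma being the label from the baseline VCF).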

best regards.

5.8 years ago
simo017 ▴ 10

Hi geocarvalho, Thank you for your answer.


About your other comment: if you have a VCF with multiple samples, you must choose one at a time. For example:

  • For a header: #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample_1 Sample_2 Sample_3

command: ./rtg vcfeval --baseline=reference.vcf.gz --bed-regions=roi.bed -c merged_extra.vcf.gz -o multi_vcf -t /path/to/genome.fasta.sdf --sample=Reference_label,Sample_2

  • But if your merged VCF is really the same sample several times over, you should rename the sample label in your initial VCFs to the same label before merging. For example:

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample_1

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample_2

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample_3

Should be:

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample_1

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample_1

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample_1

Then re-run rtg vcfmerge.

  • Or you can simply use GATK CombineVariants with the genotypeMergeOptions PRIORITIZE parameter (which is what I recommend); then it isn't necessary to change the sample label in the VCF headers.

  • And then you can run rtg vcfeval: ./rtg vcfeval --baseline=reference.vcf.gz --bed-regions=roi.bed -c merged_extra.vcf.gz -o multi_vcf -t /path/to/genome.fasta.sdf --sample=Reference_label,Sample_1
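For the renaming option above, a rough sketch with plain awk, assuming an uncompressed single-sample VCF. The file names and the toy record are made up for illustration; only the Sample_2 to Sample_1 relabelling comes from the example headers above.

```shell
# Toy single-sample VCF labelled Sample_2, made up for illustration.
tmpdir=$(mktemp -d)
printf '##fileformat=VCFv4.2\n' > "$tmpdir/in.vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSample_2\n' >> "$tmpdir/in.vcf"
printf '1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\n' >> "$tmpdir/in.vcf"

# Replace column 10 of the #CHROM line with the shared label; all other lines pass through unchanged.
new_label="Sample_1"
awk -v label="$new_label" 'BEGIN { FS = OFS = "\t" }
  /^#CHROM/ { $10 = label }
  { print }' "$tmpdir/in.vcf" > "$tmpdir/renamed.vcf"

renamed_header=$(grep '^#CHROM' "$tmpdir/renamed.vcf")
echo "$renamed_header"
```

As far as I know, RTG expects block-compressed, indexed VCFs, so the renamed file would probably still need compressing and indexing again before it goes back into rtg vcfmerge.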
