GATK GenomicsDBImport for consolidating GVCFs
28 days ago

Hi, I have assembled a genome to use as a reference. The assembly consists of 36 fragments in a single file called contigs.fa:

NODE_2_length_146463_cov_55.024967
NODE_3_length_259_cov_163.339767
NODE_4_length_3636_cov_339.770905
NODE_5_length_78387_cov_47.698547
NODE_6_length_9697_cov_53.580593
NODE_7_length_27613_cov_55.561802
NODE_8_length_60671_cov_49.392410
... (36 fragments in total)

I would like to consolidate the GVCFs using GenomicsDBImport:

../00.bin/gatk-4.2.0.0/gatk --java-options "-Xmx1g -Xms1g -DGATK_STACKTRACE_ON_USER_EXCEPTION=true"  GenomicsDBImport --sample-name-map raw_vcf_list.txt --genomicsdb-workspace-path rawAssignment3.GDBI  --intervals ../01.data/contigs.fa


raw_vcf_list.txt is the sample name map that I generated beforehand. However, the following error message appears:

A USER ERROR has occurred: Couldn't read file ../01.data/contigs.fa. Error was: The file ../01.data/contigs.fa exists, but does not contain Features (ie., is not in a supported Feature file format such as vcf, bcf, bed, or interval_list), and does not have one of the supported interval file extensions ([.list, .intervals]). Please rename your file with the appropriate extension. If ../01.data/contigs.fa is NOT supposed to be a file, please move or rename the file at location /home/sandra/Downloads/Resequencing_Assigment/03.variationCalling/../01.data/contigs.fa

org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile: Couldn't read file ../01.data/contigs.fa. Error was: The file ../01.data/contigs.fa exists, but does not contain Features (ie., is not in a supported Feature file format such as vcf, bcf, bed, or interval_list), and does not have one of the supported interval file extensions ([.list, .intervals]). Please rename your file with the appropriate extension. If ../01.data/contigs.fa is NOT supposed to be a file, please move or rename the file at location /home/sandra/Downloads/Resequencing_Assigment/03.variationCalling/../01.data/contigs.fa

How should I modify the command so that I can use GenomicsDBImport to consolidate the GVCFs and carry on with joint calling of the cohort? Thanks in advance.


In GATK, the -L (--intervals) option is designed to define intervals (a BED file, an interval_list file, etc.), not, as far as I can see, a list of "fragments". Furthermore, the .fa suffix should be reserved for FASTA files.
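
One way to get such an interval file is to write one BED line per contig, covering each contig from position 0 to its full length, and pass that file to --intervals instead of the FASTA. The following is only a sketch under some assumptions: fasta_to_bed is a hypothetical helper name (not part of GATK), contigs.bed is an arbitrary output name, and the contig names in your GVCF headers are assumed to match the FASTA header names up to the first whitespace. The paths and the other options are copied from the command in the question.

```shell
#!/bin/sh
# Hypothetical helper (not part of GATK): print one BED line per FASTA record,
# "name<TAB>0<TAB>length", so that each contig becomes one whole-contig interval.
fasta_to_bed() {
    awk '
        { sub(/\r$/, "") }              # tolerate Windows line endings
        /^>/ {
            if (name != "") printf "%s\t0\t%d\n", name, len
            name = substr($1, 2)        # header up to first whitespace, without ">"
            len = 0
            next
        }
        { len += length($0) }           # add up the sequence line lengths
        END { if (name != "") printf "%s\t0\t%d\n", name, len }
    ' "$1"
}

FASTA=../01.data/contigs.fa             # path taken from the question

if [ -f "$FASTA" ]; then
    # contigs.bed is an assumed output name; the .bed extension is what matters.
    fasta_to_bed "$FASTA" > contigs.bed

    # Same command as in the question, with --intervals pointing at the BED file.
    ../00.bin/gatk-4.2.0.0/gatk --java-options "-Xmx1g -Xms1g" GenomicsDBImport \
        --sample-name-map raw_vcf_list.txt \
        --genomicsdb-workspace-path rawAssignment3.GDBI \
        --intervals contigs.bed
fi
```

According to the error message itself, a plain text file with a .list or .intervals extension is also accepted, so a file with one contig name per line should be an alternative to the BED file.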


Thank you.