I am new to Snakemake and have managed to use it for the GATK GenomicsDBImport step, combining 500 genotype VCF files. I now have 200 more genotype VCF files to add, so I tried the GenomicsDBImport --genomicsdb-update-workspace-path argument. The run fails with an error and also deletes my existing database. I know I could combine all 700 genotype VCF files in one go, but I would like to know how to add gVCF files to the existing database incrementally. My script is below.
This is how I tried to update the database:
  # Snakefile
  import os
  # Define the path to the GATK binary and Java options
  GATK_PATH = "/opt/conda/envs/gvcf/bin/gatk"
  JAVA_OPTIONS = "-Xmx16g -Xms16g -XX:ParallelGCThreads=8"
  # Define the sample name map file
  SAMPLE_MAP = "batch1_1.tsv"
  # Define the temporary directory
  TMP_DIR = "/home/tmp"
  # Define the list of chromosomes to process
  CHROMOSOMES = ["21", "22"]
  rule all:
      input:
          expand("/data2/chr{chrom}_db", chrom=CHROMOSOMES)
  rule genomics_db_import:
      input:
          sample_map=SAMPLE_MAP,
      output:
          directory("/data2/chr{chrom}_db"),
      params:
          gatk=GATK_PATH,
          java_options=JAVA_OPTIONS,
          chrom="{chrom}",
          batch_size=50,
          tmp_dir=TMP_DIR,
          reader_threads=20,
          consolidate=True,
      shell:
          "{params.gatk} --java-options '{params.java_options}' \
          GenomicsDBImport \
          --genomicsdb-update-workspace-path {output} \
          --batch-size {params.batch_size} \
          -L chr{params.chrom} \
          --sample-name-map {input.sample_map} \
          --tmp-dir {params.tmp_dir} \
          --reader-threads {params.reader_threads} \
          --consolidate {params.consolidate}"
I then ran it to update the chr21 and chr22 directories:
  snakemake -s genomicdbUpdate.smk --cores 16 & 
The error message is:
  A USER ERROR has occurred: We require an existing valid workspace when incremental import is set
  ***********************************************************************
  Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
  09:12:09.255 INFO  IntervalArgumentCollection - Processing 46709983 bp from intervals
  09:12:09.256 INFO  GenomicsDBImport - Done initializing engine
  09:12:09.257 INFO  GenomicsDBImport - Shutting down engine
  [July 5, 2024 at 9:12:09 AM UTC] org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport done. Elapsed time: 0.01 minutes.
  Runtime.totalMemory()=17179869184
  ***********************************************************************
  A USER ERROR has occurred: We require an existing valid workspace when incremental import is set
  ***********************************************************************
  Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
  [Fri Jul  5 09:12:09 2024]
  Error in rule genomics_db_import:
      jobid: 2
      input: batch1_1.tsv
      output: /data2/chr22_db
      shell:
          /opt/conda/envs/gvcf/bin/gatk --java-options '-Xmx16g -Xms16g -XX:ParallelGCThreads=8'         GenomicsDBImport         --genomicsdb-update-workspace-path /data2/chr22_db         --batch-size 50         -L chr22         --sample-name-map batch1_1.tsv         --tmp-dir /home/tmp         --reader-threads 20         --consolidate True
          (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
  Removing output files of failed job genomics_db_import since they might be corrupted:
  /data2/chr22_db
  [Fri Jul  5 09:12:09 2024]
  Error in rule genomics_db_import:
      jobid: 1
      input: batch1_1.tsv
      output: /data2/chr21_db
      shell:
          /opt/conda/envs/gvcf/bin/gatk --java-options '-Xmx16g -Xms16g -XX:ParallelGCThreads=8'         GenomicsDBImport         --genomicsdb-update-workspace-path /data2/chr21_db         --batch-size 50         -L chr21         --sample-name-map batch1_1.tsv         --tmp-dir /home/tmp         --reader-threads 20         --consolidate True
          (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
  Removing output files of failed job genomics_db_import since they might be corrupted:
  /data2/chr21_db
  Shutting down, this might take some time.
  Exiting because a job execution failed. Look above for error message
  Complete log: .snakemake/log/2024-07-05T091205.769438.snakemake.log
I can run the same command as a standalone shell script without error, but when I run it through Snakemake it does not recognize my existing database (same name and same directory). Does anyone have experience with this? Any advice would be appreciated, thanks.
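Since the error says GATK did not find a valid workspace, one thing worth checking is whether the workspace directories exist right before and right after the Snakemake run. The following is only a minimal sketch, assuming the /data2/chr{chrom}_db layout from the Snakefile above. The helper name missing_workspaces is hypothetical, and it only tests that the directories exist, not that they are valid GenomicsDB workspaces.

```python
import os

# Same chromosome list as in the Snakefile above.
CHROMOSOMES = ["21", "22"]


def missing_workspaces(chromosomes, base_dir="/data2"):
    """Return expected workspace paths that are not existing directories.

    Hypothetical helper: it only checks for directory existence and does
    not verify that the directory is a valid GenomicsDB workspace.
    """
    paths = [os.path.join(base_dir, f"chr{chrom}_db") for chrom in chromosomes]
    return [path for path in paths if not os.path.isdir(path)]


if __name__ == "__main__":
    missing = missing_workspaces(CHROMOSOMES)
    if missing:
        print("Missing workspaces:", ", ".join(missing))
    else:
        print("All workspaces present.")
```

Running this once before starting Snakemake and once after the failed run would show at which point the directories disappear.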