Indexed FASTA file for HaplotypeCaller
11 months ago
Mojtaba • 0

I am going to use HaplotypeCaller to realign indels after some preprocessing steps. HaplotypeCaller asks for an indexed FASTA reference (<ref.fasta.fai>) and suggests creating it with samtools faidx. When I run samtools faidx, it finishes very quickly and produces a fasta.fai file. But when I pass that file to HaplotypeCaller, I get a new error. Can anyone help me?

    (base) mojtaba@Mojtaba:~/Desktop/BAM$ java -jar /home/mojtaba/Desktop/BAM/gatk- HaplotypeCaller -R GRCh38_latest_genomic.fna -I marked_duplicates.sam -O realigned.vcf.gz -bamout completelyprocessedSAM.bam
15:50:37.491 INFO  NativeLibraryLoader - Loading from jar:file:/home/mojtaba/Desktop/BAM/gatk-!/com/intel/gkl/native/
15:50:37.690 INFO  HaplotypeCaller - ------------------------------------------------------------
15:50:37.693 INFO  HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.5.0.0
15:50:37.693 INFO  HaplotypeCaller - For support and documentation go to
15:50:37.694 INFO  HaplotypeCaller - Executing as mojtaba@Mojtaba on Linux v6.5.0-21-generic amd64
15:50:37.694 INFO  HaplotypeCaller - Java runtime: OpenJDK 64-Bit Server VM v17.0.10+7-Ubuntu-122.04.1
15:50:37.694 INFO  HaplotypeCaller - Start Date/Time: March 7, 2024 at 3:50:37 PM IRST
15:50:37.694 INFO  HaplotypeCaller - ------------------------------------------------------------
15:50:37.694 INFO  HaplotypeCaller - ------------------------------------------------------------
15:50:37.696 INFO  HaplotypeCaller - HTSJDK Version: 4.1.0
15:50:37.696 INFO  HaplotypeCaller - Picard Version: 3.1.1
15:50:37.696 INFO  HaplotypeCaller - Built for Spark Version: 3.5.0
15:50:37.697 INFO  HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
15:50:37.697 INFO  HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
15:50:37.697 INFO  HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
15:50:37.698 INFO  HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
15:50:37.698 INFO  HaplotypeCaller - Deflater: IntelDeflater
15:50:37.698 INFO  HaplotypeCaller - Inflater: IntelInflater
15:50:37.699 INFO  HaplotypeCaller - GCS max retries/reopens: 20
15:50:37.699 INFO  HaplotypeCaller - Requester pays: disabled
15:50:37.700 INFO  HaplotypeCaller - Initializing engine
15:50:37.704 INFO  HaplotypeCaller - Shutting down engine
[March 7, 2024 at 3:50:37 PM IRST] done. Elapsed time: 0.00 minutes.

A USER ERROR has occurred: Fasta index file file:///home/mojtaba/Desktop/BAM/GRCh38_latest_genomic.fna.fai for reference file:///home/mojtaba/Desktop/BAM/GRCh38_latest_genomic.fna does not exist. Please see for help creating it.

Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.

    mojtaba@Mojtaba:~/Desktop/BAM$ samtools faidx GRCh38_latest_genomic.fna
    mojtaba@Mojtaba:~/Desktop/BAM$ java -jar /home/mojtaba/Desktop/BAM/gatk- HaplotypeCaller -R GRCh38_latest_genomic.fasta.fai -I marked_duplicates.sam -O realigned.vcf.gz -bamout completelyprocessedSAM.bam

15:54:54.231 INFO  NativeLibraryLoader - Loading from jar:file:/home/mojtaba/Desktop/BAM/gatk-!/com/intel/gkl/native/
15:54:54.458 INFO  HaplotypeCaller - ------------------------------------------------------------
15:54:54.461 INFO  HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.5.0.0
15:54:54.462 INFO  HaplotypeCaller - For support and documentation go to
15:54:54.462 INFO  HaplotypeCaller - Executing as mojtaba@Mojtaba on Linux v6.5.0-21-generic amd64
15:54:54.462 INFO  HaplotypeCaller - Java runtime: OpenJDK 64-Bit Server VM v17.0.10+7-Ubuntu-122.04.1
15:54:54.463 INFO  HaplotypeCaller - Start Date/Time: March 7, 2024 at 3:54:54 PM IRST
15:54:54.463 INFO  HaplotypeCaller - ------------------------------------------------------------
15:54:54.463 INFO  HaplotypeCaller - ------------------------------------------------------------
15:54:54.465 INFO  HaplotypeCaller - HTSJDK Version: 4.1.0
15:54:54.465 INFO  HaplotypeCaller - Picard Version: 3.1.1
15:54:54.465 INFO  HaplotypeCaller - Built for Spark Version: 3.5.0
15:54:54.466 INFO  HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
15:54:54.466 INFO  HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
15:54:54.467 INFO  HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
15:54:54.467 INFO  HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
15:54:54.468 INFO  HaplotypeCaller - Deflater: IntelDeflater
15:54:54.468 INFO  HaplotypeCaller - Inflater: IntelInflater
15:54:54.468 INFO  HaplotypeCaller - GCS max retries/reopens: 20
15:54:54.468 INFO  HaplotypeCaller - Requester pays: disabled
15:54:54.469 INFO  HaplotypeCaller - Initializing engine
15:54:54.474 INFO  HaplotypeCaller - Shutting down engine
[March 7, 2024 at 3:54:54 PM IRST] done. Elapsed time: 0.01 minutes.
java.lang.IllegalArgumentException: File is not a supported reference file type: /home/mojtaba/Desktop/BAM/GRCh38_latest_genomic.fasta.fai
    at htsjdk.samtools.reference.ReferenceSequenceFileFactory.lambda$getFastaExtension$0(
    at java.base/java.util.Optional.orElseGet(
    at htsjdk.samtools.reference.ReferenceSequenceFileFactory.getFastaExtension(
    at htsjdk.samtools.reference.ReferenceSequenceFileFactory.getDefaultDictionaryForReferenceSequence(
    at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.checkFastaPath(
    at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.<init>(
    at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.<init>(
    at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.<init>(
    at org.broadinstitute.hellbender.engine.ReferenceFileSource.<init>(
    at org.broadinstitute.hellbender.engine.ReferenceDataSource.of(
    at org.broadinstitute.hellbender.engine.GATKTool.initializeReference(
    at org.broadinstitute.hellbender.engine.GATKTool.onStartup(
    at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.onStartup(
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(
    at org.broadinstitute.hellbender.Main.mainEntry(
    at org.broadinstitute.hellbender.Main.main(
Samtools HaplotypeCaller GATK faidx • 1.1k views

A USER ERROR has occurred: Fasta index file file:///home/mojtaba/Desktop/BAM/GRCh38_latest_genomic.fna.fai for reference file:///home/mojtaba/Desktop/BAM/GRCh38_latest_genomic.fna does

Please, what is the output of:

ls -lah /home/mojtaba/Desktop/BAM/GRCh38_latest_genomic.*
11 months ago
Arton ▴ 20

You should use "-R GRCh38_latest_genomic.fasta" instead of "-R GRCh38_latest_genomic.fasta.fai". The "-R" parameter takes the reference file itself, not its index. GATK finds the index automatically by looking for the reference file name with ".fai" appended.
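
As a rough illustration of the point above, here is a minimal shell sketch. The two helper functions (`reference_for_gatk`, `has_fasta_index`) are hypothetical names made up for this example and are not part of GATK or samtools. The commented usage lines assume that samtools and the `gatk` wrapper script are on your PATH, and they use the .fna file name that appears in the error message in the question.

```shell
#!/bin/sh
# Hypothetical helper (not part of GATK or samtools): print the path to give to -R.
# If a .fai index path is passed by mistake, strip the suffix to get the FASTA itself.
reference_for_gatk() {
  case "$1" in
    *.fai) printf '%s\n' "${1%.fai}" ;;
    *)     printf '%s\n' "$1" ;;
  esac
}

# Hypothetical helper: succeed only if the FASTA and its <fasta>.fai index both exist.
has_fasta_index() {
  [ -f "$1" ] && [ -f "$1.fai" ]
}

# Usage sketch (assumes samtools and the gatk wrapper are installed and on PATH):
#   ref=$(reference_for_gatk GRCh38_latest_genomic.fna.fai)   # GRCh38_latest_genomic.fna
#   has_fasta_index "$ref" || samtools faidx "$ref"
#   gatk HaplotypeCaller -R "$ref" -I marked_duplicates.sam -O realigned.vcf.gz
```

The idea is only that the index is built once next to the reference and then left alone; the command line always names the FASTA.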


The GATK log shows that the OP first used: java -jar /home/mojtaba/Desktop/BAM/gatk- HaplotypeCaller -R GRCh38_latest_genomic.fna -I marked_duplicates.sam -O realigned.vcf.gz -bamout completelyprocessedSAM.bam


This is what he did next:

$ samtools faidx GRCh38_latest_genomic.fna

$ java -jar /home/mojtaba/Desktop/BAM/gatk- HaplotypeCaller -R GRCh38_latest_genomic.fasta.fai -I marked_duplicates.sam -O realigned.vcf.gz -bamout completelyprocessedSAM.bam


Thank you. It's working.

