GATK HaplotypeCaller with interval list
2.2 years ago
Mengyao • 0

I am trying to use the -L option of GATK HaplotypeCaller to call SNPs and short InDels within the regions of an interval list. The content of my interval list file (top8snp.interval_list) is as follows:

12 33029845 33030845 + rs24767598
13 40586682 40587682 + rs24748362
18 24373857 24374857 + rs8856159
21 50381146 50382146 + rs8905059
31 24929114 24930114 + rs23686011
33 18169455 18170455 + rs9197329
35 14265916 14266916 + rs23883139
38 6688818 6689818 + rs24017102

The command I used to call variants is:

java -Xmx4g -XX:ParallelGCThreads=1 -XX:ConcGCThreads=1 -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /mnt/software/kit/unicall.kit/gatk-local.jar HaplotypeCaller --native-pair-hmm-threads 1 -R /home/mzhao/data/share/dog_ref/genome/Canis_lupus_familiaris-GCA_014441545.1-unmasked.fa -I IG001_1blood_healthy.dedup.bam -O IG001_1blood_healthy.top8snp.vcf -L ../top8snp.interval_list 2>IG001_1blood_healthyHaplotypeCaller.raw.top8snp.log &

This produces no SNP or InDel calls.

Since I know there are SNPs in these regions, I checked the log file. It contains the following warnings:

WARNING 2022-01-28 15:47:14 IntervalListCodec Ignoring interval for unknown reference: 12:33029845-33030845 + rs24767598
WARNING 2022-01-28 15:47:14 IntervalListCodec Ignoring interval for unknown reference: 13:40586682-40587682 + rs24748362
WARNING 2022-01-28 15:47:14 IntervalListCodec Ignoring interval for unknown reference: 18:24373857-24374857 + rs8856159
WARNING 2022-01-28 15:47:14 IntervalListCodec Ignoring interval for unknown reference: 21:50381146-50382146 + rs8905059
WARNING 2022-01-28 15:47:14 IntervalListCodec Ignoring interval for unknown reference: 31:24929114-24930114 + rs23686011
WARNING 2022-01-28 15:47:14 IntervalListCodec Ignoring interval for unknown reference: 33:18169455-18170455 + rs9197329
WARNING 2022-01-28 15:47:14 IntervalListCodec Ignoring interval for unknown reference: 35:14265916-14266916 + rs23883139
WARNING 2022-01-28 15:47:14 IntervalListCodec Ignoring interval for unknown reference: 38:6688818-6689818 + rs24017102

How should I fix this problem?

Many thanks.

HaplotypeCaller GATK • 1.4k views
2.2 years ago

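An .interval_list file MUST start with a SAM-style header that includes a sequence dictionary (@SQ lines); the file shown above contains only the interval rows. See https://gatk.broadinstitute.org/hc/en-us/articles/360035531852-Intervals-and-interval-lists, which says: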

Picard-style interval files have a SAM-like header that includes a sequence dictionary.
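
One way to add the missing header is to prepend the reference's sequence dictionary to the existing rows. The sketch below is only an illustration, not an official GATK tool. The function names are made up here, the dictionary text is assumed to come from a .dict file (for example one produced by running GATK's CreateSequenceDictionary on the reference FASTA), and the contig length in the demo is a placeholder, not the real value. It also checks that every contig name in the rows appears in the dictionary, since a name missing from the dictionary would presumably still be reported as an unknown reference.

```python
def parse_dict_contigs(dict_text):
    """Return {contig_name: length} from the @SQ lines of a SAM-style dictionary."""
    contigs = {}
    for line in dict_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after "@SQ" looks like TAG:value, e.g. SN:12 or LN:1000.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:])
        contigs[fields["SN"]] = int(fields["LN"])
    return contigs


def build_interval_list(dict_text, rows_text):
    """Prepend the dictionary header to headerless 'contig start end strand name' rows."""
    contigs = parse_dict_contigs(dict_text)
    header = [l for l in dict_text.splitlines() if l.startswith("@")]
    body = []
    for line in rows_text.splitlines():
        if not line.strip():
            continue  # skip blank lines between rows
        contig, start, end, strand, name = line.split()
        if contig not in contigs:
            raise ValueError(f"contig {contig!r} is not in the sequence dictionary")
        if not 1 <= int(start) <= int(end) <= contigs[contig]:
            raise ValueError(f"interval {contig}:{start}-{end} is out of bounds")
        # Picard-style interval rows are tab-separated.
        body.append("\t".join([contig, start, end, strand, name]))
    return "\n".join(header + body) + "\n"


if __name__ == "__main__":
    # Toy dictionary: the LN value is a placeholder, not the real contig length.
    demo_dict = "@HD\tVN:1.6\n@SQ\tSN:12\tLN:40000000\n"
    demo_rows = "12 33029845 33030845 + rs24767598\n"
    print(build_interval_list(demo_dict, demo_rows), end="")
```

In practice you would read the real .dict and the existing rows from disk and write the result back out as top8snp.interval_list. If a contig is rejected, compare the names in your rows with the SN: values in the dictionary, because they must match exactly. Picard's BedToIntervalList, which takes a BED file plus a sequence dictionary, is an existing alternative if you would rather not hand-roll this.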
