GATK Mutect2: Input files reference and features have incompatible contigs: No overlapping contigs found.
9 months ago

Hi,

I am following the GATK best practices pipeline for variant calling, starting from targeted sequencing BAM and BAI files and using the hg19 reference. When running GATK Mutect2, I got the following error:

A USER ERROR has occurred: Input files reference and features have incompatible contigs: No overlapping contigs found.
  reference contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chrX, chr8, chr9, chr10, chr11, chr12, chr13,
  chr14, chr15, chr16, chr17, chr18, chr20, chrY, chr19, chr22, chr21, chr6_ssto_hap7, chr6_mcf_hap5,
  chr6_cox_hap2, chr6_mann_hap4, chr6_apd_hap1, chr6_qbl_hap6, chr6_dbb_hap3, chr17_ctg5_hap1,
  chr4_ctg9_hap1, chr1_gl000192_random, chrUn_gl000225, chr4_gl000194_random,
  chr4_gl000193_random, chr9_gl000200_random, chrUn_gl000222, chrUn_gl000212,
  chr7_gl000195_random, chrUn_gl000223, chrUn_gl000224, chrUn_gl000219, chr17_gl000205_random,
  chrUn_gl000215, chrUn_gl000216, chrUn_gl000217, chr9_gl000199_random, chrUn_gl000211,
  chrUn_gl000213, chrUn_gl000220, chrUn_gl000218, chr19_gl000209_random, chrUn_gl000221,
  chrUn_gl000214, chrUn_gl000228, chrUn_gl000227, chr1_gl000191_random, chr19_gl000208_random,
  chr9_gl000198_random, chr17_gl000204_random, chrUn_gl000233, chrUn_gl000237, chrUn_gl000230,
  chrUn_gl000242, chrUn_gl000243, chrUn_gl000241, chrUn_gl000236, chrUn_gl000240,
  chr17_gl000206_random, chrUn_gl000232, chrUn_gl000234, chr11_gl000202_random, chrUn_gl000238,
  chrUn_gl000244, chrUn_gl000248, chr8_gl000196_random, chrUn_gl000249, chrUn_gl000246,
  chr17_gl000203_random, chr8_gl000197_random, chrUn_gl000245, chrUn_gl000247,
  chr9_gl000201_random, chrUn_gl000235, chrUn_gl000239, chr21_gl000210_random, chrUn_gl000231,
  chrUn_gl000229, chrM, chrUn_gl000226, chr18_gl000207_random]
  features contigs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y, MT,
  GL000207.1, GL000226.1, GL000229.1, GL000231.1, GL000210.1, GL000239.1, GL000235.1,
  GL000201.1, GL000247.1, GL000245.1, GL000197.1, GL000203.1, GL000246.1, GL000249.1,
  GL000196.1, GL000248.1, GL000244.1, GL000238.1, GL000202.1, GL000234.1, GL000232.1,
  GL000206.1, GL000240.1, GL000236.1, GL000241.1, GL000243.1, GL000242.1, GL000230.1,
  GL000237.1, GL000233.1, GL000204.1, GL000198.1, GL000208.1, GL000191.1, GL000227.1,
  GL000228.1, GL000214.1, GL000221.1, GL000209.1, GL000218.1, GL000220.1, GL000213.1,
  GL000211.1, GL000199.1, GL000217.1, GL000216.1, GL000215.1, GL000205.1, GL000219.1,
  GL000224.1, GL000223.1, GL000195.1, GL000212.1, GL000222.1, GL000200.1, GL000193.1,
  GL000194.1, GL000225.1, GL000192.1, NC_007605]

And this is the code I have:

# RECALIBRATED is assumed to be set earlier to the path of the recalibrated BAM.
export GENOME="/PATH/Manuel/FILES/HUMAN_REFERENCES/hg19.fa"
export GERM="/PATH/Manuel/FILES/HUMAN_REFERENCES/af-only-gnomad.raw.sites.vcf"
export PON="/PATH/Manuel/FILES/HUMAN_REFERENCES/Mutect2-WGS-panel-b37.vcf"
export VCF="${RECALIBRATED%.bam}.vcf"
srun /mnt/beegfs/apptainer/images/gatk4.sif gatk Mutect2 \
  -R "$GENOME" \
  -I "$RECALIBRATED" \
  --germline-resource "$GERM" \
  --panel-of-normals "$PON" \
  -O "$VCF"

Is there a better PON or germline resource that I can use? And if I have to make the reference contig names and the feature contig names match, how can I do that?

Best Regards,
Manuel

hg19 Mutect2 GATK Variant-Calling • 894 views
Classic problem. You're using two different reference dictionaries: https://www.google.com/search?q=%22No+overlapping+contigs+found.%22+site%3Abiostars.org

I am sorry, which reference dictionaries are you referring to? The reference file has a .dict and a .fai file, which are required for this step (one generated with Picard and the other with samtools faidx).

Dictionaries define the chromosomes, their names, sizes, and order. Here you have a mix of "1" and "chr1".
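
To see the mismatch directly, you can list the contig names that each file declares. The following is only a minimal sketch: it assumes the reference has a samtools faidx index (contig name in column 1 of the .fai file) and that the VCF is uncompressed with ##contig=<ID=...> lines in its header. The function names are made up for illustration and are not part of GATK or samtools.

```shell
# Print contig names from a samtools faidx index (column 1 of the .fai file).
fai_contigs() {
    cut -f1 "$1" | sort
}

# Print contig names declared in an uncompressed VCF header (##contig=<ID=...>).
vcf_contigs() {
    sed -n 's/^##contig=<ID=\([^,>]*\).*/\1/p' "$1" | sort
}

# Print names present in both files; empty output means no overlap.
# Arguments: <reference.fa.fai> <features.vcf>
shared_contigs() {
    tmp=$(mktemp)
    fai_contigs "$1" > "$tmp"
    # grep -Fx keeps only VCF contig names that exactly match a reference name.
    vcf_contigs "$2" | grep -Fx -f "$tmp" || true
    rm -f "$tmp"
}
```

With the files from this thread, a call such as shared_contigs hg19.fa.fai af-only-gnomad.raw.sites.vcf would be expected to print nothing, which is what the error message is reporting.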

Your reference, indexes, and aligned data all have to refer to the same reference build, with the same contig names. You can't mix and match.

Thank you for your responses. I do not have a matched normal for this patient, so I am using "Mutect2-WGS-panel-b37.vcf", which is a panel of normals associated with the reference hg19, and the germline resource is a list of known germline alterations also associated with hg19. They should match. How can I change the names of the reference contigs and feature contigs so that both use either "chr1" or "1", and how can I make the two lists the same size by removing some reference contigs?

Thank you. That should make the names the same. What about the differing sizes of the two lists?
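
One possible way to approach the renaming, sketched under assumptions rather than taken from this thread: build a two-column "old new" map from the b37-style names to the hg19-style names, then pass it to a renaming tool such as bcftools annotate --rename-chrs. The function name make_rename_map and the file names in the comments are hypothetical. The map covers only chromosomes 1-22, X and Y. MT is left out on purpose because, as far as I know, the UCSC hg19 chrM sequence and the b37 MT sequence differ in length, so renaming alone would not make them equivalent. The GL* and NC_007605 contigs are also left out, since their hg19 names cannot be derived from the b37 names alone.

```shell
# Write "old new" pairs mapping b37-style names to hg19-style names
# for chromosomes 1-22, X and Y only (MT and GL* contigs are skipped).
make_rename_map() {
    i=1
    while [ "$i" -le 22 ]; do
        printf '%s chr%s\n' "$i" "$i"
        i=$((i + 1))
    done
    for c in X Y; do
        printf '%s chr%s\n' "$c" "$c"
    done
}

# Hypothetical usage (file names are placeholders):
#   make_rename_map > rename_map.txt
#   bcftools annotate --rename-chrs rename_map.txt -o pon.chr.vcf Mutect2-WGS-panel-b37.vcf
```

On the list sizes: the error text only complains that no contig names overlap at all, not that the two lists differ in length. Whether contigs left unmatched after renaming are tolerated is worth verifying against your GATK version before trimming anything from the reference.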
