Question: GATK variant calling on multiple bam files
4 weeks ago, User000 wrote:

Hello,

I am trying to call variants with GATK on a list of several hundred BAM files, like this:

bob.dup.bam
smith.dup.bam
will.dup.bam

This is the command line:

java -jar /Tools/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar HaplotypeCaller -L chr1:1-2096 -R ref.fasta -I bam/bam_list --min-base-quality-score 20 -O output.out

I get this error:

A USER ERROR has occurred: Input files reference and reads have incompatible contigs: No overlapping contigs found.
  reference contigs = [chr1, chr2, chr3, chr4, chrUn]
  reads contigs = []

However, if I run it on a single BAM file, it works:

java -jar /Tools/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar HaplotypeCaller -L chr1:1-2096 -R ref.fasta -I bam/bob.dup.bam --min-base-quality-score 20 -O output.out

What is the problem, and how can I solve it?


At least one of the BAMs has incompatible contigs in its header. Look through the headers of each BAM to find the incompatible one(s).

written 4 weeks ago by _r_am
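
A minimal sketch of the header check suggested above, assuming samtools is installed. It prints the contig names declared in the @SQ lines of a BAM header, so they can be compared with the reference contigs in the error message. The helper name contig_names is hypothetical, not part of samtools or GATK.

```shell
# Print the contig names (SN: fields) from the @SQ lines of a SAM/BAM
# header read on stdin. contig_names is a hypothetical helper name.
contig_names() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)   # drop the "SN:" prefix
    }'
}

# Usage, assuming samtools is on PATH and bam/bam_list holds one path per line:
# while IFS= read -r bam; do
#     echo "== $bam"
#     samtools view -H "$bam" | contig_names
# done < bam/bam_list
```

A BAM whose output differs from the reference contigs (chr1, chr2, chr3, chr4, chrUn in the question) would be the one to look at.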

I checked the headers and they are fine; besides, every BAM works when I run it on its own (for now I am testing with 5 files). I suspect GATK is not recognizing the list of BAM files?

modified 4 weeks ago • written 4 weeks ago by User000

You're right, -I doesn't seem to take a file, just a string of BAM names. Try either a comma-separated list of names or a glob (like *.bam).

written 4 weeks ago by _r_am

I'm not sure the latest version of HC is able to read a '*.list' suffix (this would be a bug)

Can you please try

(..) `awk '{printf(" -I %s ",$0);}'  bam/bam_list  ` (...)

instead of

(..) -I bam/bam_list  (...)
written 4 weeks ago by Pierre Lindenbaum
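
The awk one-liner above relies on unquoted command substitution, which splits on whitespace and would break on paths containing spaces. A hedged alternative sketch that appends one -I per non-empty line of the list file and then runs the command is below. The helper name run_with_inputs is hypothetical; the only assumption about GATK is that -I may be repeated, which is what the awk suggestion already relies on.

```shell
# Append "-I <path>" for every non-empty line of a list file to a command,
# then run it. run_with_inputs is a hypothetical helper name.
# Note: the variables list_file and line are left set in the calling shell.
run_with_inputs() {
    list_file=$1
    shift                                  # "$@" is now the command to run
    while IFS= read -r line || [ -n "$line" ]; do
        [ -z "$line" ] && continue         # skip blank lines
        set -- "$@" -I "$line"             # keep each path as a single argument
    done < "$list_file"
    "$@"
}

# Usage, with the paths from the question:
# run_with_inputs bam/bam_list \
#     java -jar /Tools/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar HaplotypeCaller \
#     -L chr1:1-2096 -R ref.fasta --min-base-quality-score 20 -O output.out
```

Replacing `java` with `echo java` in the usage line prints the expanded command instead of running it, which is a cheap way to check what GATK would receive.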

Oh... I realised what was wrong after your answer, Pierre... My BAM list file was named bam_list (which works just fine for freebayes and bcftools), but GATK wants bam.list... Now it seems to be working... A very stupid mistake, and my whole pipeline was not working for days... :)

written 4 weeks ago by User000
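
A small sketch of the fix described above, assuming (as the poster observed) that GATK only reads a file as a list of inputs when its name ends in .list. The helper name ensure_list_suffix is hypothetical; it copies the list to a .list name when needed and prints the path to pass to -I.

```shell
# Print a path ending in ".list" for the given list file, creating a copy
# next to the original if it has a different name.
# ensure_list_suffix is a hypothetical helper name.
ensure_list_suffix() {
    case $1 in
        *.list) printf '%s\n' "$1" ;;                       # already fine
        *)      cp -- "$1" "$1.list" && printf '%s\n' "$1.list" ;;
    esac
}

# Usage, with the paths from the question:
# java -jar /Tools/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar HaplotypeCaller \
#     -L chr1:1-2096 -R ref.fasta -I "$(ensure_list_suffix bam/bam_list)" \
#     --min-base-quality-score 20 -O output.out
```

Copying rather than renaming keeps the original bam_list in place for tools such as freebayes and bcftools that already use it.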

All it took was changing the file name from bam_list to bam.list?

written 4 weeks ago by _r_am

Yeah... I am not sure this particularity is specified in the GATK documentation... or maybe I was not careful enough.

written 4 weeks ago by User000