Question: GATK HC mis-reading chromosomes
gravatar for Ace
8 months ago by
Ace70 wrote:

My command:

$GATK HaplotypeCaller -R $ref -L $locations -I Bamfiles.txt -ploidy 1

The user error looks like this:

A USER ERROR has occurred: Input files reference and reads have incompatible contigs: No overlapping contigs found. reference contigs = [All of the correct contigs are listed out] reads contigs = []

Meaning, the reads contigs are being read-in as blank

I thought maybe one of my files was aligned incorrectly. I looked at their headers

for file in $(cat Bamfiles.txt); do samtools view -H $file; done | sort | uniq -c > Out.txt

The counts for each chromosomal header were consistent and the same numbers as I have samples. The only thing that looked different: about 1/4 of my samples (from an earlier study) had VN (sam version): 1.3 while the rest had VN: 1.5. Could this be enough to cause GATK to reject my samples and if so is there an easy way to convert between samfile 1.3 and 1.5 or should I re-align them?

ADD COMMENTlink modified 8 months ago • written 8 months ago by Ace70

Hello Ace!

We believe that this post does not fit the main topic of this site.

Error resolved

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree please tell us why in a reply below, we'll be happy to talk about it.


ADD REPLYlink written 8 months ago by Ace70

No need to close the post. That is an action used by moderators for inappropriate, incomplete, off-topic posts.

You can accept your own answer below to provide closure to this thread.

ADD REPLYlink written 8 months ago by genomax76k
gravatar for Ace
8 months ago by
Ace70 wrote:

Update: I've fixed the error, and boy do I feel silly. I totally forgot how essential it is to have your bamfile list end in ".list". Poor GATK was reading my list as a bam file; no wonder it couldn't find any chromosomes! Closing the question now, thanks for your help!

ADD COMMENTlink written 8 months ago by Ace70
gravatar for Pierre Lindenbaum
8 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum125k wrote:

the sequence dictionaries (lines in the bam headers starting with @SQ and the lines in the .dict file) are not the same.

and looking at reads contigs = [] , i wonder if there is any dict in your bam files ?

ADD COMMENTlink modified 8 months ago • written 8 months ago by Pierre Lindenbaum125k

Right, so when I looked at this initially my first step was checking that the headers in the bam file were right. Indeed, each line looks something like this @SQ SN:$chromosome LN:$length and the chromosomes correspond correctly to the chromosome in the reference fasta.

I thought it might be an error with a missing index, so I tried re-indexing all of them with

 for file in $(cat $Bamfiles); do echo $file; samtools index $file; done

Unfortunately, this alone did not fix the error. Is there a way to check their content?

ADD REPLYlink modified 8 months ago by genomax76k • written 8 months ago by Ace70
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1806 users visited in the last hour