Question: GATK HC mis-reading chromosomes
gravatar for Ace
17 days ago by
Ace20 wrote:

My command:

$GATK HaplotypeCaller -R $ref -L $locations -I Bamfiles.txt -ploidy 1

The user error looks like this:

A USER ERROR has occurred: Input files reference and reads have incompatible contigs: No overlapping contigs found. reference contigs = [All of the correct contigs are listed out] reads contigs = []

Meaning, the reads contigs are being read-in as blank

I thought maybe one of my files was aligned incorrectly. I looked at their headers

for file in $(cat Bamfiles.txt); do samtools view -H $file; done | sort | uniq -c > Out.txt

The counts for each chromosomal header were consistent and the same numbers as I have samples. The only thing that looked different: about 1/4 of my samples (from an earlier study) had VN (sam version): 1.3 while the rest had VN: 1.5. Could this be enough to cause GATK to reject my samples and if so is there an easy way to convert between samfile 1.3 and 1.5 or should I re-align them?

ADD COMMENTlink modified 17 days ago • written 17 days ago by Ace20

Hello Ace!

We believe that this post does not fit the main topic of this site.

Error resolved

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree please tell us why in a reply below, we'll be happy to talk about it.


ADD REPLYlink written 17 days ago by Ace20

No need to close the post. That is an action used by moderators for inappropriate, incomplete, off-topic posts.

You can accept your own answer below to provide closure to this thread.

ADD REPLYlink written 17 days ago by genomax67k
gravatar for Ace
17 days ago by
Ace20 wrote:

Update: I've fixed the error, and boy do I feel silly. I totally forgot how essential it is to have your bamfile list end in ".list". Poor GATK was reading my list as a bam file; no wonder it couldn't find any chromosomes! Closing the question now, thanks for your help!

ADD COMMENTlink written 17 days ago by Ace20
gravatar for Pierre Lindenbaum
17 days ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum120k wrote:

the sequence dictionaries (lines in the bam headers starting with @SQ and the lines in the .dict file) are not the same.

and looking at reads contigs = [] , i wonder if there is any dict in your bam files ?

ADD COMMENTlink modified 17 days ago • written 17 days ago by Pierre Lindenbaum120k

Right, so when I looked at this initially my first step was checking that the headers in the bam file were right. Indeed, each line looks something like this @SQ SN:$chromosome LN:$length and the chromosomes correspond correctly to the chromosome in the reference fasta.

I thought it might be an error with a missing index, so I tried re-indexing all of them with

 for file in $(cat $Bamfiles); do echo $file; samtools index $file; done

Unfortunately, this alone did not fix the error. Is there a way to check their content?

ADD REPLYlink modified 17 days ago by genomax67k • written 17 days ago by Ace20
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 814 users visited in the last hour