Question: Error BAM file(s) do not have the contig: hs37d5?
12 months ago, vctrm67 wrote:

I am running GATK Mutect using the b37 reference genome on some BAM files. However, I keep getting this error:

BAM file(s) do not have the contig: hs37d5. You are probably using a different reference than the one this file was aligned with

I double checked that the BAM files were created using b37 by looking at the header of the BAM files, so I'm not sure what's wrong:

@HD VN:1.4  GO:none SO:coordinate 
@SQ SN:1    LN:249250621
@SQ SN:2    LN:243199373
@SQ SN:3    LN:198022430
@SQ SN:4    LN:191154276
@SQ SN:5    LN:180915260
@SQ SN:6    LN:171115067
@SQ SN:7    LN:159138663
@SQ SN:8    LN:146364022
@SQ SN:9    LN:141213431
@SQ SN:10   LN:135534747
@SQ SN:11   LN:135006516
@SQ SN:12   LN:133851895
@SQ SN:13   LN:115169878
@SQ SN:14   LN:107349540
@SQ SN:15   LN:102531392
@SQ SN:16   LN:90354753
@SQ SN:17   LN:81195210
@SQ SN:18   LN:78077248
@SQ SN:19   LN:59128983
@SQ SN:20   LN:63025520
@SQ SN:21   LN:48129895
@SQ SN:22   LN:51304566
@SQ SN:X    LN:155270560
@SQ SN:Y    LN:59373566
@SQ SN:MT   LN:16569
@SQ SN:GL000207.1   LN:4262
@SQ SN:GL000226.1   LN:15008
@SQ SN:GL000229.1   LN:19913
@SQ SN:GL000231.1   LN:27386
@SQ SN:GL000210.1   LN:27682
@SQ SN:GL000239.1   LN:33824
@SQ SN:GL000235.1   LN:34474
@SQ SN:GL000201.1   LN:36148
@SQ SN:GL000247.1   LN:36422
@SQ SN:GL000245.1   LN:36651
@SQ SN:GL000197.1   LN:37175
@SQ SN:GL000203.1   LN:37498
@SQ SN:GL000246.1   LN:38154
@SQ SN:GL000249.1   LN:38502
@SQ SN:GL000196.1   LN:38914
@SQ SN:GL000248.1   LN:39786
@SQ SN:GL000244.1   LN:39929
@SQ SN:GL000238.1   LN:39939
@SQ SN:GL000202.1   LN:40103
@SQ SN:GL000234.1   LN:40531
@SQ SN:GL000232.1   LN:40652
@SQ SN:GL000206.1   LN:41001
@SQ SN:GL000240.1   LN:41933
@SQ SN:GL000236.1   LN:41934
@SQ SN:GL000241.1   LN:42152
@SQ SN:GL000243.1   LN:43341
@SQ SN:GL000242.1   LN:43523
@SQ SN:GL000230.1   LN:43691
@SQ SN:GL000237.1   LN:45867
@SQ SN:GL000233.1   LN:45941
@SQ SN:GL000204.1   LN:81310
@SQ SN:GL000198.1   LN:90085
@SQ SN:GL000208.1   LN:92689
@SQ SN:GL000191.1   LN:106433
@SQ SN:GL000227.1   LN:128374
@SQ SN:GL000228.1   LN:129120
@SQ SN:GL000214.1   LN:137718
@SQ SN:GL000221.1   LN:155397
@SQ SN:GL000209.1   LN:159169
@SQ SN:GL000218.1   LN:161147
@SQ SN:GL000220.1   LN:161802
@SQ SN:GL000213.1   LN:164239
@SQ SN:GL000211.1   LN:166566
@SQ SN:GL000199.1   LN:169874
@SQ SN:GL000217.1   LN:172149
@SQ SN:GL000216.1   LN:172294
@SQ SN:GL000215.1   LN:172545
@SQ SN:GL000205.1   LN:174588
@SQ SN:GL000219.1   LN:179198
@SQ SN:GL000224.1   LN:179693
@SQ SN:GL000223.1   LN:180455
@SQ SN:GL000195.1   LN:182896
@SQ SN:GL000212.1   LN:186858
@SQ SN:GL000222.1   LN:186861
@SQ SN:GL000200.1   LN:187035
@SQ SN:GL000193.1   LN:189789
@SQ SN:GL000194.1   LN:191469
@SQ SN:GL000225.1   LN:211173
@SQ SN:GL000192.1   LN:547496
@SQ SN:NC_007605    LN:171823

Does anyone know why I am getting this error?

12 months ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

"I double checked that the BAM files were created using b37"

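Yes, it's human b37, but it's not the same reference sequence. Your reference FASTA contains hs37d5 (the decoy contig), and the BAM header you posted has no @SQ line for it, so the files were aligned to a b37 build without the decoy. See https://www.cureffi.org/2013/02/01/the-decoy-genome/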
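To see exactly which contigs differ, it can help to compare the @SQ lines of the BAM header with the reference's .fai index. Below is a minimal sketch, assuming you already have the header as text (for example from `samtools view -H`) and the index as text (from `samtools faidx`). The function names and the sample strings are hypothetical, and the byte offsets in the sample index are made up.

```python
def contigs_from_sam_header(header_text):
    """Return {contig_name: length} for every @SQ line in a SAM/BAM text header."""
    contigs = {}
    for line in header_text.splitlines():
        fields = line.split()
        if not fields or fields[0] != "@SQ":
            continue
        # Each remaining field looks like TAG:value, e.g. SN:1 or LN:249250621.
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        contigs[tags["SN"]] = int(tags["LN"])
    return contigs


def contigs_from_fai(fai_text):
    """Return {contig_name: length} from the text of a FASTA .fai index."""
    contigs = {}
    for line in fai_text.splitlines():
        if line.strip():
            # The first two tab-separated columns are contig name and length.
            name, length = line.split("\t")[:2]
            contigs[name] = int(length)
    return contigs


def compare_contigs(bam_contigs, ref_contigs):
    """Report contigs missing on either side and contigs whose lengths disagree."""
    only_in_ref = [n for n in ref_contigs if n not in bam_contigs]
    only_in_bam = [n for n in bam_contigs if n not in ref_contigs]
    length_mismatch = [
        n for n in bam_contigs
        if n in ref_contigs and bam_contigs[n] != ref_contigs[n]
    ]
    return only_in_ref, only_in_bam, length_mismatch


# Illustrative input only: a shortened header and a shortened index.
example_header = (
    "@HD\tVN:1.4\tSO:coordinate\n"
    "@SQ\tSN:1\tLN:249250621\n"
    "@SQ\tSN:MT\tLN:16569\n"
)
example_fai = (
    "1\t249250621\t52\t60\t61\n"
    "MT\t16569\t100\t60\t61\n"
    "hs37d5\t35477943\t200\t60\t61\n"
)

only_in_ref, only_in_bam, length_mismatch = compare_contigs(
    contigs_from_sam_header(example_header),
    contigs_from_fai(example_fai),
)
print("In reference but not in BAM:", only_in_ref)
print("In BAM but not in reference:", only_in_bam)
print("Same name, different length:", length_mismatch)
```

A contig that appears only on the reference side is the kind of mismatch the error message is complaining about.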


I see. Is there a place where I can download the non-decoy b37 reference?

Reply written 12 months ago by vctrm67

It should be pretty straightforward to find online. Search for "hs37d5" and EMBL/NCBI should have the file.

Reply written 12 months ago by RamRS

Won't I get a file that includes hs37d5? Shouldn't I get a file that excludes hs37d5 so I don't get the same error?

Reply written 12 months ago by vctrm67

Sorry, I misread the error. You can search for b37 - you'll find one of Heng Li's blog posts that will take you to a link on NCBI/EMBL/GENCODE. b37 should be easier to find than hs37d5.

Reply written 12 months ago by RamRS
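If downloading a separate reference is inconvenient, another option might be to drop the hs37d5 record from the FASTA you already have. The following is only a sketch under that assumption: the function names and file paths are hypothetical, and it only helps if the remaining contigs (names, lengths, and order) match the BAM header. After filtering, the .fai index and .dict sequence dictionary would need to be regenerated, for example with `samtools faidx` and Picard's CreateSequenceDictionary.

```python
def drop_contigs(fasta_lines, unwanted):
    """Yield FASTA lines, skipping every record whose name is in `unwanted`.

    The record name is the text after '>' up to the first whitespace.
    """
    keep = True
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            keep = name not in unwanted
        if keep:
            yield line


def write_filtered_fasta(in_path, out_path, unwanted):
    """Stream a FASTA file from in_path to out_path without the unwanted records."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(drop_contigs(src, unwanted))


# Hypothetical usage; both paths are placeholders:
# write_filtered_fasta("reference_with_decoy.fasta", "reference_no_decoy.fasta", {"hs37d5"})
```

The file is processed line by line, so the whole genome is never held in memory at once.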