Question: problems in hg19 and b37 compatibility
4.8 years ago by
Germany, Heidelberg, DKFZ EMBL
Nicola Casiraghi440 wrote:

Hi everybody,

I have a BAM file that was aligned to the hg19 reference genome, so its chromosome notation is [chrM, chr1, chr2, chr3, chr4, ..., chrX, chrY].

I want to look for point mutations (PMs) using MuTect, which requires VCF files from dbSNP and COSMIC as input. In these VCF files the chromosome notation is [1,2,3,4,...,MT,X,Y], following the b37 convention.

a) As expected, running MuTect on these files returns the error:

##### ERROR MESSAGE: Input files dbsnp and reference have incompatible contigs: No overlapping contigs found.

b) I successfully reheadered the BAM file to remove 'chr' and change chrM to MT, but the error is now:

##### ERROR MESSAGE: Input files reads and reference have incompatible contigs: Found contigs with the same name but different lengths:
##### ERROR   contig reads = MT / 16571
##### ERROR   contig reference = MT / 16569.
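The error in (b) shows that renaming only changes the label: the contig called MT in the reheadered BAM still has the hg19 chrM length (16571), while the b37 reference MT has length 16569, so the two are not the same sequence. A minimal sketch of checking this before running MuTect, assuming the header text comes from something like `samtools view -H` on the BAM and from the reference `.dict` file. The function names and the toy header strings are made up for illustration; the MT lengths are the ones from the error message.

```python
def parse_sq_lengths(header_text):
    """Return {contig_name: length} from the @SQ lines of a SAM-style header."""
    lengths = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:VALUE
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        lengths[fields["SN"]] = int(fields["LN"])
    return lengths


def length_mismatches(reads, reference):
    """List (contig, reads_length, reference_length) for shared names whose lengths differ."""
    return [
        (name, reads[name], reference[name])
        for name in reads
        if name in reference and reads[name] != reference[name]
    ]


# Toy headers (assumed for illustration): a reheadered hg19 BAM vs. a b37 reference dictionary
bam_header = "@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:1\tLN:249250621\n@SQ\tSN:MT\tLN:16571\n"
ref_dict = "@HD\tVN:1.4\n@SQ\tSN:1\tLN:249250621\n@SQ\tSN:MT\tLN:16569\n"

mismatches = length_mismatches(parse_sq_lengths(bam_header), parse_sq_lengths(ref_dict))
print(mismatches)
```

Any contig reported here cannot be fixed by renaming alone; the reads on it would need to be realigned or lifted over.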

c) My last attempt was to add the 'chr' prefix and rename MT to chrM in the dbSNP and COSMIC VCF files required by MuTect. The error is:

##### ERROR MESSAGE: Invalid command line: No tribble type was provided on the command line and the type of the file could not be determined dynamically.

How can I run MuTect on these bam and vcf files?  

many thanks


Allegedly there's an hg19 to b37 liftOver file provided by GATK in the resource bundle. If nothing else, just use that and call it done (I imagine that'll take a while to run).

Reply written 4.8 years ago by Devon Ryan (88k)

Devon, you should post this as an answer so Nicola can accept it.

Reply written 4.8 years ago by Chris Fields (2.1k)
4.8 years ago by
Chris Fields2.1k
University of Illinois Urbana-Champaign
Chris Fields2.1k wrote:

The two builds are very similar but not exactly the same; IIRC this is a known problem with hg19/b37 (note the difference in chrM size mentioned in the error message). See these links:

Note that the second link mentions the liftover data needed for a hg19->b37 conversion, as @Devon pointed out.  

4.8 years ago by
chan-song0
United States
chan-song0 wrote:

You may use the following steps:

  1. Convert the b37 chromosome notation in those vcf files into the hg19 chromosome notation
  2. Use igvtools (http://www.broadinstitute.org/software/igv/download) to re-create index files for those vcf files
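Step 1 above can be sketched roughly as follows. This is only an illustration under stated assumptions: it renames the primary contigs (1-22, X, Y, MT) in both the `##contig` header lines and the CHROM column, and leaves any other contig name untouched, since unplaced b37 contigs do not map to hg19 names by a simple prefix. The names `PRIMARY`, `to_hg19` and `rename_vcf_line` and the toy records are made up for the example.

```python
import re

# Primary b37 contigs that map to hg19 names by a simple rule (assumption: others are left as-is)
PRIMARY = {str(n) for n in range(1, 23)} | {"X", "Y", "MT"}
_CONTIG_ID = re.compile(r"^(##contig=<ID=)([^,>]+)")


def to_hg19(name):
    """Map a b37 primary contig name to hg19 notation (1 -> chr1, MT -> chrM)."""
    if name not in PRIMARY:
        return name
    return "chrM" if name == "MT" else "chr" + name


def rename_vcf_line(line):
    """Rename the contig in a ##contig header line or in the CHROM column of a record."""
    if line.startswith("##contig=<ID="):
        return _CONTIG_ID.sub(lambda m: m.group(1) + to_hg19(m.group(2)), line)
    if line.startswith("#"):
        return line  # other header lines are unchanged
    chrom, sep, rest = line.partition("\t")
    return to_hg19(chrom) + sep + rest


# Toy VCF lines (made-up records, for illustration only)
b37_lines = [
    "##fileformat=VCFv4.1",
    "##contig=<ID=MT,length=16569>",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tG\tA\t.\t.\t.",
    "MT\t200\t.\tA\tG\t.\t.\t.",
]
hg19_lines = [rename_vcf_line(line) for line in b37_lines]
print("\n".join(hg19_lines))
```

Note that this only relabels contigs. Because the mitochondrial sequence differs in length between the two builds (16571 vs 16569 in the error message above), records on MT will not necessarily line up with hg19 chrM after renaming, and the `length` in the header stays at the b37 value; dropping those records or lifting them over may be safer. The index from step 2 has to be rebuilt after any such edit.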
4.8 years ago by
Germany, Heidelberg, DKFZ EMBL
Nicola Casiraghi440 wrote:

Hi everybody, many thanks for all your help and suggestions.

@Devon, I found files in ftp://gsapubftp-anonymous@ftp.broadinstitute.org/Liftover_Chain_Files and here that should be able to convert b37 to hg19 and vice versa. Unfortunately, and I do not know why, they don't work in my case. I'm working on fixing the problem.

@Chris, thank you for the links, the first one is exactly my problem.


When you say that it doesn't work what exactly do you mean? Do you get an error message? If so, what is it?

Reply written 4.8 years ago by Devon Ryan (88k)

Were you able to solve the problem?

Reply written 2.9 years ago by rse (70)

We ran into this issue a few years ago, and the quickest way we found to solve it was to rerun the analysis with the 'correct' chrM, as suggested in the second link.

Reply written 2.9 years ago by Chris Fields (2.1k)