6.0 years ago by
Memorial Sloan Kettering, New York, USA
It's never as simple as "remove chr-prefix with a script". hg19 is UCSC's variant of the official GRCh37 assembly. Early releases of GRCh37, like GRCh37-lite, did not use chr-prefixes, but newer releases like GRCh37.p13 adopted the chr-prefix, and use a newer mitochondrial (MT) sequence than hg19 does. Note also how chrM in hg19 is named MT in GRCh37. And all the unplaced contigs have very different names. So simply removing the chr-prefix in hg19 does not make it GRCh37. It makes it a wholly different chromosome naming convention, which is the last thing we need right now.
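To make the point concrete, here is a minimal sketch of what a naive prefix-stripping script does to a few hg19 sequence names. The function name, the GRCH37_PRIMARY set, and the sample names are illustrative assumptions, not part of any tool mentioned here, and the set covers only the primary chromosomes, so unplaced contigs would need their own lookup.

```python
def strip_chr_prefix(name: str) -> str:
    """Naively drop a leading 'chr' from a sequence name (hypothetical helper)."""
    return name[3:] if name.startswith("chr") else name


# Primary sequence names in the no-prefix GRCh37 convention: 1-22, X, Y, MT.
GRCH37_PRIMARY = {str(i) for i in range(1, 23)} | {"X", "Y", "MT"}

# A few hg19-style names, including the mitochondrion and one unplaced contig.
hg19_names = ["chr1", "chrX", "chrM", "chrUn_gl000220"]

stripped = [strip_chr_prefix(n) for n in hg19_names]

# Names that still do not match a GRCh37 primary sequence after stripping.
unmatched = [n for n in stripped if n not in GRCH37_PRIMARY]
print(unmatched)  # ['M', 'Un_gl000220']
```

The autosomes and sex chromosomes come out fine, but chrM becomes M rather than MT, and the unplaced contig keeps its UCSC-style name. Even if the names were fixed by hand, the mitochondrial sequence itself differs, which renaming alone cannot address.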
Update (Nov 4, 2016): Here is a UCSC Chain mapping UCSC's hg19 to Ensembl's GRCh37.p13 (no chr-prefix), compatible with tools like CrossMap, Remap, or liftOver. Notice how all chromosomes/contigs except chrM only require renaming. Users of vcf2maf with hg19 VCFs as input can pass this into the