Question: Change chromosome notation to match a new reference
umn_bist wrote (3.4 years ago):

I have a BAM file that I would like sorted karyotypically (not lexicographically), but its contig names do not match the reference file provided by GATK. Obtaining the reference that was originally used for alignment, or realigning my sample, is not an option.

My reference uses "1,2,3,...,X,Y,MT" notation but my BAM file uses "chr1,chr2,chr3,...,chrX,chrY,chrM" notation. Is there a way to remove the chr prefix and change chrM to MT in my BAM file? Can I get by with revising only the header, without touching the reads in the BAM file? Thank you for your help!

rna-seq • 3.1k views
Chris Miller (Washington University in St. Louis, MO) wrote (3.4 years ago):

If all your BAMs are this way, it's probably easier to change your reference to match the BAMs. (That's just changing a few lines of a FASTA and rebuilding an index or two.)
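To make the "changing a few lines of a FASTA" step concrete, here is a minimal Python sketch. It assumes an Ensembl-style reference (1, 2, ..., X, Y, MT) that should be renamed to the chr-prefixed style used in the BAM. The function names `to_ucsc_name` and `rename_fasta_headers` are made up for illustration, and only the primary chromosomes are renamed; any other contig is left untouched, since its name in the other convention cannot be derived by simply adding a prefix.

```python
# Hypothetical helper names; only primary chromosomes are mapped.
PRIMARY = {str(i) for i in range(1, 23)} | {"X", "Y"}


def to_ucsc_name(name):
    """Map an Ensembl-style contig name (1, X, MT) to chr-prefixed style."""
    if name == "MT":
        return "chrM"
    if name in PRIMARY:
        return "chr" + name
    return name  # leave unplaced/alternate contigs unchanged


def rename_fasta_headers(lines):
    """Yield FASTA lines, renaming only the contig name in each '>' header."""
    for line in lines:
        if line.startswith(">"):
            # The contig name is the first whitespace-delimited token.
            parts = line[1:].rstrip("\n").split(None, 1)
            if parts:
                parts[0] = to_ucsc_name(parts[0])
                line = ">" + " ".join(parts) + "\n"
        yield line
```

After writing the renamed FASTA, the index files built from it (for example the `.fai` from `samtools faidx` and the sequence dictionary) would need to be regenerated so their names match.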

If you do end up needing to reformat your bams, there is good advice in previous threads:
Human dna reference file with no prefix 'chr'
Bam File: Change Chromosome Notation 
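For the header-only route asked about in the question, a rough Python sketch of the renaming itself is below. It works on the header as plain text (as printed by `samtools view -H`), and the result could then be applied with `samtools reheader`. The function names are hypothetical, and only chr1 to chr22, chrX, chrY and chrM are renamed. One caveat: this changes names only, not contig order or lengths, and optional tags that store contig names as text (such as SA) would not be updated.

```python
import re

# Matches only the primary chromosomes in chr-prefixed notation.
_PRIMARY = re.compile(r"chr(\d{1,2}|X|Y)")


def to_ensembl_name(name):
    """Map chr1/chrX/chrM style names to 1/X/MT; leave other names alone."""
    if name == "chrM":
        return "MT"
    match = _PRIMARY.fullmatch(name)
    return match.group(1) if match else name


def rename_header_contigs(header_text):
    """Return SAM header text with the SN field of each @SQ line renamed."""
    out = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            fields = line.split("\t")
            fields = [
                "SN:" + to_ensembl_name(f[3:]) if f.startswith("SN:") else f
                for f in fields
            ]
            line = "\t".join(fields)
        out.append(line)  # non-@SQ lines pass through unchanged
    return "\n".join(out) + "\n"
```

Because the sketch only touches `@SQ` lines, other header lines (`@HD`, `@RG`, `@PG`) are preserved exactly.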


I had considered this, but for some reason the GATK forum moderator did not recommend this option. If I were to change the reference instead, how would I ensure that its corresponding SNP.vcf still works properly? Can I just leave the SNP.vcf alone?

Also, is there a quick fix for this error: "Discordant contig lengths: read MT LN=16571, ref MT LN=16569"? My RNA-seq was aligned against Ensembl, so the GATK reference is giving me a hard time.


Reply written 3.4 years ago by umn_bist
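On the "Discordant contig lengths" message: a differing LN means the BAM and the reference disagree on the contig's sequence length, which renaming alone cannot change. As a way to see every such mismatch up front, the Python sketch below compares the `@SQ` lengths in a SAM header against a FASTA index (`.fai`, whose first two tab-separated columns are contig name and length). The function names are hypothetical, and the inputs are assumed to be plain text already read from those files.

```python
def sq_lengths(header_text):
    """Return {contig name: length} parsed from @SQ lines of a SAM header."""
    lengths = {}
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            # Each field after the record type has the form TAG:VALUE.
            fields = dict(f.split(":", 1) for f in line.split("\t")[1:])
            lengths[fields["SN"]] = int(fields["LN"])
    return lengths


def fai_lengths(fai_text):
    """Return {contig name: length} from the first two columns of a .fai."""
    lengths = {}
    for line in fai_text.splitlines():
        if line.strip():
            name, length = line.split("\t")[:2]
            lengths[name] = int(length)
    return lengths


def discordant_contigs(bam_lengths, ref_lengths):
    """Return {name: (bam_len, ref_len)} for shared contigs whose lengths differ."""
    return {
        name: (bam_lengths[name], ref_lengths[name])
        for name in bam_lengths
        if name in ref_lengths and bam_lengths[name] != ref_lengths[name]
    }
```

Contigs present on only one side are ignored here; they would need a separate check.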