Question: different contigs chrM
cristina_sabiers60 wrote:

Hi!

I am trying to call SNPs in order to learn how to do it. I was given my BAM file; I did not make it myself. I also have the SNP and indel report for this BAM as an Excel file, which is why I thought this would be the easiest way for me to learn SNP calling and check whether I am doing it right.

I get this error message (see below).

As I understand it, chrM in the BAM has a different length from chrM in my reference (hg19). Did the people who aligned the file use a different chrM reference from mine, or is there another reason why I get this error? Is there a way to fix it?

I am following this tutorial: What Is The Best Pipeline For Human Whole Exome Sequencing?

Thanks!!! :)

java -jar /home/cri/Desktop/GATK/GenomeAnalysisTK.jar -T RealignerTargetCreator -R hg19.fa -I 010.bam  -o 010.intervals

##### ERROR MESSAGE: Input files reads and reference have incompatible contigs. Please see https://www.broadinstitute.org/gatk/guide/article?id=63for more information. Error details: Found contigs with the same name but different lengths or MD5s:
##### ERROR   contig  reads is named chrM with length 16569
##### ERROR   contig  reference is named chrM with length 16571 and MD5 d2ed829b8a1628d16cbeee88e88e39eb.
##### ERROR   reads contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY, chrM]
##### ERROR   reference contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY, chrM]
alignment • 1.7k views
modified 3.1 years ago by lh331k • written 3.1 years ago by cristina_sabiers60

If you are using the reference files that were used for the alignment, why are you getting the length mismatch error? If you are not using the same copy of hg19 that was used for the alignment, then use that one.

modified 3.1 years ago • written 3.1 years ago by genomax70k

I am not using my own copy. I am using this one instead because I already have the SNP calling results for it, just to learn the steps and check whether I am doing them right. Is that not recommended?

Anyhow, why is there a mismatch in just chrM? Is there any reason for that? (Just so I know.)

Thanks again genomax!

written 3.1 years ago by cristina_sabiers60

As you discovered, you can't do that, since there may be subtle differences in the data (as you found here). In NGS analyses, the copy of the genome (and ALL allied files, e.g. indexes and annotation) has to be identical from start to finish. If you got them from the same source (e.g. iGenomes), then you could use the same files that someone else did.

If you want to use some other copy of the genome/indexes/data, then you need to redo the entire set of alignment steps :-)
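One way to spot this kind of mismatch before running GATK is to compare the contig lengths recorded in the BAM header with those in the reference index. The following is a minimal sketch, not part of the original thread: it assumes the header text comes from `samtools view -H 010.bam` and the index text from the `hg19.fa.fai` file written by `samtools faidx hg19.fa`. The function names and the toy header/index strings are hypothetical and for illustration only.

```python
def parse_sam_header_contigs(header_text):
    """Return {contig_name: length} from the @SQ lines of a SAM/BAM header."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:value (e.g. SN:chrM).
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        contigs[fields["SN"]] = int(fields["LN"])
    return contigs


def parse_fai_contigs(fai_text):
    """Return {contig_name: length} from the text of a .fai index file."""
    contigs = {}
    for line in fai_text.splitlines():
        if not line.strip():
            continue
        # The first two tab-separated columns of a .fai are name and length.
        name, length = line.split("\t")[:2]
        contigs[name] = int(length)
    return contigs


def find_length_mismatches(bam_contigs, ref_contigs):
    """Return {name: (bam_length, ref_length)} for shared contigs that differ."""
    return {
        name: (length, ref_contigs[name])
        for name, length in bam_contigs.items()
        if name in ref_contigs and ref_contigs[name] != length
    }


# Toy inputs (hypothetical, truncated to two contigs) mirroring the error above.
example_header = "@HD\tVN:1.4\n@SQ\tSN:chr1\tLN:249250621\n@SQ\tSN:chrM\tLN:16569\n"
example_fai = "chr1\t249250621\t6\t50\t51\nchrM\t16571\t254235646\t50\t51\n"

mismatches = find_length_mismatches(
    parse_sam_header_contigs(example_header),
    parse_fai_contigs(example_fai),
)
print(mismatches)
```

Any contig reported here means the BAM was aligned against a different copy of the reference than the one being passed with -R.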

modified 3.1 years ago • written 3.1 years ago by genomax70k

Oh... OK! Thanks for the explanation... I didn't think all of this was sooooo tricky O_o

Then I will try with my own BAM file :)

written 3.1 years ago by cristina_sabiers60
lh331k wrote:

Most researchers who study human mitochondria use the revised Cambridge Reference Sequence (rCRS). However, GRCh37 includes a different mitochondrial sequence, which is 2 bp longer. For popgen analyses, the 1000 Genomes Project decided to use rCRS. Broad and a few other institutes adopted the 1000g reference as well. So there are two common chrM sequences: one with 16569 bp (rCRS) and the other with 16571 bp (GRCh37).

GRCh38 follows the convention and uses rCRS. There won't be such confusion for GRCh38.
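As a rough illustration of the two lengths mentioned above, the sketch below maps a chrM length to the sequence it most likely corresponds to. The lookup table contains only the two values given in this answer; the function name and the labels are hypothetical, and a matching length is only a hint, not proof that the sequences are identical.

```python
# chrM lengths taken from the answer above; labels are illustrative only.
KNOWN_CHRM_LENGTHS = {
    16569: "rCRS (1000 Genomes reference, GRCh38)",
    16571: "GRCh37/hg19 chrM",
}


def describe_chrm(length):
    """Return a label for a chrM length, or 'unknown' if it is not in the table."""
    return KNOWN_CHRM_LENGTHS.get(length, "unknown")


# Lengths reported in the GATK error message: reads first, then reference.
for chrm_length in (16569, 16571):
    print(chrm_length, describe_chrm(chrm_length))
```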

written 3.1 years ago by lh331k

Thanks lh3 for the answer!

written 3.1 years ago by cristina_sabiers60