Do I have to realign from scratch?
7.0 years ago
Kasthuri ▴ 300

I am working with human genomes I downloaded from TCGA (all BAM files). I wanted to run GATK DepthOfCoverage, which complained that the contigs are incompatible. I think GATK recommends realigning from scratch. Is there an easier way around this than realigning? I tried Picard's ReorderSam, but it doesn't work either. Is there any way (a script) to just change the contig info in a .bam file? Here is the error message. Thanks!!

##### ERROR MESSAGE: Input files reads and reference have incompatible contigs: No overlapping contigs found.
##### ERROR   reads contigs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y, MT, GL000207.1, GL000226.1, GL000229.1, GL000231.1, GL000210.1, GL000239.1, GL000235.1, GL000201.1, GL000247.1, GL000245.1, GL000197.1, GL000203.1, GL000246.1, GL000249.1, GL000196.1, GL000248.1, GL000244.1, GL000238.1, GL000202.1, GL000234.1, GL000232.1, GL000206.1, GL000240.1, GL000236.1, GL000241.1, GL000243.1, GL000242.1, GL000230.1, GL000237.1, GL000233.1, GL000204.1, GL000198.1, GL000208.1, GL000191.1, GL000227.1, GL000228.1, GL000214.1, GL000221.1, GL000209.1, GL000218.1, GL000220.1, GL000213.1, GL000211.1, GL000199.1, GL000217.1, GL000216.1, GL000215.1, GL000205.1, GL000219.1, GL000224.1, GL000223.1, GL000195.1, GL000212.1, GL000222.1, GL000200.1, GL000193.1, GL000194.1, GL000225.1, GL000192.1, NC_007605, hs37d5]
##### ERROR   reference contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY, chrM]
GATK incompatible contigs • 1.9k views

The best option would be to give GATK the correct reference genome, i.e. the one that was used for the alignment.


Faster: get the reference FASTA that was used to map the reads.
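
To check whether a given FASTA is the one the reads were mapped to, you can compare the @SQ lines of the BAM header (as printed by `samtools view -H`) with the FASTA's `.fai` index. Below is a minimal sketch, assuming both are available as text. The function names are made up for illustration, and the inputs are shortened toy data (the `.fai` offset columns in particular are placeholders).

```python
def header_contigs(sam_header_text):
    """Return {name: length} parsed from the @SQ lines of a SAM header."""
    contigs = {}
    for line in sam_header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:value.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        contigs[fields["SN"]] = int(fields["LN"])
    return contigs


def fai_contigs(fai_text):
    """Return {name: length} from the first two columns of a .fai index."""
    contigs = {}
    for line in fai_text.splitlines():
        if line.strip():
            name, length = line.split("\t")[:2]
            contigs[name] = int(length)
    return contigs


def compare(bam, ref):
    """Split BAM contigs into matching, length-mismatched, and BAM-only names."""
    shared = [n for n in bam if n in ref and bam[n] == ref[n]]
    mismatched = [n for n in bam if n in ref and bam[n] != ref[n]]
    bam_only = [n for n in bam if n not in ref]
    return shared, mismatched, bam_only


# Toy inputs; in practice read the real header and .fai files instead.
header = "@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:1\tLN:249250621\n@SQ\tSN:MT\tLN:16569\n"
fai = "chr1\t249250621\t6\t50\t51\nchrM\t16571\t254235646\t50\t51\n"

shared, mismatched, bam_only = compare(header_contigs(header), fai_contigs(fai))
print("shared:", shared)
print("length mismatch:", mismatched)
print("BAM only:", bam_only)
```

With these toy inputs nothing is shared, which is the "No overlapping contigs found" situation from the error message. A reference that matches the BAM should give an empty "BAM only" list and no length mismatches.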

7.0 years ago
venu 7.1k

It seems the reference genome you are using for this task and the reference genome used for aligning the raw data are different (hg19 and hs37d5 respectively, if my guess is correct). You can add a chr prefix to the chromosome names in your BAM, but I suspect GATK will then complain about the extra contigs. Try the hs37d5 genome; I think it won't complain.
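
For what it's worth, the renaming itself is only a header edit (dump the header with `samtools view -H`, edit it, and write it back with `samtools reheader`). Here is a rough sketch of the edit step, assuming the header is available as text. `add_chr_prefix` is a made-up helper, the MT to chrM mapping is an assumption about the naming, and the contig lengths in the toy header are illustrative only.

```python
import re

# Names of the primary contigs as they appear in the BAM from the question.
PRIMARY = {str(i) for i in range(1, 23)} | {"X", "Y", "MT"}


def add_chr_prefix(sam_header_text):
    """Rename primary contigs in @SQ lines (1 -> chr1, MT -> chrM).

    Returns the new header text and the list of contigs left untouched.
    """
    out, untouched = [], []
    for line in sam_header_text.splitlines():
        if line.startswith("@SQ"):
            name = re.search(r"\tSN:([^\t]+)", line).group(1)
            if name in PRIMARY:
                new = "chrM" if name == "MT" else "chr" + name
                line = line.replace("\tSN:" + name, "\tSN:" + new, 1)
            else:
                # Decoy and unplaced contigs have no chr-style counterpart here.
                untouched.append(name)
        out.append(line)
    return "\n".join(out) + "\n", untouched


# Toy header with illustrative lengths; use the real header in practice.
header = (
    "@SQ\tSN:1\tLN:249250621\n"
    "@SQ\tSN:MT\tLN:16569\n"
    "@SQ\tSN:GL000207.1\tLN:4262\n"
    "@SQ\tSN:hs37d5\tLN:35477943\n"
)

new_header, untouched = add_chr_prefix(header)
print(new_header, end="")
print("left as-is:", untouched)
```

Note that this only fixes the names. GL000207.1, NC_007605, hs37d5 and the like still have no counterpart in the 25-contig reference from the error message, and MT and chrM may not even be the same sequence, so using the hs37d5 FASTA remains the safer route.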


Moved to answer and marked accepted.


Yes, it is indeed hs37d5. Thank you so much, everyone!
