Hi, I'm trying to run novoBreak with a tumor and a normal BAM file.
The BAM files are 'chr19.tumor.bam' and 'chr19.normal.bam' from the ICGC-TCGA DREAM Mutation Calling challenge
(URL: https://www.synapse.org/#!Synapse:syn2335184)
The reference file is 'Homo_sapiens.GRCh37.dna.chromosome.19.fa'
(URL: ftp://ftp.ensembl.org/pub/grch37/current/fasta/homo_sapiens/dna/)
The last error messages are these:
----------
[main_samview] region "chrY:13119970-13120970" specifies an unknown reference name. Continue anyway.
[main_samview] region "chrY:13119970-13120970" specifies an unknown reference name. Continue anyway.
[main_samview] region "chrX:40529224-40530224" specifies an unknown reference name. Continue anyway.
[main_samview] region "chr11:51592691-51593691" specifies an unknown reference name. Continue anyway.
[main_samview] region "chrX:40529224-40530224" specifies an unknown reference name. Continue anyway.
[main_samview] region "chr11:51592691-51593691" specifies an unknown reference name. Continue anyway.
[main_samview] region "chr19:54035860-54036860" specifies an unknown reference name. Continue anyway.
[main_samview] region "chr19:54035860-54036860" specifies an unknown reference name. Continue anyway
[main_samview] region "chrY:59013259-59014259" specifies an unknown reference name. Continue anyway.
[main_samview] region "chrY:59013259-59014259" specifies an unknown reference name. Continue anyway.
[main_samview] region "chrX:61728197-61729197" specifies an unknown reference name. Continue anyway.
[main_samview] region "chr11:48866923-48867923" specifies an unknown reference name. Continue anyway.
[main_samview] region "chr8:27951646-27952646" specifies an unknown reference name. Continue anyway.
[main_samview] region "chr11:48866923-48867923" specifies an unknown reference name. Continue anyway.
[main_samview] region "chrY:10044249-10045249" specifies an unknown reference name. Continue anyway.
[main_samview] region "chr8:27951646-27952646" specifies an unknown reference name. Continue anyway.
[main_samview] region "chrY:10044249-10045249" specifies an unknown reference name. Continue anyway.
[main_samview] region "chr2:132968088-132969088" specifies an unknown reference name. Continue anyway.
[main_samview] region "chr2:132968088-132969088" specifies an unknown reference name. Continue anyway.
----------
I thought the problem was a mismatch between the reference and the BAM files. So I converted the BAM files to FASTQ, aligned the reads to my reference to get SAM files, and finally converted those to BAM. But I got an error message similar to the previous one.
I don't know what is wrong. Could you give me some advice?
Thank you.
The reference FASTA names the chromosomes with bare numbers (1, 2, 3) instead of chr1, chr2, chr3, while the regions in your error messages use the chr prefix. You can change the names in the FASTA with awk, as previously suggested.
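The original command did not come through in this post, so here is a minimal sketch of what such an awk rename could look like. It is not necessarily the command suggested earlier. The function name add_chr_prefix and the output file name are made up for illustration, and the sketch assumes Ensembl-style headers such as ">19 dna:chromosome ...", where the sequence name is the first whitespace-separated field. It keeps only that name, prefixed with "chr", and leaves already-prefixed names alone.

```shell
# Hypothetical helper: rewrite FASTA header lines so that ">19 some description"
# becomes ">chr19". Sequence lines are passed through unchanged.
# Reads the files given as arguments, or stdin if there are none.
add_chr_prefix() {
    awk '
        /^>/ {
            name = substr($1, 2)                       # first field without the ">"
            if (name !~ /^chr/) name = "chr" name      # do not prefix twice
            print ">" name                             # description after the name is dropped
            next
        }
        { print }                                      # sequence lines: copy as-is
    ' "$@"
}

# Example usage (the output file name is an assumption):
#   add_chr_prefix Homo_sapiens.GRCh37.dna.chromosome.19.fa > chr19.renamed.fa
```

After renaming, any index built from the old FASTA (for example the .fai from samtools faidx, or aligner indexes) would need to be rebuilt. Note that a plain "chr" prefix is not enough for every contig in a full genome (Ensembl calls the mitochondrial sequence MT, while UCSC-style references use chrM), but that should not matter for a chromosome 19 only file.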
Thank you for the reply! I tried your command, but I still get an error similar to the one above. Following your answer, though, I got another reference file that does not have the 'chr' prefix. Your reply was very helpful to me. I will try it. Thank you.