Determining The Source Of An Error In Calcbmr
10.6 years ago
DoubleD ▴ 130

If I am getting

ERROR: Bit::Vector::Interval_Fill(): maximum index out of range at /usr/local/share/perl/5.14.2/Genome/Model/Tools/Music/Bmr/CalcBmr.pm line 232, <gen4> line 301403.

Is there a way of telling which file is providing the "out of range" value: the reference, the MAF, or the ROI file? Our BAMs were aligned using the g1k version 37 reference, and our SNP calls were made against the same reference. I am now trying to use the GRCh37.73 reference, which, from my reading, seemed to be similar except for the mitochondrial naming (MT calls were removed from the MAF anyhow) and the unlocalized, unplaced, and alternate loci. If the BAMs and SNPs were processed with one reference, is there no way to change things downstream? (I think I know the answer, but I'm hoping...)

Thank you, DD


Line 232 in CalcBmr.pm fails on line 301403 of your ROI file. Can you post a sample of your ROI file, or take a look at line 301403?
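In case it helps, here is a minimal sketch for pulling a range of lines out of a large file by number. The function name show_lines is made up for illustration, and the path in the usage comment is a placeholder, not your actual file.

```shell
# Print lines START..END of FILE, each prefixed with its line number and a tab.
# show_lines is a hypothetical helper name, not part of MuSiC.
show_lines() {
    # $1 = file, $2 = first line to print, $3 = last line to print
    awk -v s="$2" -v e="$3" '
        NR >= s && NR <= e { printf "%d\t%s\n", NR, $0 }
        NR > e { exit }   # stop reading once past the requested range
    ' "$1"
}

# Example (placeholder path):
# show_lines /path/to/roi_file 301402 301407
```

Exiting as soon as the range is passed avoids reading the rest of a file with hundreds of thousands of lines.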


Lines 301402 - 301407

GL000223.1    173020    173144    AL603926.1
GL000223.1    179011    180456    AL603926.1
GL000223.1    42779    49127    ZNF84
GL000223.1    57229    57328    ZNF84
GL000223.1    57996    58126    ZNF84
GL000223.1    64595    64796    ZNF84

In the index, the relevant line is

GL000223.1    180455    3151408198    60    61

So the extent of the ROI in question is one base too large; perhaps the end should be 180455 (or, since the indexed region is 180455, should it be 1 - 180456?). Making that change fixes the problem! I guess adding a flanking base pair to that ROI made it too big for the index.
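The comparison above can be run over the whole ROI file at once. This is only a sketch under a few assumptions: the ROI file has four whitespace-separated columns (contig, start, end, gene), coordinates are 1-based and inclusive so that the end may not exceed the contig length in column 2 of the .fai index, and the function name check_roi_bounds is made up for illustration.

```shell
# Report ROI rows that fall outside the contig lengths listed in a .fai index.
# check_roi_bounds is a hypothetical helper name, not part of MuSiC.
# Assumes 1-based inclusive ROI coordinates: valid rows have 1 <= start <= end <= length.
check_roi_bounds() {
    # $1 = FASTA index (.fai), $2 = ROI file
    awk '
        # First file (.fai): remember contig name -> length
        NR == FNR { len[$1] = $2 + 0; next }
        # Second file (ROI): contig missing from the index
        !($1 in len) {
            printf "%d\tunknown contig\t%s\n", FNR, $0
            next
        }
        # Second file (ROI): coordinates outside 1..length, or start after end
        ($2 + 0) < 1 || ($3 + 0) > len[$1] || ($2 + 0) > ($3 + 0) {
            printf "%d\tout of range (contig length %d)\t%s\n", FNR, len[$1], $0
        }
    ' "$1" "$2"
}

# Example (placeholder paths):
# check_roi_bounds /path/to/reference.fa.fai /path/to/roi_file
```

Each reported row starts with its line number in the ROI file, which should line up with the line number in the Perl error message if the ROI file is the one being read when it fails. No output means every row fits inside its contig.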

Related question, if I get an error such as:

Use of uninitialized value within %gene_idx in array element at /usr/local/share/perl/5.14.2/Genome/Model/Tools/Music/Bmr/CalcBmr.pm line 299, <GEN6> line 2108.

Does <GEN6> refer to a certain input file? There were 5 in that case (genome music bmr calc-bmr --bam-list=/media/data/bamlist.csv --reference-sequence=/home/registry3/Documents/reference/human_g1k_v37.fasta --roi-file=/home/registry3/Documents/reference/human_g1k_v37_level1_gencode_ROI_for_music --output-dir=/media/data/music_output1/ --maf-file=/media/data/10patients.MAF)

10.6 years ago

Thanks for looking into it. So it seems that when adding 2bp flanks to exons, the script should also make sure the flanked coordinates do not go out of bounds of the FASTA sequences. Here is a slightly more complicated, but safer, script to create an --roi-file for MuSiC:

perl -e '%chrEnd=map{chomp; split(/\t/)}`cut -f 1,2 Homo_sapiens.GRCh37.73.dna.primary_assembly.fa.fai`; map{($c,$s,$e,$g)=split(/\t/); $s--; $e+=2; $s=1 if($s<1); $e=$chrEnd{$c} if($chrEnd{$c} and $e>$chrEnd{$c}); print "$c\t$s\t$e\t$g"}`cat Homo_sapiens.GRCh37.73.merged_exons.bed`' > ensembl73_roi_file_for_music

Where the files Homo_sapiens.GRCh37.73.dna.primary_assembly.fa.fai and Homo_sapiens.GRCh37.73.merged_exons.bed can be generated as described in my answer to Best reference sequence for MuSiC and "bit_test" error.
