Problem viewing BAM files in IGV - samtools sort error?
8.8 years ago

Hi all,

I have a BAM file that doesn't display when I load it into IGV. From searching, I gather the most common cause of this problem is that the chromosome names in the BAM file and the genome file don't match. I have checked, and they do match, but there might be an indexing problem. When I view the header of my sorted BAM file, the @SQ lines appear in the order scaffold_1, scaffold_10, scaffold_11, scaffold_100, scaffold_2, ... (i.e. the way Excel would sort them as text), not scaffold_1, scaffold_2, ..., scaffold_10, scaffold_11, ..., scaffold_100 (i.e. true numerical order).
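For illustration, the difference between a text sort and a numeric sort can be reproduced with plain `sort`. This is only a sketch: the five scaffold names are made-up examples, and a strict byte-wise sort places scaffold_100 ahead of scaffold_11, so it may not match the header order above exactly.

```shell
# Hypothetical scaffold names, used only to illustrate the two orderings.
names='scaffold_1
scaffold_2
scaffold_10
scaffold_11
scaffold_100'

# Byte-wise (lexicographic) sort: "10" and "100" land before "2".
lexicographic_order() {
    printf '%s\n' "$names" | LC_ALL=C sort
}

# Numeric sort on the field after the underscore: 1, 2, 10, 11, 100.
numeric_order() {
    printf '%s\n' "$names" | LC_ALL=C sort -t_ -k2,2n
}

lexicographic_order
echo '---'
numeric_order
```

Either ordering is a valid way to list sequences; what matters is whether the tools involved agree on one of them.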

My question is: could this sort order then give rise to an indexing problem that explains why IGV can't load the BAM file properly?

I think this might be the case, because a colleague has given me another BAM file (same reference genome), where the scaffolds appear in correct numerical order, and that one displays OK in IGV.

If so, how do I fix it? samtools sort doesn't seem to offer many options. There are no extra characters (leading zeros, etc.) in the scaffold names in either of the BAM files or in the .fai file.

Thanks!


I went to the directory with ref.fa in it and did:

samtools faidx ref.fa

When I view ref.fa.fai the chromosomes appear in numerical order.
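One way to check whether the BAM header and the .fai index list the sequences in the same order is to compare the two name lists directly. The sketch below assumes the header text arrives on standard input (for example from `samtools view -H`); the function names and `sorted.bam` are placeholders, not part of any tool.

```shell
# Print sequence names from SAM/BAM header text on stdin, in header order.
# @SQ lines are tab-separated and carry the name in an SN: field.
sq_names() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }'
}

# Print sequence names from a .fai index (first tab-separated column).
fai_names() {
    cut -f1 "$1"
}

# Succeed only if the header on stdin lists the same names,
# in the same order, as the given .fai file.
same_order() {
    [ "$(sq_names)" = "$(fai_names "$1")" ]
}

# Typical use (sorted.bam is a placeholder file name):
#   samtools view -H sorted.bam | same_order ref.fa.fai && echo "order matches"
```

A mismatch here does not prove that IGV will fail, but it does show that the aligner and the FASTA index disagree about sequence order.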

I aligned the reads against the same reference genome FASTA file in gsnap. By the alignment index, do you mean the .bai file?


I have no experience with gsnap, but you could try the following: get the order of the chromosomes as they appear in the BAM file (look at the BAM header), then re-arrange the reference FASTA file to match that order. With the re-arranged reference FASTA file, try creating the index and the .genome file again.
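The re-arranging step could be sketched roughly as follows. This is an assumption-laden illustration, not a tested pipeline: `reorder_fasta`, `order.txt`, and `reordered.fa` are hypothetical names, the awk script holds the whole FASTA in memory, and it keeps only the sequence name from each header line (any description after the first space is dropped).

```shell
# reorder_fasta ORDER_FILE FASTA
# ORDER_FILE: one sequence name per line, in the desired order
# (for example, the SN: values from the BAM header).
# Sequences not named in ORDER_FILE are left out of the output.
reorder_fasta() {
    awk '
        NR == FNR { order[++n] = $1; next }           # first file: wanted order
        /^>/      { name = substr($1, 2); next }      # header: name up to first space
                  { seq[name] = seq[name] $0 "\n" }   # sequence lines, kept as-is
        END {
            for (i = 1; i <= n; i++)
                if (order[i] in seq)
                    printf ">%s\n%s", order[i], seq[order[i]]
        }
    ' "$1" "$2"
}

# Typical use (file names are placeholders):
#   reorder_fasta order.txt ref.fa > reordered.fa
#   samtools faidx reordered.fa
```

For large genomes, passing the names to `samtools faidx ref.fa` as regions is likely a better fit, since it writes the requested sequences in the order given without loading the whole file.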


Does IGV show any warnings or errors when loading the file, or are you inferring that it did not load properly because you can't see anything? You could try the suggestion from the thread "I loaded a BAM (RNA-seq) file into IGV but cant see anything!".


No complaints or warnings - inferring from the fact I can't see anything. Yes, I saw that thread and tried going to a specific location, but still couldn't see anything. It was that thread that gave me the idea there might be something wrong with the index.

Your previous post gave me the idea of rebuilding the genome database for gsnap with a different sort order (there are several options, though it's unclear which is the default). That's running now... I'll see if it works.


OK... it seems to be working now. Thanks!

8.8 years ago
James Ashmore ★ 3.4k

Could you tell me the steps you took to create the .genome file for IGV? I think the order in which the chromosomes appear in the reference genome is the same as the order in which the reads aligned to each chromosome appear in the BAM file. Make sure you create the .genome file from the same reference genome FASTA file that was used to create the alignment index.
