Question: Problem viewing BAM files in IGV - samtools sort error?
harriet.hunt897 (United Kingdom) wrote 3.3 years ago:

Hi all,

I have a BAM file which doesn't display when I load it into IGV. From searching, I gather the most common reason for this error is that the chromosome names in the BAM file and genome file don't match. I have checked, and they do match, but there might be an indexing problem. When I view the header for my sorted BAM file, the SQ lines appear in the order scaffold_1, scaffold_10, scaffold_11, scaffold_100, scaffold_2, ... (i.e. like Excel would sort these), not scaffold_1, scaffold_2, ... scaffold_10, scaffold_11, ..., scaffold_100 (i.e. sorted in true numerical order).
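To make the two orderings concrete, here is a minimal Python sketch that contrasts a plain string sort with a numeric-aware ("natural") sort. The scaffold names are made up for illustration and `natural_key` is a hypothetical helper, not part of samtools or IGV.

```python
import re

def natural_key(name):
    """Split a name into text and digit chunks so digit runs compare as integers."""
    # re.split with a capturing group always yields text at even positions and
    # digit runs at odd positions, so strings and ints are never compared directly.
    return [int(part) if part.isdigit() else part for part in re.split(r"(\d+)", name)]

# Hypothetical scaffold names, deliberately shuffled.
names = ["scaffold_2", "scaffold_100", "scaffold_1", "scaffold_11", "scaffold_10"]

lexicographic = sorted(names)               # plain string comparison
natural = sorted(names, key=natural_key)    # numeric-aware comparison

print(lexicographic)
print(natural)
```

Neither order is "wrong" in itself; what matters for the question is whether the BAM header and the reference agree with each other.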

My question is: could this sort order then give rise to an indexing problem that explains why IGV can't load the BAM file properly?

I think this might be the case, because a colleague has given me another BAM file (same reference genome), where the scaffolds appear in correct numerical order, and that one displays OK in IGV.

And how do I fix this if so? There don't seem to be too many options for samtools sort. There are no extra characters (leading 00's, etc.) in the scaffold names in either of the BAM files, or the .fai file.

Thanks!


I went to the directory with ref.fa in it and did:

samtools faidx ref.fa

When I view ref.fa.fai the chromosomes appear in numerical order.

I aligned the reads with gsnap against the same reference genome FASTA file. By the alignment index, do you mean the .bai file?

written 3.3 years ago by harriet.hunt897

I have no experience with gsnap, but you could try the following: get the order of the chromosomes as they appear in the BAM file (look at the BAM header), then re-arrange the reference FASTA file to match that order. With the re-arranged FASTA file, create the index and the .genome file again.
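The re-arranging step could be sketched in Python roughly as follows. This is only an illustration under assumptions: `read_fasta` and `reorder_fasta` are hypothetical helpers, the tiny FASTA and the name order are made up, the whole file is held in memory, and any description after the sequence name on a `>` line is dropped.

```python
def read_fasta(text):
    """Parse FASTA text into a dict mapping sequence name -> list of sequence lines."""
    records = {}
    name = None
    for line in text.splitlines():
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep only the first word after '>'
            records[name] = []
        elif name is not None and line.strip():
            records[name].append(line.strip())
    return records

def reorder_fasta(text, order):
    """Return FASTA text with the records emitted in the given name order."""
    records = read_fasta(text)
    missing = [n for n in order if n not in records]
    if missing:
        raise KeyError(f"names not found in FASTA: {missing}")
    out = []
    for n in order:
        out.append(f">{n}")
        out.extend(records[n])
    return "\n".join(out) + "\n"

# Made-up reference and a made-up order, as it might be read off a BAM header.
fasta = ">scaffold_1\nACGT\n>scaffold_2\nGGCC\n>scaffold_10\nTTAA\n"
bam_order = ["scaffold_1", "scaffold_10", "scaffold_2"]

reordered = reorder_fasta(fasta, bam_order)
print(reordered)
```

After writing the re-ordered FASTA to disk, the index would need to be rebuilt (for example with `samtools faidx`) before creating the .genome file.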

written 3.3 years ago by James Ashmore

Does IGV complain (any warnings or errors) when loading the file? Or are you inferring that it did not load properly because you can't see anything? Maybe you can try the suggestion from the thread "I loaded a BAM (RNA-seq) file into IGV but cant see anything!".

written 3.3 years ago by h.mon

No complaints or warnings - inferring from the fact I can't see anything. Yes, I saw that thread and tried going to a specific location, but still couldn't see anything. It was that thread that gave me the idea there might be something wrong with the index.

Your previous post gave me the idea of rebuilding the genome database for gsnap with a different sort order (there are several options, though it's unclear which one is the default). That's running now... I'll see if it works.

written 3.3 years ago by harriet.hunt897

OK... it seems to be working now. Thanks!

written 3.3 years ago by harriet.hunt897
James Ashmore (UK/Edinburgh/MRC Centre for Regenerative Medicine) wrote 3.3 years ago:

Could you tell me the steps you took to create the .genome file for IGV? I think the order in which the chromosomes appear in the reference genome is the same order in which the reads aligned to each chromosome appear in the BAM file. You have to make sure you create the .genome file from the same reference genome FASTA file that was used to create the alignment index.
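One way to check whether the BAM and the reference agree is to compare the sequence names in the BAM header's @SQ lines with the first column of the .fai file. The Python sketch below assumes the header text has already been obtained (for instance from `samtools view -H`) and works on plain strings; `sq_names`, `fai_names` and `compare` are hypothetical helpers, and the header and .fai contents are made up.

```python
def sq_names(header_text):
    """Extract sequence names from the SN: field of @SQ header lines, in order."""
    names = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names

def fai_names(fai_text):
    """Extract sequence names (first tab-separated column) from .fai text, in order."""
    return [line.split("\t")[0] for line in fai_text.splitlines() if line.strip()]

def compare(header_text, fai_text):
    """Report whether both inputs list the same names, and in the same order."""
    bam, ref = sq_names(header_text), fai_names(fai_text)
    return {"same_names": set(bam) == set(ref), "same_order": bam == ref}

# Made-up inputs: same three scaffolds, but listed in a different order.
header = (
    "@HD\tVN:1.0\tSO:coordinate\n"
    "@SQ\tSN:scaffold_1\tLN:4\n"
    "@SQ\tSN:scaffold_10\tLN:4\n"
    "@SQ\tSN:scaffold_2\tLN:4\n"
)
fai = (
    "scaffold_1\t4\t12\t4\t5\n"
    "scaffold_2\t4\t29\t4\t5\n"
    "scaffold_10\t4\t46\t4\t5\n"
)

result = compare(header, fai)
print(result)
```

If the names match but the order differs, that points at the kind of mismatch discussed in this thread rather than at a naming problem.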
