Is BAM One-Based or Zero-Based?
3
4
9.9 years ago

Section 1.2 of the SAM specification says that BAM is 0-based, yet some people I know who work with sequencing data in that format say it is 1-based. Who is correct?

bam • 9.1k views
13
9.9 years ago

The spec says that the SAM format is 1-based and that the BAM format is 0-based.

But the latter only matters if you access the file directly. If you access a BAM file via a tool like samtools, which turns BAM into SAM, the coordinates are converted to the 1-based system for you.
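To make the conversion concrete, here is a minimal sketch of the arithmetic involved. The function names are made up for illustration and are not part of samtools or any library.

```python
def bam_to_sam_pos(bam_pos: int) -> int:
    """Convert a 0-based BAM leftmost position to the 1-based SAM POS value."""
    return bam_pos + 1


def sam_to_bam_pos(sam_pos: int) -> int:
    """Convert a 1-based SAM POS value back to the 0-based BAM position."""
    return sam_pos - 1
```

So a read stored at position 0 in the binary file shows up with POS 1 in the text output, and the same genomic base is meant in both cases.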

0

Sneaky! Thanks for the explanation...

0

Until now, I thought BAM was 0-based. But the BAM description in IGV says BAM is 1-based, which is confusing. Given a mapped BAM file from ENCODE, what is the best way to manually confirm whether it is 0-based or 1-based? For example, is the ENCODE CAGE data 0-based or 1-based?

# Download Encode BAM
wget
wget

3

It only needs to be treated as 0-based if you write a tool that opens the binary BAM file directly and extracts the field that contains the coordinate itself. For example, if you read the raw coordinate field through a programming API in Python, Java, or C, you need to treat it as 0-based. In every other case the conversion is done for you: when IGV shows you a BAM file, it has already converted the coordinates to the 1-based system used by SAM.
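As a rough illustration of what "accessing the field directly" means, the sketch below reads the raw `pos` value of the first alignment record using only the Python standard library. It assumes the record layout described in the BAM section of the SAM specification and relies on BGZF blocks being readable as ordinary gzip members. The function name `first_alignment_raw_pos` is hypothetical, and this is not how samtools or any particular library does it.

```python
import gzip
import struct


def first_alignment_raw_pos(bam_path):
    """Return (refID, pos) exactly as stored in the first BAM alignment record.

    The returned pos is the raw 0-based value; SAM POS would be pos + 1.
    """
    # BGZF is a series of gzip members, so gzip.open can decompress the stream.
    with gzip.open(bam_path, "rb") as fh:
        if fh.read(4) != b"BAM\x01":
            raise ValueError("not a BAM file (bad magic string)")
        (l_text,) = struct.unpack("<i", fh.read(4))
        fh.read(l_text)  # skip the plain-text SAM header
        (n_ref,) = struct.unpack("<i", fh.read(4))
        for _ in range(n_ref):
            (l_name,) = struct.unpack("<i", fh.read(4))
            fh.read(l_name + 4)  # skip reference name and its length (l_ref)
        block = fh.read(12)  # block_size, refID, pos (three little-endian int32)
        if len(block) < 12:
            raise ValueError("no alignment records found")
        _block_size, ref_id, pos = struct.unpack("<iii", block)
        return ref_id, pos
```

Comparing the value returned here with the POS column printed by a SAM viewer for the same read should show a difference of exactly one.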

The IGV help file is misleading: the file is not 1-based; what IGV displays is 1-based. Similarly, when you open a BED file in IGV (BED is also 0-based), you will see it drawn with 1-based coordinates. But the file itself is of course still 0-based.

They simply convert all data to the same coordinate system.
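The BED case can be sketched the same way. BED intervals are 0-based with an exclusive end, so showing one as a 1-based inclusive range only shifts the start. The helper name below is made up for illustration and says nothing about how IGV implements this internally.

```python
from typing import Tuple


def bed_to_one_based(chrom_start: int, chrom_end: int) -> Tuple[int, int]:
    """Convert a 0-based, end-exclusive BED interval to 1-based, end-inclusive."""
    # The start shifts by one; the exclusive end already equals the inclusive end.
    return chrom_start + 1, chrom_end
```

For example, the first 100 bases of a chromosome are written as 0 and 100 in BED and displayed as 1 to 100.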

5
9.9 years ago
JC 13k

BAM is 0-based; SAM (which is what we can read on the screen) is 1-based. Maybe that is the source of the confusion.

4
9.9 years ago

As you point out, the SAM spec is unambiguous on this; BAM data are 0-based and SAM data are 1-based (sections 1.2 and 3.2). A feature at base 0 in a BAM file will be at base 1 when the data are exported as SAM.
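Going the other way, a small sketch of recovering the 0-based coordinate from an exported SAM line might look like this. It assumes a standard alignment line with the 11 mandatory tab-separated fields, where RNAME is the third column and POS the fourth. The function name is hypothetical.

```python
from typing import Tuple


def sam_line_to_zero_based(sam_line: str) -> Tuple[str, int]:
    """Return (RNAME, 0-based position) parsed from one SAM alignment line."""
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line has at least 11 tab-separated fields")
    rname = fields[2]
    pos_one_based = int(fields[3])  # POS column, 1-based in SAM text
    return rname, pos_one_based - 1
```

A read printed with POS 1 in SAM therefore maps back to position 0 in the BAM record.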