Is BAM One-Based Or Zero-Based?
3
4
9.2 years ago

Section 1.2 of the SAM specification says that BAM is 0-based. Yet some people I know who work with sequencing data in that format say it is 1-based. Who is correct?

bam • 8.2k views
12
9.2 years ago

The spec says that the SAM format is one-based and that the BAM format is zero-based.

But the latter only matters if you access the file directly. If you access a BAM file via a tool like samtools, which turns BAM into SAM, the coordinates are converted to 1-based.

0

Sneaky! Thanks for the explanation...

0

Until now, I thought BAM was 0-based. But the BAM description in the IGV documentation

http://www.broadinstitute.org/igv/BAM

says BAM is 1-based, which is confusing.

For a mapped BAM file from ENCODE, what is the best way to manually confirm whether the file is 0-based or 1-based? For example, is the ENCODE CAGE data 0-based or 1-based?

# Download ENCODE BAM files
wget http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeRikenCage/wgEncodeRikenCageHelas3CellPapAlnRep1.bam
wget http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeRikenCage/wgEncodeRikenCageHelas3CellPapAlnRep2.bam

3

It only needs to be treated as zero-based if you write a tool that opens the binary BAM file directly and accesses the field that holds the coordinate itself, for example through a programming API in Python, Java or C. In that case you must treat it as 0-based. In every other case the conversion is done for you: when IGV shows you a BAM file, it has already converted the coordinates to the 1-based SAM convention.
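
To make the "direct access" case concrete, here is a minimal sketch in Python using only the standard library. It assumes the fixed-length core layout of a BAM alignment record as I understand it from the SAM specification (the 32 bytes that follow block_size); the record is built in memory for illustration, and the names BAM_CORE and bam_pos_to_sam_pos are hypothetical, not part of any library.

```python
import struct

# Assumed layout of the fixed-length core of a BAM alignment record
# (little-endian, following block_size): refID, pos, l_read_name, mapq,
# bin, n_cigar_op, flag, l_seq, next_refID, next_pos, tlen.
BAM_CORE = struct.Struct("<iiBBHHHIiii")


def bam_pos_to_sam_pos(core_bytes):
    """Return (0-based pos as stored in BAM, 1-based POS as printed in SAM)."""
    fields = BAM_CORE.unpack(core_bytes[:BAM_CORE.size])
    bam_pos = fields[1]          # second field is the leftmost coordinate
    return bam_pos, bam_pos + 1  # SAM text output adds one


# Hypothetical record: a 4-base read aligned to the very first base of
# reference 0. The binary field stores 0; SAM output would show 1.
core = BAM_CORE.pack(0, 0, 6, 60, 4681, 1, 0, 4, -1, -1, 0)
print(bam_pos_to_sam_pos(core))
```

The same arithmetic explains why an unmapped read, stored with pos = -1 in BAM, appears with POS 0 in SAM.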

The IGV help file is misleading: the file is not 1-based; what they display is 1-based. Similarly, when you open a BED file in IGV (BED is also zero-based), it is drawn with 1-based coordinates, but the file itself is of course still zero-based.

They simply convert all data to the same coordinate system.
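
As a rough illustration of that kind of conversion (not IGV's actual code), a BED interval is 0-based and half-open, so showing it in 1-based, closed coordinates means adding one to the start and leaving the end alone. The function name below is hypothetical.

```python
def bed_to_one_based(chrom_start, chrom_end):
    """Convert a 0-based, half-open BED interval to a 1-based, closed one."""
    return chrom_start + 1, chrom_end


# The first 100 bases of a chromosome: BED stores 0..100,
# while a 1-based display shows 1..100.
print(bed_to_one_based(0, 100))  # (1, 100)
```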

5
9.2 years ago
JC 12k

BAM is 0-based; SAM (which is what we read on the screen) is 1-based. Maybe that is the source of the confusion.

4
9.2 years ago

As you point out, the SAM spec is unambiguous on this; BAM data are 0-based and SAM data are 1-based (sections 1.2 and 3.2). A feature at base 0 in a BAM file will be at base 1 when the data are exported as SAM.
