Question: Is BAM One-Based or Zero-Based?
Alex Reynolds (Seattle, WA USA) wrote 8.3 years ago:

Section 1.2 of the SAM specification says that BAM is 0-based. Yet some folks I know who work with sequencing data in that format say it is 1-based. Who is correct?

Istvan Albert (University Park, USA) wrote 8.3 years ago:

The spec says that the SAM format is 1-based and that the BAM format is 0-based.

But the latter only matters if you access the file directly. If you access a BAM file via a tool like samtools, which turns BAM into SAM, the coordinates are converted to the 1-based convention.
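To make the SAM side of this concrete, here is a minimal sketch that reads the POS column from one line of SAM text (the kind of output a tool like samtools prints) and shifts it back to a 0-based value. The function name and the example alignment line are hypothetical, made up purely for illustration.

```python
def sam_pos_to_zero_based(sam_line: str) -> int:
    """Return the 0-based leftmost position from one SAM alignment line."""
    fields = sam_line.rstrip("\n").split("\t")
    pos_one_based = int(fields[3])  # column 4 (POS) is 1-based in SAM text
    return pos_one_based - 1


# Hypothetical alignment line, not taken from any real file.
example = "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII"
print(sam_pos_to_zero_based(example))  # 99
```

So a read that SAM reports at POS 100 corresponds to a stored BAM position of 99.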


Sneaky! Thanks for the explanation...

Reply written 8.3 years ago by Alex Reynolds

So far, I thought BAM was 0-based. But when I look at the BAM description at IGV

http://www.broadinstitute.org/igv/BAM

it says BAM is 1-based, which is confusing.

Given a mapped BAM file from ENCODE, what is the best way to manually confirm whether the file is 0-based or 1-based? For example, is the ENCODE CAGE data 0-based or 1-based?

# Download ENCODE BAM files
wget http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeRikenCage/wgEncodeRikenCageHelas3CellPapAlnRep1.bam
wget http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeRikenCage/wgEncodeRikenCageHelas3CellPapAlnRep2.bam

Reply written 5.7 years ago by Chirag Nepal

It only needs to be treated as 0-based if you write a tool that opens the binary BAM file directly and you access and extract the field that contains the coordinate itself, for example through a programming API in Python, Java, or C. Then you need to regard it as 0-based. In any other case the conversion is done for you: when IGV shows you a BAM file, it has already converted the coordinates to the 1-based convention that SAM uses.
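As a rough illustration of what "accessing the field directly" means, the sketch below unpacks the first three little-endian 32-bit integers of a BAM alignment record (block_size, refID, pos, as laid out in the BAM section of the spec). It assumes the record bytes have already been BGZF-decompressed, and the record used here is synthetic, packed by hand rather than read from a real file. The helper name is hypothetical.

```python
import struct


def read_bam_pos(record: bytes) -> int:
    """Return the raw 0-based pos field of an uncompressed BAM alignment record."""
    # The record starts with three little-endian int32 values:
    # block_size, refID, pos.
    block_size, ref_id, pos = struct.unpack_from("<iii", record, 0)
    return pos


# Synthetic record prefix (assumed values): block_size=32, refID=0, pos=0.
fake_record = struct.pack("<iii", 32, 0, 0)
pos0 = read_bam_pos(fake_record)
print(pos0, pos0 + 1)  # 0 1
```

The first number is what sits in the binary file; the second is what a SAM view of the same record would report.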

The IGV help file is misleading: the file is not 1-based; what they show is 1-based. Similarly, when you open a BED file in IGV (BED is also 0-based), you will see that it is drawn with 1-based coordinates. But the file is of course still 0-based.

They just convert all data onto the same coordinate system.
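The BED case can be sketched the same way. BED stores intervals as 0-based with an exclusive end, so putting them on a 1-based, inclusive coordinate system only shifts the start. The function names below are hypothetical and the intervals are made-up examples.

```python
from typing import Tuple


def bed_to_one_based(chrom_start: int, chrom_end: int) -> Tuple[int, int]:
    """Convert a BED interval (0-based, end-exclusive) to 1-based, end-inclusive."""
    return chrom_start + 1, chrom_end


def one_based_to_bed(start: int, end: int) -> Tuple[int, int]:
    """Convert a 1-based, end-inclusive interval back to BED coordinates."""
    return start - 1, end


# The first 100 bases of a chromosome: BED (0, 100) is bases 1..100.
print(bed_to_one_based(0, 100))  # (1, 100)
```

The interval length is unchanged either way: end minus start in BED terms, end minus start plus one in 1-based terms.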

Reply written 5.7 years ago (modified 5.7 years ago) by Istvan Albert

JC (Mexico) wrote 8.3 years ago:

BAM is 0-based; SAM (which is what we can read on the screen) is 1-based. Maybe that is the reason for the confusion.

biobot 0.0.77.a.1099 (UK) wrote 8.3 years ago:

As you point out, the SAM spec is unambiguous on this; BAM data are 0-based and SAM data are 1-based (sections 1.2 and 3.2). A feature at base 0 in a BAM file will be at base 1 when the data are exported as SAM.
