chromStart - The starting position of the feature in the chromosome or scaffold. The first base in a chromosome is numbered 0.
chromEnd - The ending position of the feature in the chromosome or scaffold. The chromEnd base is not included in the display of the feature. For example, the first 100 bases of a chromosome are defined as chromStart=0, chromEnd=100, and span the bases numbered 0-99.
So BED coordinates differ from GFF3 coordinates, for example? How can I confidently reformat columns of start-stop intervals before extracting coordinates with BEDtools?
I have found it useful to think of BED coordinates as marking the spaces between the bases, rather than the bases themselves. I will try to represent this:
| A | C | G | T | A | C | G | T |
0   1   2   3   4   5   6   7   8
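Python slices happen to use the same between-the-bases numbering, so the diagram can be checked with a small sketch. The sequence and the helper name `bed_slice` are made up for illustration and are not part of BEDtools or any other library.

```python
# Hypothetical 8-base sequence matching the diagram above.
seq = "ACGTACGT"


def bed_slice(sequence, start, end):
    """Return the bases covered by a 0-based, half-open BED-style interval.

    start and end are the boundary positions from the diagram, so the base
    at index `end` is not included.
    """
    return sequence[start:end]


first_base = bed_slice(seq, 0, 1)   # the single base between boundaries 0 and 1
first_four = bed_slice(seq, 0, 4)   # the four bases between boundaries 0 and 4
between = bed_slice(seq, 3, 3)      # start == end selects no bases at all
```

Here `first_base` is "A", `first_four` is "ACGT", and `between` is the empty string, because an interval with equal start and end only names a boundary.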
So if you wanted to describe the first base, it would be start = 0, end = 1.
Another handy thing to note: you should always be able to subtract the start from the end to get the number of bases you are describing. The one exception is insertions, which are the only case where you should have start == stop. This makes sense in this scheme, since you are only calling out a position between two bases, where a bit of sequence has been inserted.
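As for the reformatting question, here is a rough sketch of the conversion, assuming the input is GFF3-style (1-based, both ends inclusive) and the output should be BED-style (0-based, end excluded). The function names are hypothetical, and zero-length features such as insertions are deliberately left out, since each format has its own way of writing them.

```python
def gff3_to_bed(start, end):
    """Convert a 1-based inclusive interval to a 0-based half-open one."""
    if start < 1 or end < start:
        raise ValueError(f"not a valid 1-based inclusive interval: {start}-{end}")
    # Only the start shifts; the inclusive end equals the excluded BED end.
    return start - 1, end


def bed_to_gff3(chrom_start, chrom_end):
    """Convert a 0-based half-open interval back to 1-based inclusive."""
    if chrom_start < 0 or chrom_end <= chrom_start:
        raise ValueError(f"not a non-empty BED interval: {chrom_start}-{chrom_end}")
    return chrom_start + 1, chrom_end


def bed_length(chrom_start, chrom_end):
    """Number of bases covered by a BED interval: end minus start."""
    return chrom_end - chrom_start


def gff3_line_to_bed3(line):
    """Turn one tab-separated GFF3 feature line into a three-column BED line.

    GFF3 keeps the sequence name in column 1, start in column 4 and
    end in column 5.
    """
    fields = line.rstrip("\n").split("\t")
    seqid, start, end = fields[0], int(fields[3]), int(fields[4])
    bed_start, bed_end = gff3_to_bed(start, end)
    return f"{seqid}\t{bed_start}\t{bed_end}"
```

With this, the first 100 bases written GFF3-style as 1-100 become chromStart=0, chromEnd=100, and `bed_length(0, 100)` gives 100, matching the definition quoted in the question.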