Exon range vs. exonStarts in UCSC
7.7 years ago
Max ▴ 140

This is a revised version of an earlier query that may not have been stated very clearly:

I have noticed a mismatch between the exonStarts / exonEnds coordinates and the exon range in the UCSC Genome Browser's annotation of the hg19 human reference genome.

Specifically, the exonStarts and exonEnds coordinates do not match the exon range reported when the sequences are retrieved. Typically, the exonStarts coordinate is 1 nucleotide before the start of the range, as in the example below:

name         chromosome  strand  exonStarts   exonEnds     exonFrame
NM_030806    chr1        +       184559872    184559949    1

whereas the range given in the FASTA header is:
>hg19_refGene_NM_030806_2 range=chr1:184559873-184559949 5'pad=0 3'pad=0 strand=+ repeatMasking=none
GAAAAAAGTGCCAGCTCAAATGTAAGACTTAAAACTAATAAAGAGGTTCCGGGATTAGTTCATCAACCCAGAGCAAA

Usually the mismatch between exonStarts and range is +1 nucleotide, but sometimes it is more than this. What is the reason for the discrepancy between range and exonStarts/exonEnds, and which number is the actual coordinate of the first nucleotide in the exon?

ucsc exon • 2.0k views
7.7 years ago
brentp 23k

The first format uses a 0-based start (https://genome.ucsc.edu/FAQ/FAQformat.html#format9), while the FASTA header shows a 1-based start. With a 0-based start, the first base of the chromosome is numbered 0; with a 1-based start, it is numbered 1.

The end coordinate stays the same because in the 0-based system the end is non-inclusive: the interval does not include the base at the end position (https://genome.ucsc.edu/FAQ/FAQformat.html#format1).

This is because 0-based is nice for programmers and computers and 1-based is nice for "normal people".
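
As a rough illustration of the conversion described above, here is a minimal Python sketch using the coordinates from the question. The function names are hypothetical helpers written for this example, not part of any UCSC tool. It assumes the table values are 0-based with a non-inclusive end and the FASTA header range is 1-based with an inclusive end.

```python
def ucsc_to_one_based(start0, end0):
    """Convert a 0-based, end-exclusive interval to a 1-based, end-inclusive one.

    Only the start shifts by one; the end keeps the same number because an
    exclusive 0-based end and an inclusive 1-based end name the same base.
    """
    return start0 + 1, end0


def one_based_to_ucsc(start1, end1):
    """Inverse conversion: 1-based inclusive back to 0-based end-exclusive."""
    return start1 - 1, end1


# Values taken from the exonStarts / exonEnds columns in the question.
exon_start, exon_end = 184559872, 184559949

start1, end1 = ucsc_to_one_based(exon_start, exon_end)
print(f"chr1:{start1}-{end1}")

# The exon length is the same in both systems; only the formula differs.
length_half_open = exon_end - exon_start  # 0-based, end-exclusive
length_closed = end1 - start1 + 1         # 1-based, end-inclusive
print(length_half_open, length_closed)
```

The printed range should agree with the range in the FASTA header, and both length formulas should give the same number of bases as the sequence shown under that header.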


Thanks.

However, if that were the case, wouldn't both exonStarts and exonEnds be offset by -1 relative to the FASTA coordinates? Instead, exonStarts is offset by -1 relative to the FASTA range, while exonEnds matches.

Also, which of the coordinate systems (0-based or 1-based start) is consistent with Ensembl coordinates?


I updated my answer to include more info about that.


Thanks for the update. I assume that the coordinates in the Ensembl annotation start at 1?
