Question: Exon range vs. exonStarts in UCSC
Max wrote (7.0 years ago):

This is a revised version of an earlier query that may not have been stated very clearly:

I have noticed a mismatch between the exonStarts / exonEnds coordinates and the exon range in the UCSC Genome Browser's annotation of the hg19 human reference genome.

Specifically, the exonStarts and exonEnds coordinates in the table do not match the range shown in the FASTA header when the sequences are retrieved. Typically, the exonStarts coordinate is 1 nucleotide before the start of the range, as in the example below:

name         chromosome  strand  exonStarts  exonEnds   exonFrame
NM_030806    chr1        +       184559872   184559949  1

while the range given in the FASTA header is:
>hg19_refGene_NM_030806_2 range=chr1:184559873-184559949 5'pad=0 3'pad=0 strand=+ repeatMasking=none
GAAAAAAGTGCCAGCTCAAATGTAAGACTTAAAACTAATAAAGAGGTTCCGGGATTAGTTCATCAACCCAGAGCAAA

Usually the mismatch between exonStarts and range is +1 nucleotide, but sometimes it is more than this. What is the reason for the discrepancy between range and exonStarts/exonEnds, and which number is the actual coordinate of the first nucleotide in the exon?

brentp wrote (7.0 years ago):

The table uses a 0-based start (https://genome.ucsc.edu/FAQ/FAQformat.html#format9), while the FASTA header shows a 1-based start. With a 0-based start, the first base of the chromosome is numbered 0; with a 1-based start, it is numbered 1.

The end coordinate stays the same because in the 0-based system the end is non-inclusive: the interval stops just before it (https://genome.ucsc.edu/FAQ/FAQformat.html#format1). In your example, the 0-based interval from 184559872 up to but not including 184559949 and the 1-based range 184559873-184559949 cover the same 77 bases.

Both conventions exist because 0-based is nice for programmers and computers and 1-based is nice for "normal people".
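To make the arithmetic concrete, here is a minimal Python sketch of the conversion, using the coordinates from the question. The function names `ucsc_to_one_based` and `interval_length` are made up for illustration and are not part of any UCSC tool. The sketch assumes the table values are 0-based with a non-inclusive end, as described above.

```python
from typing import Tuple


def ucsc_to_one_based(start0: int, end0: int) -> Tuple[int, int]:
    """Convert a 0-based, end-exclusive interval to a 1-based, end-inclusive one.

    Only the start shifts by one; the end keeps the same number.
    (Hypothetical helper, for illustration only.)
    """
    return start0 + 1, end0


def interval_length(start0: int, end0: int) -> int:
    """Length of a 0-based, end-exclusive interval is simply end - start."""
    return end0 - start0


# Values copied from the question (NM_030806 on chr1, + strand).
exon_start, exon_end = 184559872, 184559949
fasta = ("GAAAAAAGTGCCAGCTCAAATGTAAGACTTAAAACTAATAAAGAGG"
         "TTCCGGGATTAGTTCATCAACCCAGAGCAAA")

start1, end1 = ucsc_to_one_based(exon_start, exon_end)
print(f"chr1:{start1}-{end1}")                 # chr1:184559873-184559949
print(interval_length(exon_start, exon_end))   # 77
print(len(fasta))                              # 77

# Python slices follow the same 0-based, end-exclusive convention:
toy = "ACGTACGT"
print(toy[2:5])  # GTA, i.e. bases 3, 4 and 5 when counting from 1
```

The sequence length matches `exonEnds - exonStarts`, which is one practical reason the end-exclusive form is convenient: no `+ 1` is needed when computing lengths.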


Thanks.

However, if that were the case, wouldn't both exonStarts and exonEnds be offset by -1 with respect to the FASTA coordinates? Instead, exonStarts is offset by -1 with respect to the FASTA range, while exonEnds matches.

Also, which of the coordinate systems (0-based or 1-based start) is consistent with Ensembl coordinates?

(written 7.0 years ago by Max)

I updated my answer to include more info about that.

(written 7.0 years ago by brentp)

Thanks for the update. I assume that the coordinates in the Ensembl annotation start at 1?

(written 7.0 years ago by Max)