7.3 years ago
Mateusz ▴ 70

Hi,

I want to use the gene tables from UCSC to retrieve genes, exons and introns separately. What puzzles me is that the exon frames field contains values ranging from -1 to 2, so 4 'different' registers in total. For instance:

NM_001308203 (...) -1,0,1,2,0,0,1,0,1,2,1,1,1,0,1,1,2,2,0,2,1,1


I've skipped some of the fields for readability.

Say an exon is located at [451:987] on the chromosome with frame 1; then it seems obvious that exonStart:exonEnd will be 452:987. But what happens if the exon frame is -1? Would it then be 450:987? It seems so, but then why doesn't the table simply list exonStart:exonEnd 450:987 with exon frame 0?

I guess it's quite a simple question, but I just want to make sure I completely understand their notation.

7.3 years ago

From the description of the refGene table:

"Exon frame {0,1,2}, or -1 if no frame for exon"

Yeah, it'd be nice if they'd used a "." or NA or something. Presumably that's UTR.
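
For anyone parsing this field, here is a minimal Python sketch that splits an exonFrames value and flags the exons marked -1. The helper name `parse_exon_frames` is made up for illustration, and the input is just the string quoted in the question; the empty-item filter is there because the comma-separated list fields in UCSC tables usually end with a trailing comma.

```python
def parse_exon_frames(field):
    """Split a comma-separated exonFrames value into a list of ints.

    Empty items are skipped so that a trailing comma does not break parsing.
    """
    return [int(item) for item in field.split(",") if item.strip()]


# exonFrames for NM_001308203, as quoted in the question
frames = parse_exon_frames("-1,0,1,2,0,0,1,0,1,2,1,1,1,0,1,1,2,2,0,2,1,1")

# 0-based indices of exons that have no frame (-1), i.e. no coding bases
no_frame_exons = [i for i, frame in enumerate(frames) if frame == -1]

print(len(frames), "exons; exons without a frame:", no_frame_exons)
```

Treating -1 as "missing" rather than as a fourth frame value is the point: it should be filtered out before doing any arithmetic on the frames.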

Thanks! I was going through their website but didn't see it.

7.3 years ago
Martombo ★ 3.0k

That means that the start codon is on the second exon, therefore the first exon contains only the 5'UTR and therefore doesn't really have a "frame".

On GenBank you can see that the CDS starts at position 218, while the first exon extends only to position 104.
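
The same check can be done on the table itself by comparing each exon with cdsStart and cdsEnd. Below is a rough Python sketch; the function name `coding_part` and all coordinates are invented for illustration (they are not the real values for this transcript), and it assumes the usual UCSC convention of 0-based, half-open intervals. Note that the frame value never shifts exonStart or exonEnd: the exon boundaries stay as listed, and an exon with no overlap with the CDS simply gets frame -1.

```python
def coding_part(exon_start, exon_end, cds_start, cds_end):
    """Return the coding portion of an exon as (start, end), or None.

    All coordinates are assumed to be 0-based, half-open genomic positions.
    None means the exon lies entirely in the UTR.
    """
    start = max(exon_start, cds_start)
    end = min(exon_end, cds_end)
    return (start, end) if start < end else None


# Hypothetical transcript: CDS spans 1000-5000 (made-up numbers)
cds_start, cds_end = 1000, 5000

utr_only_exon = coding_part(100, 204, cds_start, cds_end)   # ends before the CDS
mixed_exon = coding_part(900, 1100, cds_start, cds_end)     # contains the start codon

print(utr_only_exon)
print(mixed_exon)
```

An exon for which this returns None is the kind that would be expected to carry -1 in exonFrames.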

7.3 years ago

-1 means that, for this transcript, exon 1 lies entirely in the UTR. The values 0, 1 and 2 represent frames 1, 2 and 3 respectively (UCSC uses 0-based indexing).