Negative Sense Exon Strand Coordinates In UCSC
9.0 years ago
Max ▴ 140

I have a question about how to interpret the coordinates of negative-sense exons in the UCSC genome browser.

Specifically, suppose that I'm given exon coordinates 10-16 and sequence ATGCCT (negative sense). Presumably, even though coordinates 10-16 are the negative sense strand, we're given the start codon in the opposite direction. Does this mean that the actual sequence in positions 10-16 is the reverse complement of ATGCCT?

This is critical for me, because I have VCF files that call a mutation at (say) position 11. Is this a mutation in the "T" of this strand, or in its reverse complement (G as complement to C)?

ucsc • 3.5k views
9.0 years ago
Emily 23k

Usually, this is what it means:

Forward-> NNNNNNNNNNAGGCAT
Reverse-> NNNNNNNNNNTCCGTA


The sequence of the exon is ATGCCT, so that is the sequence on the negative strand, read 5' to 3'. 'Start' and 'end' in this context really mean 'most 5' with respect to the positive strand' and 'most 3' with respect to the positive strand'. So for a negative-strand exon, the coordinate labelled 'start' is actually the end of the exon, and the coordinate labelled 'end' is its start.
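
To make the relationship between the two strands concrete, here is a minimal Python sketch using the toy sequence from the diagram. The helper name `reverse_complement` and the variable names are my own, and the slice assumes the exon occupies the last six bases of the forward strand as drawn; it is an illustration, not something taken from UCSC's tools.

```python
# Complement table for unambiguous DNA bases plus N (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Forward (+) strand from the diagram, written 5'->3': ten Ns, then six bases.
forward = "N" * 10 + "AGGCAT"

# What the genome actually holds at the exon's location on the forward strand.
exon_on_forward = forward[10:16]

# What the annotation reports: the exon read 5'->3' along the negative strand.
exon_sequence = reverse_complement(exon_on_forward)

print(exon_on_forward)  # AGGCAT
print(exon_sequence)    # ATGCCT
```

The reported exon sequence and the forward-strand bases are reverse complements of each other, which is why the start codon ATG appears at the high-coordinate end of the interval.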

In terms of looking at the effect of mutations, you're better off using a tool like the VEP, which will tell you which exons are affected and how. Saves a lot of manual effort on your part.


Thank you for the reply. UCSC's annotation is rather muddled. I already have a lot of scripts written that "commit" me to doing it the long way, so if I understand you correctly, in my example, a mutation at position X in a VCF refers to position X on the forward strand, so to reconstruct the CDS I need the reverse complement. Correct?

In other words, if I'm given coordinates 10-15 for ATGCCT, position 11 corresponds to the G on the forward strand (the reverse complement of the exon sequence), not to the T of ATGCCT (nor its complementary A).


Yes, you're right.
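
For anyone scripting this by hand, the lookup confirmed above could be sketched roughly as follows. The function `negative_strand_lookup` is a hypothetical helper, and it assumes the exon coordinates are 1-based and inclusive on the forward strand (10-15 in the example), matching the 1-based POS column of a VCF. UCSC table downloads use 0-based starts, so those would need converting first.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def negative_strand_lookup(exon_seq, exon_start, exon_end, vcf_pos):
    """Locate a forward-strand position inside a negative-strand exon.

    exon_seq is the exon read 5'->3' on the negative strand.
    exon_start, exon_end and vcf_pos are assumed to be 1-based, inclusive,
    forward-strand coordinates.
    Returns (0-based index into exon_seq, exon base, forward-strand base).
    """
    if len(exon_seq) != exon_end - exon_start + 1:
        raise ValueError("exon_seq length does not match the coordinates")
    if not exon_start <= vcf_pos <= exon_end:
        raise ValueError("vcf_pos lies outside the exon")
    # The exon's first base sits at the highest forward coordinate,
    # so the index counts down from exon_end.
    index = exon_end - vcf_pos
    exon_base = exon_seq[index]
    return index, exon_base, COMPLEMENT[exon_base]


print(negative_strand_lookup("ATGCCT", 10, 15, 11))  # (4, 'C', 'G')
```

So position 11 is the G on the forward strand, which pairs with the second-to-last base (C) of the exon sequence ATGCCT. The REF and ALT alleles in the VCF are forward-strand bases, so they need complementing before being placed into the CDS.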