Detailed Explanation Of UCSC Orientation / Coordinate Conversion For Minus-Strand Sequences?
10.3 years ago
user ▴ 940

Can someone point me to a detailed explanation of how the UCSC Genome Browser interprets coordinates for, and displays, minus-strand sequences? It's very confusing. For example, this genomic GFF coordinate from the mm9 genome:

chr1:3206080-3206102:-

pulls out the sequence:

>seq1
AATACAAGGAGCCGCAATGTCCA

which is correct. However, in the genome browser it appears as the reverse complement, going from right to left:

>seq2
TGGACATTGCGGCTCCTTGTATT

Questions are:

  1. Why does UCSC reverse complement it and display it right to left in this way? What is the logic behind this? Is there a way to reverse it, so that the sequence displayed is the sequence pulled from the genome (i.e. seq1)?
  2. I thought UCSC was always BED-coordinate based, which is a 0-based coordinate system, not 1-based. In that case I would have expected the sequence determined by chr1:3206080-3206102 (meant to be a GFF coordinate) to be one base shorter than the sequence I get from UCSC, i.e. chr1:3206080-3206102:- in BED should have its start reduced by 1, yielding chr1:3206079-3206102:-

Any intuitive explanation of how to think about UCSC's orientation display choices, and how they connect to sequences and their reverse complements on the minus strand (and where/when BED versus GFF conventions are used), would be very helpful.
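The relationship between the two sequences quoted in the question can be checked directly. The snippet below is a minimal sketch in plain Python (the helper name `reverse_complement` is just for illustration, not part of any UCSC tool) confirming that seq2 is the reverse complement of seq1, i.e. seq2 is the plus-strand sequence of the same interval read left to right.

```python
# Translation table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (illustrative helper)."""
    return seq.translate(_COMPLEMENT)[::-1]


seq1 = "AATACAAGGAGCCGCAATGTCCA"  # minus-strand sequence pulled from the genome
seq2 = "TGGACATTGCGGCTCCTTGTATT"  # sequence as displayed in the browser

# True if the browser is showing the plus strand of the same interval.
print(reverse_complement(seq1) == seq2)
```

Both sequences are 23 bases long, which matches 3206102 - 3206080 + 1 for a 1-based, inclusive interval.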

strand dna alignment genome-browser • 6.7k views
10.3 years ago
Mary 11k

Huh. Yeah, that's not well documented, you are right. I looked in a few places where I expected to see it, but can't find a place to give you a handy link on the viewer reverse strategies. I'll tell you what I know, but that is not the official word and they may have better details about the underlying decisions. And it may just be that I haven't had enough coffee to find it in the documentation this morning.

On the graphical display: Yes, it displays the "upper" strand 5'-3' by default. There are 2 ways to flip it. One is an old way and slightly hard to find. The other is the "reverse" button among the mid-page buttons. That's newer and better, I think, because I really like the way it moves the track labels to the other side to remind you that you have flipped the graphical view. But I'll show you both on the image below.

Old way: when you turn on the "base position" track to "full", and you are zoomed in enough to see the 3 frame translation, you will also have access to a little teeny arrow. Clicking that arrow will flip the display.

New way: click that mid-page "reverse" button. The flipped view isn't shown below, but you'll see it immediately if you just click that.

How to flip the UCSC graphical display

On the 0-1 based coordinates, welcome to the club, you are now officially one of us. This bites everyone in the a** at some point in their bioinformatics career. Here's their documentation on that: http://genome.ucsc.edu/FAQ/FAQtracks.html#tracks1
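To make the off-by-one concrete, here is a small sketch of the arithmetic. It assumes what I understand the linked FAQ to say: positions typed into or shown by the browser are 1-based and fully closed (like GFF), while BED files and the underlying tables are 0-based and half-open. The function names are hypothetical helpers, not UCSC utilities.

```python
def gff_to_bed(start, end):
    """1-based inclusive (GFF / browser position box) -> 0-based half-open (BED)."""
    return start - 1, end


def bed_to_gff(start, end):
    """0-based half-open (BED) -> 1-based inclusive (GFF / browser position box)."""
    return start + 1, end


gff_start, gff_end = 3206080, 3206102          # interval from the question
bed_start, bed_end = gff_to_bed(gff_start, gff_end)

gff_length = gff_end - gff_start + 1            # inclusive interval
bed_length = bed_end - bed_start                # half-open interval

print(bed_start, bed_end)       # BED form of the same interval
print(gff_length, bed_length)   # both describe the same number of bases
```

Only the start changes between the two conventions, and the interval covers the same bases either way, so chr1:3206080-3206102 in the position box and chr1 3206079 3206102 in a BED file should give a sequence of the same length.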

So part of the answer is just: that's the way it is. But there are tricks. Does that help?


I found the button, but it's still confusing for the task I'm trying to accomplish. I want to look for the presence of a motif in a genomic region (say, a 5' UTR). For the minus strand, I want to detect the motif in the correct orientation and make a BED track, loaded into the UCSC browser, that shows where the motif is. Normally, when I pull minus-strand genomic sequences from the genome, I use bedtools and tell it to reverse complement the minus-strand sequences. Then I look for the motif. I want that to match up with what UCSC shows. So I guess I should load the BED track and then reverse the display?
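One way to think about this workflow is sketched below, under some assumptions: the region is given in BED coordinates, the sequence was extracted strand-aware (for example with `bedtools getfasta -s`, so minus-strand regions are already reverse complemented), and the motif is a plain string. The function name `motif_hits_to_bed` and the motif `GGAGCC` are made up for illustration. The key step is that an offset in a minus-strand sequence counts back from the region's end, not forward from its start.

```python
def motif_hits_to_bed(chrom, start, end, strand, seq, motif, name="motif"):
    """Map motif hits in a strand-aware sequence back to genomic BED lines.

    start/end are 0-based half-open BED coordinates of the region;
    seq is the region's sequence as read on `strand` (5' to 3').
    """
    lines = []
    i = seq.find(motif)
    while i != -1:
        if strand == "-":
            # Offset i is measured from the region's right-hand (genomic end) side.
            hit_end = end - i
            hit_start = hit_end - len(motif)
        else:
            hit_start = start + i
            hit_end = hit_start + len(motif)
        lines.append(f"{chrom}\t{hit_start}\t{hit_end}\t{name}\t0\t{strand}")
        i = seq.find(motif, i + 1)  # step by one so overlapping hits are kept
    return lines


# Region and sequence from the question, in BED coordinates.
hits = motif_hits_to_bed("chr1", 3206079, 3206102, "-",
                         "AATACAAGGAGCCGCAATGTCCA", "GGAGCC")
print("\n".join(hits))
```

The resulting BED coordinates are ordinary genomic positions, so the track should land on the same bases whether or not the display is reversed; flipping the view would only make the bases read in the same direction as the extracted minus-strand sequence.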
