Question: What is the first exon in negative strand genes?
4.6 years ago by
United Kingdom
Zakaria Benmounah50 wrote:

Hello Biostars!

I would like to extract locations (starting point and ending point) for some regions characterizing genes.

Using Biomart, for example, I can extract for a gene, Gene_Start_(bp), Gene_End_(bp), Transcription_Start_Site_(TSS), Exon_Chr_Start_(bp), Exon_Chr_End_(bp), 5'_UTR_Start, 5'_UTR_End, 3'_UTR_Start, 3'_UTR_End.
In Biomart, by convention, locations grow from left to right.

For a gene on the positive strand, it is quite trivial to find the first exon, and to compute the gene body (from the end of the first exon to the beginning of the rightmost 3'_UTR_Start).
But how to deal with genes on the negative strand?


I know that in nature, genes on the negative strand are transcribed from right to left. What should I consider as the first exon for a gene on the negative strand?

cheers

4.6 years ago by
Istvan Albert ♦♦ 82k
University Park, USA
Istvan Albert ♦♦ 82k wrote:

Be careful not to confuse the coordinate system with the transcription direction. In "nature" all genes are transcribed in the same direction (5' to 3') relative to the template strand. Nature does not know whether it is using the positive strand or the negative strand; it is all the same.

But since the coordinate system is relative to the positive strand, the data obtained for the negative strand is often flipped. I say "often" because the jury is still out on what one ought to get back when querying a database for "Gene Start". Should it be the leftmost coordinate of the gene (which is the Gene Start for genes on the + strand and the Gene End for genes on the - strand), or should it be the actual 5' end of the gene?

Basically you need to verify what your query returns. You may or may not need to reverse that order, depending on what the query does.
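
One way to run that check is sketched below. It is only an illustration: the function name `coordinate_convention` and its return labels are made up here, and the two sample rows are the Biomart rows posted further down in this thread. It assumes strand is encoded as 1 / -1.

```python
def coordinate_convention(records):
    """Guess how a result set reports start/end for minus-strand features.

    records: iterable of (start, end, strand) tuples, strand being 1 or -1.
    Returns "leftmost", "five_prime" or "undetermined" (hypothetical labels).
    """
    # Only minus-strand features with start != end tell the conventions apart.
    minus = [(s, e) for s, e, strand in records if strand == -1 and s != e]
    if not minus:
        return "undetermined"
    if all(s < e for s, e in minus):
        # 'start' is the smaller coordinate even on the minus strand.
        return "leftmost"
    if all(s > e for s, e in minus):
        # 'start' is the 5' end, so it is the larger coordinate on the minus strand.
        return "five_prime"
    return "undetermined"  # mixed rows: inspect the data by hand


rows = [(23708313, 23708703, 1), (23726725, 23726825, -1)]
print(coordinate_convention(rows))  # leftmost
```

If the result is "leftmost", the 5' end of a minus-strand feature is its 'end' column; if it is "five_prime", no swapping is needed.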


True. In my experience the genome browser returns the actual start of the exon, which I think is the correct 5' end. Quick question: what do you mean by "In "nature" all genes are transcribed in the same direction (5' to 3') relative to the template strand"?

written 4.6 years ago by RamRS 25k

I just quoted the words the original poster used, "in nature", trying to emphasize that when transcription takes place there is no left and right. That only comes from what we chose as the coordinate system.

(But then calling it just 5' and 3' can in turn also be confusing, as now one needs to state 5' of what? The RNA is produced 5' -> 3', but the polymerase traverses the template 3' to 5'.)

written 4.6 years ago by Istvan Albert ♦♦ 82k

Thank you - I did not know that RNA polymerase transcribes in a strand-insensitive manner. How is the information on the translation start sites passed on to ribosomes?

written 4.6 years ago by RamRS 25k

Wait, that is not what I meant to say! The transcribed sequence is always produced 5' to 3'. What it does not do is go left to right or right to left; that is all I meant.

If one were to obtain the sequence from the reverse strand then one would not need to go "backwards". It is only when we have a coordinate system relative to the forward strand that we need to keep track of reversing it.
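
A minimal sketch of that bookkeeping, assuming 1-based inclusive forward-strand coordinates with start <= end. The helper names and the toy sequence are invented for illustration and are not from any particular library:

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def feature_sequence(forward_seq, start, end, strand):
    """Return a feature's sequence written 5' to 3'.

    start/end are 1-based, inclusive, forward-strand coordinates (start <= end).
    """
    sub = forward_seq[start - 1:end]
    # A minus-strand feature is the same stretch read from the other strand,
    # so the only extra step is the reverse complement.
    return sub if strand == 1 else reverse_complement(sub)


toy = "AAACCCGGGT"  # made-up forward-strand sequence
print(feature_sequence(toy, 2, 5, 1))   # AACC
print(feature_sequence(toy, 2, 5, -1))  # GGTT
```

The coordinates passed in are identical for both strands; only the final step differs.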

written 4.6 years ago by Istvan Albert ♦♦ 82k

OK, so the polymerase always produces the transcript 5' to 3' from the relevant template strand (exon 1 to exon n). Our coordinate system is based on one strand, and the confusion comes from adapting indexes that are defined on the forward strand.

written 4.6 years ago by RamRS 25k

Many thanks for your answers,

To be more precise, this is the situation. Given these two genes:

chromosome  hgnc     start     end       strand
chr13       HMGA1P6  23708313  23708703   1
chr13       RNY3P4   23726725  23726825  -1

 

The question is: in which direction are these genes transcribed (from the start to the end for +1, and from the end to the start for -1)?

PS: the data was retrieved from Biomart.

Cheers!

written 4.6 years ago by Zakaria Benmounah 50

strand = 1 is transcribed from 'start' to 'end'. strand=-1 is transcribed from 'end' to 'start'
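
Applied to the two genes above, the rule could be sketched like this. The function name `transcription_span` is hypothetical, and the sketch assumes the 'start' column is always the smaller coordinate, as it is in the posted rows:

```python
def transcription_span(start, end, strand):
    """Return (first transcribed base, last transcribed base).

    Assumes start <= end in forward-strand coordinates and strand is 1 or -1.
    """
    if strand == 1:
        return start, end
    if strand == -1:
        return end, start
    raise ValueError(f"unexpected strand value: {strand!r}")


genes = [
    ("HMGA1P6", 23708313, 23708703, 1),
    ("RNY3P4", 23726725, 23726825, -1),
]
for name, start, end, strand in genes:
    first, last = transcription_span(start, end, strand)
    print(name, first, "->", last)
# HMGA1P6 23708313 -> 23708703
# RNY3P4 23726825 -> 23726725
```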

written 4.6 years ago by RamRS 25k
4.6 years ago by
RamRS25k
Houston, TX
RamRS25k wrote:

Exon 1 is always the exon that is transcribed first (the most 5'). This makes a lot of calculations tricky, but IIRC, the exon sequences themselves are usually retrieved in the right orientation (read right to left for negative-strand genes). You'll only need to manipulate the coordinates for any calculation.
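
As a rough sketch of the coordinate manipulation, exons can be numbered in transcription order by sorting on the forward-strand start and reversing the order for the negative strand. The function name and the exon coordinates below are made up, and the sketch assumes non-overlapping exons of a single transcript with start <= end:

```python
def number_exons(exons, strand):
    """Number exons in transcription order.

    exons: list of (start, end) forward-strand coordinates, start <= end.
    strand: 1 or -1.
    Returns a list of (exon_number, start, end), where exon 1 is the most 5'.
    """
    # Plus strand: leftmost exon first. Minus strand: rightmost exon first.
    ordered = sorted(exons, key=lambda exon: exon[0], reverse=(strand == -1))
    return [(i, s, e) for i, (s, e) in enumerate(ordered, start=1)]


toy_exons = [(400, 500), (100, 200), (700, 800)]  # hypothetical coordinates
print(number_exons(toy_exons, 1)[0])   # (1, 100, 200)
print(number_exons(toy_exons, -1)[0])  # (1, 700, 800)
```

So for a negative-strand gene the first exon is the rightmost one, and its 5' end is its larger coordinate.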
