I would like to extract locations (starting point and ending point) for some regions characterizing genes.
With Biomart, for example, I can extract the following attributes for a gene: Gene_Start_(bp), Gene_End_(bp), Transcription_Start_Site_(TSS), Exon_Chr_Start_(bp), Exon_Chr_End_(bp), 5'_UTR_Start, 5'_UTR_End, 3'_UTR_Start, 3'_UTR_End.
In Biomart, by convention, coordinates increase from left to right.
For a gene on the positive strand, it is straightforward to find the first exon (the leftmost one) and to compute the gene body (from the end of the first exon to the rightmost 3'_UTR_Start).
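
To make the positive-strand case concrete, here is a minimal Python sketch of what I mean. It assumes the exon coordinates and 3' UTR starts have already been pulled from Biomart into plain Python lists; the function name `gene_body_plus_strand` and the coordinates are made up for illustration, and the "gene body" definition is just the one I described above.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), with start <= end

def gene_body_plus_strand(
    exons: List[Interval],
    three_utr_starts: List[int],
) -> Tuple[Interval, Interval]:
    """Return (first_exon, gene_body) for a gene on the positive strand.

    Hypothetical helper: the first exon is taken as the leftmost one, and
    the gene body runs from the end of that exon to the rightmost
    3' UTR start.
    """
    # Leftmost exon = smallest Exon_Chr_Start_(bp)
    first_exon = min(exons, key=lambda e: e[0])
    body_start = first_exon[1]          # end of the first exon
    body_end = max(three_utr_starts)    # rightmost 3'_UTR_Start
    return first_exon, (body_start, body_end)

# Made-up coordinates, for illustration only
exons = [(1500, 1700), (1000, 1200), (2000, 2600)]
three_utr_starts = [2300]
first_exon, body = gene_body_plus_strand(exons, three_utr_starts)
print(first_exon, body)
```

This only covers the positive strand; it does not handle the negative-strand case I am asking about.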
But how should I deal with genes on the negative strand?
I know that in nature, genes on the negative strand are transcribed from right to left. What should I consider as the first exon for a gene on the negative strand?