Question: Negative Strand Coordinates in Fasta Files?
3.7 years ago
United States
Hi, I've obtained the Arabidopsis gene list in FASTA format from Ensembl, and I was wondering whether the coordinates of genes on the negative strand, such as (I think) this one:

>ATMG01410.1 cdna:known chromosome:TAIR10:Mt:366086:366700:-1 gene:ATMG01410 transcript:ATMG01410.1 description:"Uncharacterized mitochondrial protein AtMg01410"

have to be converted to positive-strand coordinates, or are all coordinates already given relative to the positive strand? I assume the coordinates here are 366086:366700. Do I have to do anything with them, or can they simply be plugged straight into a genome browser?


3.7 years ago
United States
Usually positions are relative to the positive strand (or at least they are in most mouse databases; I had the same question as you a while ago). You can validate this by extracting the sequence at those coordinates and seeing whether or not you need to reverse-complement it to get the correct sequence.
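A minimal Python sketch of that check. It assumes the header's location field has the form coord_system:assembly:seq_region:start:end:strand, that start and end are 1-based and inclusive, and that the chromosome sequence has already been loaded as a plain string. The function names and the toy sequence are made up for illustration.

```python
def parse_ensembl_location(header):
    """Return (seq_region, start, end, strand) from an Ensembl-style FASTA header."""
    for field in header.lstrip(">").split():
        parts = field.split(":")
        # The location field looks like chromosome:TAIR10:Mt:366086:366700:-1
        if len(parts) == 6 and parts[-1] in ("1", "-1"):
            return parts[2], int(parts[3]), int(parts[4]), int(parts[5])
    raise ValueError("no location field found in header")


_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_feature(chrom_seq, start, end, strand):
    """Slice forward-strand coordinates (assumed 1-based, inclusive).

    The slice is reverse-complemented when the feature is on the -1 strand.
    """
    region = chrom_seq[start - 1:end]
    return reverse_complement(region) if strand == -1 else region


header = ('>ATMG01410.1 cdna:known chromosome:TAIR10:Mt:366086:366700:-1 '
          'gene:ATMG01410 transcript:ATMG01410.1 '
          'description:"Uncharacterized mitochondrial protein AtMg01410"')
print(parse_ensembl_location(header))

# Toy stand-in for a chromosome, not real Arabidopsis sequence.
toy_chrom = "AACCGTTTAC"
print(extract_feature(toy_chrom, 3, 6, 1))
print(extract_feature(toy_chrom, 3, 6, -1))
```

If the string returned for strand -1 matches the sequence in the FASTA record, the coordinates are forward-strand coordinates and can go into a genome browser as they are. For a spliced transcript the cDNA will not equal the contiguous genomic span, so it is easier to run this check on an unspliced gene.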

3.7 years ago
Coordinates are usually given relative to one strand only. You can use them to view the region, and browsers generally have an option to reverse the sequence being displayed, which should give you coordinates that decrease rather than increase as you move left to right. You'll still need to reverse-complement the sequence in the region once you extract it, though (if you're going from coordinates to sequence).

While for most operations you'll need the reverse complement of the reference sequence, if you're trying to represent a variant you'd have to go with the base on the actual reference strand.
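As a rough illustration of the variant case, here is a Python sketch that maps a position counted along a negative-strand feature back to a reference coordinate and converts an allele read off the transcript into the reference-strand base. It assumes an unspliced feature with 1-based, inclusive start and end on the reference strand. Both helper names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def feature_to_reference_pos(offset, start, end, strand):
    """Map a 1-based offset within an unspliced feature to a reference coordinate."""
    if not 1 <= offset <= end - start + 1:
        raise ValueError("offset lies outside the feature")
    # On the -1 strand the first transcript base sits at the feature's end coordinate.
    return start + offset - 1 if strand == 1 else end - offset + 1


def reference_strand_allele(allele, strand):
    """Convert an allele read on the feature's strand to the reference strand."""
    if strand == 1:
        return allele
    return allele.translate(_COMPLEMENT)[::-1]


# A G>A change seen at the first base of the negative-strand feature above.
pos = feature_to_reference_pos(1, 366086, 366700, -1)
ref = reference_strand_allele("G", -1)
alt = reference_strand_allele("A", -1)
print(pos, ref, alt)
```

So a change that reads as G>A in the transcript would be written as C>T at the reference coordinate.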

