Question: Python Framework For Converting Genomic To Protein Coordinates
7.8 years ago by
United States
user850 wrote:

Is there a framework for converting between genomic coordinates and protein coordinates, given a transcript (i.e. a list of exon coordinates)? The goal is to go from CDS coordinates to amino-acid coordinates.

The trick is to do this correctly even when the transcript is on the minus strand, in which case the highest coordinate (not the lowest) corresponds to the first amino acid.

I saw the BioPython-related frameworks (like this and this), but I'd prefer not to rely on all of BioPython just for the coordinate transform. I'm also not sure how BioPython handles strandedness.

Apparently PyGr can do this, but only with ORF-containing transcripts. I've never seen an example, and I cannot see from the documentation how it is done.

Any pointers to frameworks that do this correctly, or to examples, would be helpful.

7.8 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum (129k) wrote:

For the protein->genomic direction, this can be done using the UCSC knownGene database. See this Java code:
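The Java code itself is not reproduced here. As a rough Python sketch of the same direction (not a port of that code), the idea is to list the coding bases in transcription order and then slice out the codon. The function names are made up for illustration. The sketch assumes UCSC-style coordinates: 0-based, half-open exon intervals, with `cds_start`/`cds_end` given in genomic order regardless of strand.

```python
def coding_positions(exons, cds_start, cds_end, strand):
    """Return the genomic positions (0-based) of all CDS bases,
    ordered from the first coding base to the last as transcribed.

    Assumption: exons are (start, end) pairs, 0-based and half-open,
    and cds_start < cds_end are genomic coordinates (as in knownGene).
    """
    positions = []
    for start, end in sorted(exons):
        # Clip each exon to the coding region; UTR-only exons give an empty range.
        lo, hi = max(start, cds_start), min(end, cds_end)
        positions.extend(range(lo, hi))
    if strand == "-":
        # On the minus strand the highest coordinate is translated first.
        positions.reverse()
    return positions


def codon_to_genomic(exons, cds_start, cds_end, strand, aa_number):
    """Genomic positions of the three bases encoding amino acid aa_number (1-based)."""
    positions = coding_positions(exons, cds_start, cds_end, strand)
    first = (aa_number - 1) * 3
    if aa_number < 1 or first + 3 > len(positions):
        raise ValueError("amino acid %d is outside the CDS" % aa_number)
    return positions[first:first + 3]


# Toy transcript: two exons, 12 coding bases (4 codons).
exons = [(100, 110), (200, 210)]
print(codon_to_genomic(exons, 105, 207, "+", 2))  # [108, 109, 200]
print(codon_to_genomic(exons, 105, 207, "-", 1))  # [206, 205, 204]
```

Note that a codon can span a splice junction, which is why three positions are returned rather than a single interval. Building the full position list is simple but memory-hungry for long transcripts; walking the exons with a running offset avoids that.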

For the genomic->protein direction, see this previous post: How to calculate the protein change and codon position within a nucleotide sequence of a single nucleotide substitution?
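For the genomic->protein direction, a minimal Python sketch (not taken from the linked post) is to walk the coding parts of the exons in transcription order, keeping a running CDS offset, and then divide by three. The function names are hypothetical, and the sketch assumes 0-based, half-open exon intervals with `cds_start`/`cds_end` in genomic order, as in UCSC tables.

```python
def genomic_to_cds_offset(exons, cds_start, cds_end, strand, pos):
    """Return the 0-based offset of genomic position pos within the CDS,
    or None if pos is intronic, in a UTR, or outside the transcript.

    Assumption: exons are (start, end) pairs, 0-based and half-open.
    """
    # Clip exons to the coding region and drop exons with no coding bases.
    segments = [(max(s, cds_start), min(e, cds_end)) for s, e in sorted(exons)]
    segments = [(s, e) for s, e in segments if s < e]
    if strand == "-":
        # Minus strand: the last exon in genomic order is transcribed first.
        segments.reverse()

    offset = 0
    for s, e in segments:
        if s <= pos < e:
            # Within a segment, count from the left on '+', from the right on '-'.
            within = pos - s if strand == "+" else e - 1 - pos
            return offset + within
        offset += e - s
    return None


def genomic_to_protein(exons, cds_start, cds_end, strand, pos):
    """Return (amino_acid_number, position_in_codon), both 1-based, or None."""
    offset = genomic_to_cds_offset(exons, cds_start, cds_end, strand, pos)
    if offset is None:
        return None
    return offset // 3 + 1, offset % 3 + 1


# Toy transcript: two exons, 12 coding bases (4 codons).
exons = [(100, 110), (200, 210)]
print(genomic_to_protein(exons, 105, 207, "+", 200))  # (2, 3)
print(genomic_to_protein(exons, 105, 207, "-", 200))  # (3, 1)
print(genomic_to_protein(exons, 105, 207, "+", 150))  # None (intron)
```

The only strand-specific steps are reversing the segment order and counting from the right-hand end of each segment, which is what makes the highest coordinate map to the first amino acid on the minus strand.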
