I am using the Biopython library (Bio) to retrieve sequences from Entrez. I need to know the end position of the 5' UTR and the start position of the 3' UTR. How do I get them?
Are you fetching the records in GenBank format? If so, you can parse the features to identify the UTR entries (feature keys 5'UTR and 3'UTR).
Can you point me to an example or a tutorial showing how to get them? It is driving me mad.
The following code can be modified to check for UTR features instead of CDS, for example:
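from Bio import SeqIO

# Parse every record in a GenBank file
records = SeqIO.parse("sequence.gb", "genbank")
for seq_record in records:
    for feature in seq_record.features:
        # Keep only CDS features whose product mentions methyltransferase
        if feature.type == "CDS":
            if "methyltransferase" in str(feature.qualifiers.get("product", "")):
                print(seq_record.id, feature.location)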
See the Biopython tutorial for more: http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc130
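For the UTR case specifically, here is a rough sketch rather than a definitive solution. It assumes an mRNA-style GenBank record on the forward strand with a single CDS. Many records do not annotate 5'UTR/3'UTR features explicitly, so the sketch falls back to the CDS boundaries when they are missing. The function names utr_boundaries and report_utr_boundaries are made up for this example. Biopython locations are 0-based and end-exclusive, so the values are converted to the 1-based coordinates shown in the GenBank file.

```python
def utr_boundaries(record):
    """Return (five_prime_utr_end, three_prime_utr_start) as 1-based positions.

    Assumption: forward-strand, mRNA-style record with at most one CDS.
    Either value is None when the record has no UTR on that side.
    """
    five_end = None
    three_start = None
    cds = None
    for feature in record.features:
        if feature.type == "5'UTR":
            # 0-based exclusive end equals the 1-based inclusive end
            five_end = int(feature.location.end)
        elif feature.type == "3'UTR":
            # 0-based start + 1 gives the 1-based start
            three_start = int(feature.location.start) + 1
        elif feature.type == "CDS" and cds is None:
            cds = feature

    # Fallback: infer the UTRs from whatever lies outside the CDS
    if cds is not None:
        cds_start = int(cds.location.start)
        cds_end = int(cds.location.end)
        if five_end is None and cds_start > 0:
            five_end = cds_start
        if three_start is None and cds_end < len(record.seq):
            three_start = cds_end + 1
    return five_end, three_start


def report_utr_boundaries(path):
    """Print the UTR boundaries for every record in a GenBank file."""
    # Imported here so utr_boundaries() itself does not require Biopython
    from Bio import SeqIO

    for record in SeqIO.parse(path, "genbank"):
        five_end, three_start = utr_boundaries(record)
        print(record.id, five_end, three_start)
```

Calling report_utr_boundaries("sequence.gb") prints one line per record. For genomic records or genes on the reverse strand, the strand and any joined locations would need extra handling that this sketch does not attempt.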