Using efetch (Biopython) to get the FASTA file of an mRNA, as well as the exon locations
5.2 years ago
Jacob ▴ 10

I want to use one of the NCBI E-utilities to obtain the mRNA sequence of a gene in FASTA format. First, I get the UID from the Gene database with the code below:

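from Bio import Entrez
Entrez.email = "you@example.com"  # placeholder; NCBI asks for a contact address
# esearch returns XML with a list of UIDs, so no rettype is needed here
handle = Entrez.esearch(db="gene", term="Acan[Gene Name] AND Homo sapiens[Organism]")
record = Entrez.read(handle)
handle.close()
id = record["IdList"]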

Then I would like to use this UID to get the FASTA sequence of the gene's mRNA. I would also like to obtain the bp positions of the exons in this gene and the bp positions of the longest ORF. This is how I'm trying to use efetch:

Entrez.efetch(db="nucleotide", term=id[0]+"[uid]", rettype="fasta")

It is not working, and I don't know how to request the exon positions either. I know the exon positions are available because I can see them on the website.
biopython python efetch entrez • 2.5k views
5.2 years ago
Ben ▴ 50

First, you should have the GTF/GFF and FASTA files of the reference genome. Then you can extract the cDNA sequence of the mRNA with a custom script. Finally, you should convert the cDNA sequence to an mRNA sequence.
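
As a rough illustration of those three steps, here is a minimal pure-Python sketch. It assumes a GFF3 file in which exon rows carry a `Parent=<transcript id>` attribute (GTF uses `transcript_id "..."` instead, so the attribute parsing would need adapting), and that the chromosome sequence is already loaded as a string. The function names `parse_exons`, `spliced_cdna` and `to_mrna` are made up for this example.

```python
# Complement table used for reverse-complementing minus-strand transcripts.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def parse_exons(gff_lines, transcript_id):
    """Collect (seqid, start, end, strand) for exon rows of one transcript.

    Assumes GFF3-style attributes (key=value pairs separated by ';').
    """
    exons = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        if transcript_id in attrs.get("Parent", "").split(","):
            exons.append((cols[0], int(cols[3]), int(cols[4]), cols[6]))
    # Sort by genomic start so exons are joined in genomic order.
    return sorted(exons, key=lambda exon: exon[1])


def spliced_cdna(chrom_seq, exons):
    """Join exon sequences; GFF coordinates are 1-based and inclusive."""
    seq = "".join(chrom_seq[start - 1:end] for _, start, end, _ in exons)
    if exons and exons[0][3] == "-":
        # Minus-strand transcript: reverse-complement the joined sequence.
        seq = seq.translate(_COMPLEMENT)[::-1]
    return seq


def to_mrna(cdna):
    """Write the cDNA sequence in the RNA alphabet (T becomes U)."""
    return cdna.upper().replace("T", "U")
```

For example, with `exons = parse_exons(open("annotation.gff3"), "some_transcript_id")` (both names are placeholders), `to_mrna(spliced_cdna(chrom_seq, exons))` would give the mRNA, and the `exons` list itself holds the exon positions on the genome.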

Is there a way to get all the exon regions from that website using a command line tool?
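
One possible route that stays within Biopython, offered as a sketch rather than a tested recipe: `efetch` takes `id=` rather than `term=`, and a Gene UID is not a nucleotide UID, so the gene first has to be mapped to its RefSeq mRNA records with `elink`. Fetching those records in GenBank format (instead of FASTA) brings the feature table along, and the exon positions can be read from it. The helper names `fetch_refseq_mrnas` and `feature_spans` are invented for this example, the e-mail address is a placeholder, and the sketch assumes the returned records actually carry `exon` and `CDS` features. Note that the `CDS` feature is the annotated coding region, which is not necessarily the longest ORF.

```python
def feature_spans(record, feature_type="exon"):
    """Return 1-based inclusive (start, end) pairs for features of one type.

    Biopython stores locations 0-based and end-exclusive, hence the +1.
    """
    return [
        (int(f.location.start) + 1, int(f.location.end))
        for f in record.features
        if f.type == feature_type
    ]


def fetch_refseq_mrnas(gene_uid, email):
    """Map a Gene UID to RefSeq mRNA records and fetch them as GenBank."""
    from Bio import Entrez, SeqIO  # requires Biopython and network access

    Entrez.email = email
    # Step 1: Gene UID -> nucleotide UIDs of the linked RefSeq RNAs.
    handle = Entrez.elink(
        dbfrom="gene", db="nuccore", id=gene_uid,
        linkname="gene_nuccore_refseqrna",
    )
    linksets = Entrez.read(handle)
    handle.close()
    if not linksets or not linksets[0]["LinkSetDb"]:
        return []
    ids = [link["Id"] for link in linksets[0]["LinkSetDb"][0]["Link"]]

    # Step 2: fetch full GenBank records, which include the feature table.
    handle = Entrez.efetch(db="nuccore", id=",".join(ids),
                           rettype="gb", retmode="text")
    records = list(SeqIO.parse(handle, "genbank"))
    handle.close()
    return records


# Example usage (not run here; id[0] is the Gene UID from the esearch above):
# records = fetch_refseq_mrnas(id[0], "you@example.com")
# for rec in records:
#     print(rec.id, feature_spans(rec, "exon"), feature_spans(rec, "CDS"))
# SeqIO.write(records, "mrna.fasta", "fasta")  # needs: from Bio import SeqIO
```

The positions returned this way are relative to the mRNA record itself, not to the chromosome. For genomic coordinates, the GTF/GFF approach suggested above is the more direct option.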


