Question: Gene FASTA sequence identifiers in batch entrez
22 months ago, easyshortcatchy wrote:

I am trying to get the genomic FASTA sequence for a large list of orthologous gene IDs from NCBI. I figured I could use Biopython or mygene to convert them to some other identifier that I could feed into Batch Entrez to get the FASTA of the whole gene sequence. However, the accession that maps to the gene FASTA is the accession for the whole chromosome/scaffold. So I have a few questions: What syntax does Batch Entrez accept for subregion arguments? Is there a better identifier that maps to the genomic sequence? Is there an easier way to accomplish my task?


I could be wrong, but I don't think NCBI provides genomic sequences for genes. You'd have to extract the sequence from the chromosome sequence, but first you'd have to define what region you want, because I don't think NCBI provides coordinates for genes, only for RefSeq sequences.

Reply written 22 months ago by Jean-Karim Heriche
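On the subregion syntax: Batch Entrez, as far as I know, only takes a plain list of identifiers, but the E-utilities `efetch` endpoint accepts `seq_start`, `seq_stop`, and `strand` parameters for nucleotide records. Here is a minimal sketch of building such a request with the standard library only. The helper name `region_fasta_url` is made up for illustration, and it assumes you already have 1-based, inclusive coordinates on the chromosome/scaffold accession.

```python
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def region_fasta_url(accession, start, stop, minus_strand=False):
    """Build an efetch URL for a subregion of a nucleotide record.

    Hypothetical helper: `start` and `stop` are assumed to be 1-based and
    inclusive; they are swapped if given in descending order.
    """
    if start > stop:
        start, stop = stop, start
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
        "seq_start": start,
        "seq_stop": stop,
        # efetch uses 1 for the plus strand and 2 for the minus strand
        "strand": 2 if minus_strand else 1,
    }
    return EFETCH_URL + "?" + urlencode(params)
```

The resulting URL can be opened with `urllib.request.urlopen` or pasted into a browser. For a large list of genes you would want to throttle the requests and supply an API key or email, in line with NCBI's usage guidelines.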
22 months ago, easyshortcatchy wrote:

It's okay, I found out that Geneious can do what I want in 5 seconds, so I wasted a bunch of hours for nothing. I am still curious how Geneious does it, though. They probably just know how to parse the XML files properly and query further as needed, which is what I was trying to do at one point.
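For anyone landing here later, this is a rough sketch of what "parse the XML and query further" could look like. It is not how Geneious does it (that is unknown to me). It assumes the Gene database `esummary` XML exposes a `GenomicInfo` block with `ChrAccVer`, `ChrStart`, and `ChrStop` fields, that those coordinates are 0-based, and that `ChrStart` greater than `ChrStop` indicates the minus strand. The function name and the sample record are made up for illustration, so check them against a real response before relying on this.

```python
import xml.etree.ElementTree as ET

# Made-up record that mimics the assumed esummary layout; not real NCBI data.
SAMPLE = """<eSummaryResult><DocumentSummarySet status="OK">
<DocumentSummary uid="12345">
  <Name>GENE1</Name>
  <GenomicInfo><GenomicInfoType>
    <ChrLoc>1</ChrLoc><ChrAccVer>NC_EXAMPLE.1</ChrAccVer>
    <ChrStart>2999</ChrStart><ChrStop>999</ChrStop>
  </GenomicInfoType></GenomicInfo>
</DocumentSummary>
</DocumentSummarySet></eSummaryResult>"""


def gene_regions_from_summary(xml_text):
    """Map each gene ID to the accession, 1-based range, and strand code."""
    root = ET.fromstring(xml_text)
    regions = {}
    for doc in root.iter("DocumentSummary"):
        info = doc.find("GenomicInfo/GenomicInfoType")
        if info is None:
            continue  # no genomic placement reported for this gene
        start0 = int(info.findtext("ChrStart"))
        stop0 = int(info.findtext("ChrStop"))
        regions[doc.get("uid")] = {
            "accession": info.findtext("ChrAccVer"),
            # convert the assumed 0-based coordinates to 1-based, ascending
            "seq_start": min(start0, stop0) + 1,
            "seq_stop": max(start0, stop0) + 1,
            # strand code: 1 = plus, 2 = minus
            "strand": 2 if start0 > stop0 else 1,
        }
    return regions


regions = gene_regions_from_summary(SAMPLE)
```

Each entry in `regions` holds an accession plus `seq_start`, `seq_stop`, and `strand` values, which is the information a follow-up sequence request for just the gene region would need.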
