Gene FASTA sequence identifiers in batch entrez
1
0
Entering edit mode
6.9 years ago

I am trying to acquire the genomic FASTA sequence for a large list of orthologous gene IDs from NCBI. I figured I could convert them, using Biopython or mygene, to some other identifier that I could feed into Batch Entrez to retrieve the FASTA of the whole gene sequence. However, the accession that maps to the gene FASTA is the accession for the whole chromosome or scaffold. So I have a few questions. What syntax does Batch Entrez accept for subregion arguments? Is there a better identifier that maps to the genomic sequence of the gene alone? Is there an easier way to accomplish my task?
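For reference on the subregion question: the E-utilities `efetch` endpoint takes `seq_start`, `seq_stop`, and `strand` parameters for nucleotide records. Whether the Batch Entrez web form accepts ranges is a separate matter that this sketch does not settle. The helper name `subregion_fasta_url` is hypothetical, and the accession and coordinates are illustrative placeholders, not values from my gene list.

```python
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def subregion_fasta_url(accession, start, stop, minus_strand=False):
    """Build an efetch URL for a slice of a nucleotide record.

    start and stop are 1-based and inclusive, as efetch expects.
    strand=2 requests the reverse complement (minus strand).
    """
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
        "seq_start": start,
        "seq_stop": stop,
        "strand": 2 if minus_strand else 1,
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


# Illustrative placeholder values: a chromosome accession plus a range on it.
url = subregion_fasta_url("NC_000017.11", 43044295, 43125364, minus_strand=True)
print(url)
```

The URL can be opened in a browser or fetched with any HTTP client. For a long list, requests should be throttled according to NCBI's usage guidelines.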


I could be wrong, but I don't think NCBI provides genomic sequences for genes directly. You would have to extract the gene from the chromosome sequence, and before that you would have to define which region you want, because I don't think NCBI provides coordinates for genes, only for RefSeq sequences.

6.9 years ago

It's okay. I found out that Geneious can do what I want in 5 seconds, so I wasted a bunch of hours for nothing. I am still curious how Geneious does it. They probably just know how to parse the XML files properly and run further queries as needed, which is what I was trying to do at one point.
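I don't know what Geneious does internally, but the "parse, then query further" idea can be sketched as follows. It assumes a Gene document summary (for example from `esummary` with `db=gene`) exposes a `GenomicInfo` entry with `ChrAccVer`, `ChrStart`, and `ChrStop` fields, that those coordinates are 0-based, and that start greater than stop means the minus strand. All of that should be checked against a real response before relying on it. The function name and the example record are hypothetical.

```python
def efetch_params_from_genomic_info(info):
    """Turn one GenomicInfo-style record into efetch query parameters.

    Assumptions (verify against a real esummary response):
      - info has ChrAccVer, ChrStart, ChrStop keys
      - coordinates are 0-based
      - ChrStart > ChrStop signals a minus-strand gene
    """
    start = int(info["ChrStart"])
    stop = int(info["ChrStop"])
    minus_strand = start > stop
    low, high = sorted((start, stop))
    return {
        "db": "nuccore",
        "id": info["ChrAccVer"],
        "rettype": "fasta",
        "retmode": "text",
        # efetch expects 1-based, inclusive coordinates with seq_start <= seq_stop
        "seq_start": low + 1,
        "seq_stop": high + 1,
        "strand": 2 if minus_strand else 1,
    }


# Hypothetical record shaped like the assumed GenomicInfo entry.
example_info = {
    "ChrAccVer": "NC_000017.11",
    "ChrStart": "43125363",
    "ChrStop": "43044294",
}
params = efetch_params_from_genomic_info(example_info)
print(params)
```

The resulting dictionary can be passed as query parameters to `efetch`, one request per gene, which avoids downloading whole chromosomes or scaffolds.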
