Thanks in advance for any answers. I have a set of ~1100 protein GIs from mitochondrial and plastid genomes, and I am trying to use Perl with the NCBI E-utilities to get the corresponding coding DNA sequences. My strategy has been elink -> efetch. Elink gives me the GI associated with each protein's coding DNA sequence, but I have run into a problem: that GI points to the entire genome sequence, not the particular coding region.
For example, my elink query gets me the GI 674840664, which is the GI for the genome, not the coding region of the protein of interest.
The coding region is specified by NC_024755.1:13737..14474, i.e. the RefSeq ID plus the coordinates within that record.
Is there a way to link directly to the defined CDS region in the genome? I am likely missing something very obvious here.
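To make concrete what I am after, here is a minimal Perl sketch that turns a location string like the one above into an efetch request for just that sub-range. It assumes efetch honours the `seq_start`, `seq_stop` and `strand` parameters for `db=nuccore`. The helper names `parse_location` and `efetch_region_url` are my own, not part of any library. It only builds the URL, does not fetch anything, and does not handle `complement(...)` or joined locations.

```perl
use strict;
use warnings;

# Split a simple location such as "NC_024755.1:13737..14474"
# into (accession, start, stop). Hypothetical helper.
sub parse_location {
    my ($loc) = @_;
    my ($acc, $start, $stop) = $loc =~ /^([^:]+):(\d+)\.\.(\d+)$/
        or die "Unrecognised location: $loc\n";
    return ($acc, $start, $stop);
}

# Build an efetch URL that asks for only the given range of a
# nucleotide record, as FASTA. Hypothetical helper.
sub efetch_region_url {
    my ($acc, $start, $stop, $strand) = @_;
    my $base = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi';
    my @params = (
        'db=nuccore',
        "id=$acc",
        "seq_start=$start",
        "seq_stop=$stop",
        'rettype=fasta',
        'retmode=text',
    );
    # Assumption: strand=1 requests the plus strand, strand=2 the minus strand.
    push @params, "strand=$strand" if defined $strand;
    return $base . '?' . join('&', @params);
}

my ($acc, $start, $stop) = parse_location('NC_024755.1:13737..14474');
print efetch_region_url($acc, $start, $stop), "\n";
```

This still leaves the problem of getting the coordinates for each of the ~1100 proteins in the first place, which is the part I cannot work out from the elink output.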