Question: Genes on GI Sequences from NCBI
6.0 years ago by
thedusseldorfer10 wrote:

I am looking for a comprehensive list of gene positions on GI sequences. I have found gene2accession, but this file seems to contain start and stop positions only for some genes.

To give a concrete example, gene 5908888 (“Fphi_0526 proline:Na+ symporter [ Francisella philomiragia subsp. philomiragia ATCC 25017 ]”) appears in both CP000937.1 and NC_010336.1 (according to gene2accession, and this is consistent with the corresponding GenBank entries). However, gene2accession lists coordinates only for NC_010336.1, although the GenBank entry for CP000937.1 specifies coordinates too (interestingly, not via the gene ID but via the correct locus_tag, and the positions are identical).
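A gap like this can be checked with a short script. The following is only a minimal sketch: it assumes the tab-delimited gene2accession layout with a header line, a "-" for missing values, and the column names shown in the constants, all of which should be checked against the header of the downloaded file. The function name `coordinates_for_gene` is made up for illustration.

```python
import csv

# Column names as assumed from the gene2accession header line;
# verify them against the file you actually downloaded.
ACC = "genomic_nucleotide_accession.version"
START = "start_position_on_the_genomic_accession"
END = "end_position_on_the_genomic_accession"


def coordinates_for_gene(handle, gene_id):
    """Map each genomic accession listed for gene_id to (start, end), or None.

    handle is an open text handle on gene2accession (hypothetical helper).
    """
    result = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["GeneID"] != str(gene_id):
            continue
        acc = row[ACC]
        if acc == "-":
            continue  # row has no genomic accession at all
        if row[START] != "-" and row[END] != "-":
            result[acc] = (int(row[START]), int(row[END]))
        elif acc not in result:
            result[acc] = None  # accession listed, but without coordinates
    return result
```

With the gzipped file from the NCBI Gene FTP area, a call could look like `with gzip.open("gene2accession.gz", "rt") as handle: coordinates_for_gene(handle, 5908888)`; accessions mapped to `None` are the ones for which positions have to come from elsewhere.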

Do I have to download the whole of GenBank in order to achieve that goal?


CP000937 is the GenBank entry and NC_010336 is the RefSeq entry; they are not the same database. You can read more here:

written 6.0 years ago by Asaf5.5k
5.9 years ago by
thedusseldorfer10 wrote:

I have been in contact with NCBI: there is no existing solution, and I do indeed need to download the whole of GenBank to achieve my goal, or at least query all GenBank entries of interest with the NCBI E-utilities.
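For the E-utilities route, the following is a rough sketch rather than a vetted solution. It assumes that `efetch` on the `nuccore` database with `rettype=ft` returns the tab-delimited feature table for an accession, and it reads only the first interval of each `gene` feature. The function names are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession):
    """Build an efetch URL asking for the feature table of one record."""
    params = {"db": "nuccore", "id": accession, "rettype": "ft", "retmode": "text"}
    return EFETCH + "?" + urlencode(params)


def parse_gene_positions(feature_table):
    """Collect locus_tag, start, stop and strand for every gene feature."""
    genes = []
    current = None
    for line in feature_table.splitlines():
        if not line.strip() or line.startswith(">"):
            current = None
            continue
        cols = line.split("\t")
        if cols[0]:
            if len(cols) < 3 or not cols[2]:
                continue  # extra interval of the same feature; ignored here
            current = None
            if cols[2] == "gene":
                # "<" and ">" mark partial ends; start > stop means minus strand
                a = int(cols[0].lstrip("<>"))
                b = int(cols[1].lstrip("<>"))
                current = {"locus_tag": None, "start": min(a, b),
                           "stop": max(a, b), "strand": "+" if a <= b else "-"}
                genes.append(current)
        elif current is not None and len(cols) >= 5 and cols[3] == "locus_tag":
            current["locus_tag"] = cols[4]
    return genes


def fetch_gene_positions(accession):
    """Download one record's feature table and parse it (needs network access)."""
    with urlopen(efetch_url(accession)) as response:
        return parse_gene_positions(response.read().decode("utf-8"))
```

Calling `fetch_gene_positions("CP000937.1")` and `fetch_gene_positions("NC_010336.1")` and joining the results on `locus_tag` would be one way to compare the two records. NCBI asks clients without an API key to stay at or below three requests per second, so a loop over many accessions should pause between calls.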
