Getting GI number from NCBI database through code
3.6 years ago
erans995 • 0

Hello. As part of a college project, I have to write a program that finds FASTA sequences similar to one the user chooses. In my program, if the user enters "cat", for example, I have to display all the relevant entries present in the DB and let the user choose one. I already have a script that outputs the FASTA data of a given entry in the NCBI database from its accession number.

I have found the following perl script that converts GI to accession number:

use LWP::Simple;
$gi_list = '24475906,224465210,50978625,9507198';

#assemble the URL
$base = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/';
$url = $base . "efetch.fcgi?db=nucleotide&id=$gi_list&rettype=acc";

#post the URL
$output = get($url);
print "$output";


However, I haven't found a way to retrieve the GI from the database through code. Thank you for taking the time to read this, I hope you will be able to help me!

ncbi fasta gi code accession number • 1.5k views

I normally don't post replies to homework or project-based questions, but I'll simply point to this post (and indicate you should point this out to your course instructor, it's been two years since the original announcement):

https://www.ncbi.nlm.nih.gov/books/NBK431010/#news_03-02-2016-phase-out-of-GI-numbers


Okay, thanks for the update. Let me rephrase my question: how can I retrieve the accession number of a certain entry through code?

3.6 years ago
GenoMax 109k

NCBI deprecated use of GI numbers in 2016. You should switch your code to using Accession numbers.

NCBI's Entrez Direct (EDirect) Unix utilities allow you to query using a GI and retrieve accession numbers.

$ esearch -db nuccore -query "24475906" | efetch -format acc
NM_009417.2

But that's the point, how can I retrieve the GI through code? I don't know it...

I thought you already had gi numbers. Using your "cat" example you can get accession numbers like this.

$ esearch -db nuccore -query "cat" | efetch -format acc

AFHV02000288.1
AFHV02000289.1
AFHV02000291.1
AFHV02000292.1
AFHV02000293.1
AFHV02000294.1


I will leave it to you to figure out how to change the query and how to use this method to do URL based searches.
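For anyone landing here later, a URL-based version of the search above might look roughly like the sketch below. It calls the E-utilities ESearch endpoint with idtype=acc, so the result list holds accession.version identifiers rather than GI numbers. The function names (build_esearch_url, parse_accessions, search_accessions) and the retmax default of 20 are my own choices for illustration and are not part of any NCBI library. Treat it as a starting point, not a finished implementation.

```python
import json
import urllib.parse
import urllib.request

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def build_esearch_url(term, db="nuccore", retmax=20):
    """Assemble an ESearch URL (hypothetical helper, not an NCBI API)."""
    params = {
        "db": db,
        "term": term,
        "retmax": retmax,      # assumed page size for this sketch
        "idtype": "acc",       # ask for accession.version instead of GI
        "retmode": "json",     # JSON is easier to parse than the default XML
    }
    # urlencode escapes spaces and special characters in the search term
    return EUTILS_BASE + "esearch.fcgi?" + urllib.parse.urlencode(params)


def parse_accessions(esearch_json):
    """Pull the identifier list out of an ESearch JSON response body."""
    return json.loads(esearch_json)["esearchresult"]["idlist"]


def search_accessions(term, db="nuccore", retmax=20):
    """Run the search over the network and return a list of accessions."""
    url = build_esearch_url(term, db, retmax)
    with urllib.request.urlopen(url) as response:
        return parse_accessions(response.read().decode("utf-8"))
```

Calling search_accessions("cat") performs the network request and returns a list of identifiers that can be shown to the user. Whichever one the user picks can then be passed to a script that fetches FASTA by accession. NCBI limits clients without an API key to three requests per second, so avoid calling this in a tight loop.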


Okay, thank you very much. I'll try to figure out the rest by myself.