Question: Linking NCBI gene IDs to nucleotide sequence
3.9 years ago by
jared.j.tromp0 wrote:

I've been trying to get the nucleotide sequence for a list of genes on the command line. For example, for the gene LOC100136426 I would like to get the following:

I have been trying to use eutils to do this with the following command:

$ esearch -db nucleotide -query "LOC100136426" | efetch -format fasta

My issue is that this gives three results, whereas I only want the nucleotide sequence for the gene so that I can run a BLAST search.

I'm not quite sure how to implement a filter in the pipe so that it returns only the gene sequence. I'd greatly appreciate any help, or suggestions for alternative methods of finding this information.

eutils rna-seq ncbi • 1.3k views
3.9 years ago by
EMBL Heidelberg, Germany
Jean-Karim Heriche24k wrote:

You need to query the gene database rather than the nucleotide one, or find out what the corresponding identifier is in the nucleotide database.


I couldn't get the pipe to return the nucleotide sequence when using the gene database, although this would correctly identify the gene. I've corrected the command using the GI number as you suggested and this returns the correct result, thanks for the help!

Is there a way to use the gene database to return the fasta sequence? I have tried:

$ esearch -db gene -query "LOC100136426" | efetch -format fasta

This didn't return the FASTA sequence.

written 3.9 years ago by jared.j.tromp
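One possible way to go from the gene database to a FASTA sequence is sketched below; it is not something confirmed in this thread. The gene database does not hold sequence itself, so the idea is to read the gene's genomic placement from its document summary and then fetch that slice from the nucleotide database. The sketch assumes NCBI EDirect (esearch, efetch, xtract) is installed, that the document summary exposes ChrAccVer, ChrStart and ChrStop with 0-based positions, and that the first gene returned by the query is the one wanted. The function names docsum_to_efetch_args and gene_fasta are hypothetical helpers, not part of EDirect.

```shell
#!/usr/bin/env bash
# Sketch only: assumes NCBI EDirect (esearch, efetch, xtract) is on PATH.

# Turn an "accession start stop" triple from a gene document summary
# (assumed 0-based, with start > stop on the minus strand) into efetch
# arguments (1-based, start <= stop, strand 1 = forward, 2 = reverse complement).
# Hypothetical helper, not part of EDirect.
docsum_to_efetch_args() {
  local acc=$1 start=$2 stop=$3 strand=1 tmp
  if [ "$start" -gt "$stop" ]; then
    strand=2
    tmp=$start; start=$stop; stop=$tmp
  fi
  printf '%s\n' "-id $acc -seq_start $((start + 1)) -seq_stop $((stop + 1)) -strand $strand"
}

# Look up a gene, read its genomic placement, and fetch that region as FASTA.
# Hypothetical helper; takes the first placement of the first matching gene.
gene_fasta() {
  local query=$1 coords
  coords=$(esearch -db gene -query "$query" |
    efetch -format docsum |
    xtract -pattern GenomicInfoType -element ChrAccVer ChrStart ChrStop |
    head -n 1)
  if [ -z "$coords" ]; then
    echo "no genomic placement found for $query" >&2
    return 1
  fi
  # Word splitting is intended in both places: coords holds three
  # whitespace-separated fields, and the helper prints separate arguments.
  # shellcheck disable=SC2046,SC2086
  efetch -db nuccore -format fasta $(docsum_to_efetch_args $coords)
}
```

With the functions loaded into the shell, `gene_fasta "LOC100136426" > gene.fna` would write the region to a file that can be passed to BLAST. Because the query is free text, it may be worth checking the document summary first to confirm that the first hit is the intended gene.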