Question: Linking NCBI gene IDs to nucleotide sequence
24 months ago, jared.j.tromp wrote:

I've been trying to get the nucleotide sequences for a list of genes on the command line. For example, for the gene LOC100136426 I would like to get the following:

https://www.ncbi.nlm.nih.gov/nuccore/NC_027322.1?report=fasta&from=3409007&to=3419329&strand=true

I have been trying to use eutils to do this with the following command:

$ esearch -db nucleotide -query "LOC100136426" | efetch -format fasta

My issue is that this gives three results, whereas I only want the nucleotide sequence of the gene so that I can run a BLAST search.

I'm not quite sure how to implement a filter in the pipe so that it returns only the gene sequence. I'd greatly appreciate any help, or suggestions of alternative methods for finding this information.

24 months ago, Jean-Karim Heriche (EMBL Heidelberg, Germany) wrote:

You need to query the gene database rather than the nucleotide one, or find out what the corresponding identifier is in the nucleotide database.
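
For the second route, here is a minimal sketch, assuming EDirect is installed and that the region in the question's URL (NC_027322.1, 3409007 to 3419329) is the gene's location. `region_cmd` is a hypothetical helper, not part of EDirect; it only prints the `efetch` call so it can be inspected before running. `-seq_start`, `-seq_stop` and `-strand` are existing `efetch` options. Treating the web URL's `strand=true` as the minus strand (`-strand 2`) is an assumption that should be checked against the gene record.

```shell
#!/bin/sh
# region_cmd (hypothetical helper): print the efetch call for a sub-range
# of a nucleotide record. Coordinates are 1-based, as in the web URL.
# strand: 1 = plus, 2 = minus (E-utilities convention).
region_cmd() {
    acc=$1; start=$2; stop=$3; strand=$4
    printf 'efetch -db nuccore -id %s -seq_start %s -seq_stop %s -strand %s -format fasta\n' \
        "$acc" "$start" "$stop" "$strand"
}

# Values taken from the URL in the question; strand 2 is an assumption.
region_cmd NC_027322.1 3409007 3419329 2
```

Piping the printed line to `sh` would run the fetch. Printing first makes it easy to confirm the accession and range for each gene in a list before hitting the NCBI servers.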


I couldn't get the pipe to return the nucleotide sequence when using the gene database, although it did correctly identify the gene. I've corrected the command using the GI number as you suggested, and this returns the correct result. Thanks for the help!

Is there a way to use the gene database to return the fasta sequence? I have tried:

$ esearch -db gene -query "LOC100136426" | efetch -format fasta

This didn't return the FASTA sequence.
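
One possible way around this, sketched under some assumptions: a gene record has no sequence of its own, but its document summary lists where the gene sits on a nucleotide record (the `ChrAccVer`, `ChrStart` and `ChrStop` fields inside `GenomicInfoType`). Those values can be pulled out with `xtract` and handed to `efetch -db nuccore`. The helper `docsum_to_efetch` below is hypothetical. It assumes the docsum coordinates are 0-based and that a start greater than the stop marks the minus strand. The sample input is the region from the question rewritten in that form for illustration; it was not fetched from NCBI.

```shell
#!/bin/sh
# docsum_to_efetch (hypothetical helper): read "accession start stop" lines,
# as produced by xtract from a gene docsum, and print one efetch call each.
# Assumes docsum coordinates are 0-based and that start > stop marks the
# minus strand; efetch -seq_start/-seq_stop are 1-based.
docsum_to_efetch() {
    while read -r acc start stop; do
        if [ "$start" -le "$stop" ]; then
            from=$((start + 1)); to=$((stop + 1)); strand=1
        else
            from=$((stop + 1)); to=$((start + 1)); strand=2
        fi
        printf 'efetch -db nuccore -id %s -seq_start %s -seq_stop %s -strand %s -format fasta\n' \
            "$acc" "$from" "$to" "$strand"
    done
}

# Offline check with illustrative input (not fetched from NCBI).
printf 'NC_027322.1\t3419328\t3409006\n' | docsum_to_efetch

# With EDirect installed and network access, the live pipeline would be:
# esearch -db gene -query "LOC100136426" | efetch -format docsum |
#   xtract -pattern GenomicInfoType -element ChrAccVer ChrStart ChrStop |
#   docsum_to_efetch | sh
```

If the gene search matches more than one record, one `efetch` line is printed per match, so it is worth reviewing the printed commands before piping them to `sh`.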

Reply written 24 months ago by jared.j.tromp