Linking NCBI gene IDs to nucleotide sequence
6.9 years ago

I've been trying to get the nucleotide sequences for a list of genes on the command line. For example, for the gene LOC100136426 I would like to get the following:

I have been trying to use eutils to do this with the following command:

$ esearch -db nucleotide -query "LOC100136426" | efetch -format fasta

My issue is that this gives three results, whereas I only want the nucleotide sequence for the gene so that I can run a BLAST search.

I'm not quite sure how to implement a filter in the pipe so that it returns only the gene sequence. I'd greatly appreciate any help, or suggestions for alternative methods of finding this information.

RNA-Seq NCBI eutils • 1.9k views
6.9 years ago

You need to either query the gene database instead of the nucleotide one, or find out what the corresponding identifier is in the nucleotide database.
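One way to find the corresponding nucleotide records is to search the gene database and then follow its links into nuccore. The sketch below assumes NCBI EDirect (`esearch`, `elink`, `efetch`) is installed, and it assumes the RefSeq transcripts are the sequences wanted for BLAST, which is why it uses the `gene_nuccore_refseqrna` link name. `gene_to_rna_fasta` is a hypothetical helper name, not an EDirect command.

```shell
#!/bin/sh
# Sketch only: assumes NCBI EDirect (esearch, elink, efetch) is on PATH.
# gene_to_rna_fasta is a hypothetical helper name, not part of EDirect.
# Usage: gene_to_rna_fasta GENE_SYMBOL_OR_QUERY
gene_to_rna_fasta() {
    # 1. Find the record in the gene database.
    # 2. Follow the Gene -> nuccore links restricted to RefSeq RNAs.
    # 3. Fetch the linked nucleotide records as FASTA.
    esearch -db gene -query "$1" |
        elink -target nuccore -name gene_nuccore_refseqrna |
        efetch -format fasta
}
```

Calling `gene_to_rna_fasta "LOC100136426" > LOC100136426.fasta` would then write whatever transcripts are linked to that gene. A gene with several transcript variants will still return several sequences, so the output may need to be checked before running BLAST.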


I couldn't get the pipe to return the nucleotide sequence when using the gene database, although it did correctly identify the gene. I've corrected the command to use the GI number as you suggested, and this returns the correct result. Thanks for the help!

Is there a way to use the gene database to return the fasta sequence? I have tried:

$ esearch -db gene -query "LOC100136426" | efetch -format fasta

This didn't return the FASTA sequence.
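As far as I can tell, `efetch` has no FASTA output for the gene database, because a Gene record describes a locus rather than holding a sequence. A possible workaround, sketched below, is to read the gene's genomic accession and coordinates from its document summary and then fetch that region from nuccore. This assumes EDirect (`esearch`, `efetch`, `xtract`) is installed and that the document summary contains a `GenomicInfoType` block for the gene. `gene_region_fasta` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: assumes NCBI EDirect (esearch, efetch, xtract) is on PATH.
# gene_region_fasta is a hypothetical helper name, not part of EDirect.
# Usage: gene_region_fasta GENE_SYMBOL_OR_QUERY
gene_region_fasta() {
    # Pull accession, start and stop of the gene from the Gene docsum.
    esearch -db gene -query "$1" |
        efetch -format docsum |
        xtract -pattern GenomicInfoType -element ChrAccVer ChrStart ChrStop |
        while read -r acc start stop; do
            # -chr_start/-chr_stop take the 0-based docsum coordinates.
            # stdin is redirected so efetch cannot consume the loop's input.
            efetch -db nuccore -format fasta -id "$acc" \
                -chr_start "$start" -chr_stop "$stop" </dev/null
        done
}
```

With this in place, `gene_region_fasta "LOC100136426"` should print the genomic region of the gene (introns included) rather than a transcript. Note that a bare symbol query can match more than one Gene record, in which case one region is printed per match.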

