Question: E-utilities to obtain gene sequences from the gene database
dllopezr wrote (13 months ago):

Hi everyone

I need to download all gene sequences matching a query against the NCBI Gene database, using E-utilities on the Linux command line. The following command (adapted from an NCBI example) works for gene to protein:

esearch -db gene -query "nifH AND Sinorhizobium meliloti 1021 [orgn]" | elink -target protein -name gene_protein_refseq | efetch -format fasta

But when I try this:

esearch -db gene -query "nifH AND Sinorhizobium meliloti 1021 [orgn]" | elink -target nuccore -name gene_nuccore_refseqgene | efetch -format fasta

I get this error message:

QueryKey value not found in fetch input

As a note: I could not find any NCBI examples of this kind of search, so I am wondering whether it is possible to connect the Gene database with the Nucleotide database, or to retrieve gene sequences from the Gene database directly.

Thank you for your suggestions
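
A note on the error above: "QueryKey value not found in fetch input" usually means that the step before efetch passed along an empty result. The link name gene_nuccore_refseqgene points at RefSeqGene records, which as far as I know are defined for human genes only, so for a bacterial gene elink most likely finds nothing. One way to check is to look at the count in the small ENTREZ_DIRECT message that esearch and elink write to stdout. The helper below is a minimal sketch: `edirect_count` is a made-up name, and the message layout (a `<Count>` element inside `<ENTREZ_DIRECT>`) is an assumption based on how EDirect output usually looks.

```shell
#!/bin/sh
# Read an EDirect message (the small <ENTREZ_DIRECT> XML block that
# esearch/elink are assumed to write to stdout) and print its <Count> value.
# edirect_count is a hypothetical helper, not part of EDirect itself.
edirect_count() {
  sed -n 's:.*<Count>\([0-9][0-9]*\)</Count>.*:\1:p' | head -n 1
}

# Offline example with a hand-written message (values are made up):
printf '<ENTREZ_DIRECT>\n  <Db>nuccore</Db>\n  <Count>0</Count>\n</ENTREZ_DIRECT>\n' | edirect_count
```

With EDirect installed, the same helper can be placed at the end of the failing pipeline in place of efetch (esearch ... | elink -target nuccore -name gene_nuccore_refseqgene | edirect_count). A count of 0 would confirm that there is nothing for efetch to fetch.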

genomax wrote (13 months ago):

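Extending the query you were trying: fetch the gene document summary, extract the chromosome accession with its start and stop coordinates, and pass those to efetch against nuccore: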

esearch -db gene -query "nifH AND Sinorhizobium meliloti 1021 [orgn]" | efetch -format docsum | xtract -pattern GenomicInfoType -element ChrAccVer -element ChrStart -element ChrStop |xargs -n 3 sh -c 'efetch -db nuccore -id "$0" -seq_start "$1" -seq_stop "$2" -format fasta'

This should give you:

>NC_003037.1:453555-454448 Sinorhizobium meliloti 1021 plasmid pSymA, complete sequence
GATGGCAGCTCTGCGTCAGATCGCGTTCTACGGTAAGGGGGGTATCGGCAAGTCCACGACCTCCCAAAAT
ACACTCGCCGCGCTTGTCGACCTGGGGCAAAAGATCCTTATTGTCGGCTGCGATCCGAAAGCGGACTCCA
CGCGCCTCATCCTGAACGCAAAGGCACAGGACACCGTACTGCATCTTGCGGCAACCGAAGGTTCGGTCGA
AGACCTCGAGCTCGAGGACGTGCTCAAAGTGGGTTACAGAGGCATCAAGTGCGTGGAGTCCGGTGGCCCA
GAGCCGGGCGTCGGCTGCGCCGGACGCGGCGTTATCACCTCGATCAACTTCCTGGAAGAGAACGGCGCTT
ACAACGATGTCGATTACGTCTCATACGACGTGCTAGGGGACGTAGTATGCGGCGGCTTTGCGATGCCTAT
TCGCGAAAACAAGGCTCAGGAAATCTACATCGTCATGTCCGGTGAGATGATGGCGCTCTATGCCGCCAAC
AACATCGCGAAGGGTATCCTGAAGTACGCCCATGCGGGCGGCGTGCGGCTGGGGGGGTTGATTTGCAACG
AGCGCCAGACCGATCGGGAGCTCGACCTCGCCGAGGCACTTGCCGCCCGCCTCAATTCCAAGCTCATCCA
CTTCGTGCCGCGCGACAATATCGTTCAGCACGCAGAGCTCAGAAAGATGACAGTGATCCAATATGCGCCG
AACTCTAAGCAAGCCGGGGAATATCGCGCCCTGGCTGAAAAGATCCATGCAAATTCCGGCCGAGGCACCG
TCCCTACACCGATCACTATGGAGGAACTGGAGGACATGCTGCTCGACTTTGGAATCATGAAGAGCGACGA
GCAGATGCTTGCCGAACTCCACGCCAAGGAAGCCAAGGTAATAGCCCCCCACTG
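
One caveat about the coordinates: ChrStart and ChrStop in the Gene docsum are, to my knowledge, 0-based, while efetch -seq_start and -seq_stop expect 1-based positions. The sequence above starts one base before an ATG and ends in TG, which looks consistent with the window being shifted by one base. Genes on the minus strand are also assumed to be reported with ChrStart greater than ChrStop. The sketch below converts the three xtract columns into 1-based efetch commands and prints them for review rather than running them. `coords_to_efetch` is a made-up helper name. The -strand option (1 for forward, 2 for reverse complement) is taken from the efetch help as I remember it, so please confirm it with efetch -help on your installation.

```shell
#!/bin/sh
# Turn "accession<TAB>start<TAB>stop" rows into efetch commands.
# Assumption: the input coordinates are 0-based (Gene docsum ChrStart/ChrStop)
# and start > stop marks a gene on the minus strand.
# coords_to_efetch is a hypothetical helper, not part of EDirect.
coords_to_efetch() {
  awk -F '\t' 'NF >= 3 {
    start = $2 + 1            # shift 0-based to 1-based
    stop  = $3 + 1
    strand = 1                # 1 = forward strand
    if (start > stop) {       # minus-strand gene: swap and flip strand
      tmp = start; start = stop; stop = tmp
      strand = 2              # 2 = reverse complement
    }
    printf "efetch -db nuccore -id %s -seq_start %d -seq_stop %d -strand %d -format fasta\n", $1, start, stop, strand
  }'
}

# Example with the coordinates from the FASTA header above:
printf 'NC_003037.1\t453555\t454448\n' | coords_to_efetch
```

In the pipeline this would replace the xargs step: pipe the xtract output into coords_to_efetch, check the printed commands, and then pipe them to sh to run them. Some EDirect versions may also accept 0-based coordinates directly through separate options, which would make the conversion unnecessary.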