Question: E-utilities to obtain gene sequences from the gene database
dllopezr wrote (2.7 years ago):

Hi everyone

I need to download all gene sequences for a queried gene from the NCBI Gene database using E-utilities on the Linux command line. The following command (adapted from an NCBI example) works for gene to protein:

esearch -db gene -query "nifH AND Sinorhizobium meliloti 1021 [orgn]" | elink -target protein -name gene_protein_refseq | efetch -format fasta

But when I try this:

esearch -db gene -query "nifH AND Sinorhizobium meliloti 1021 [orgn]" | elink -target nuccore -name gene_nuccore_refseqgene | efetch -format fasta

I get this error message:

QueryKey value not found in fetch input

As a note: I could not find any NCBI examples of how to do this search, so I am wondering whether it is possible to connect the Gene database to the Nucleotide database, or to retrieve gene sequences from the Gene database directly.

Thank you for your suggestions

Tags: nucleotide, ncbi, gene
genomax wrote (2.7 years ago):

Extending the query you were trying:

esearch -db gene -query "nifH AND Sinorhizobium meliloti 1021 [orgn]" | efetch -format docsum | xtract -pattern GenomicInfoType -element ChrAccVer -element ChrStart -element ChrStop |xargs -n 3 sh -c 'efetch -db nuccore -id "$0" -seq_start "$1" -seq_stop "$2" -format fasta'
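
To see how the last stage of the pipeline above hands the three xtract columns to efetch, here is a minimal offline sketch. It replaces the real efetch call with echo so nothing is sent to NCBI; the function name preview_efetch and the hard-coded input line are illustrative assumptions, not part of EDirect.

```shell
# Dry run: print the efetch command that xargs would execute for each
# "accession start stop" triple, without contacting NCBI.
# preview_efetch is a hypothetical helper name used only for this sketch.
preview_efetch() {
  # -n 3 hands three whitespace-separated tokens to each sh -c call;
  # sh -c assigns them to $0, $1 and $2 in that order.
  xargs -n 3 sh -c 'echo efetch -db nuccore -id "$0" -seq_start "$1" -seq_stop "$2" -format fasta'
}

# Stand-in for one line of xtract output (accession, ChrStart, ChrStop).
printf 'NC_003037.1\t453555\t454448\n' | preview_efetch
```

If the query matches several genes, xtract emits one line per GenomicInfoType record and xargs runs one efetch per line, so the dry run is a cheap way to check the coordinates before downloading anything.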

That should get you:

>NC_003037.1:453555-454448 Sinorhizobium meliloti 1021 plasmid pSymA, complete sequence
GATGGCAGCTCTGCGTCAGATCGCGTTCTACGGTAAGGGGGGTATCGGCAAGTCCACGACCTCCCAAAAT
ACACTCGCCGCGCTTGTCGACCTGGGGCAAAAGATCCTTATTGTCGGCTGCGATCCGAAAGCGGACTCCA
CGCGCCTCATCCTGAACGCAAAGGCACAGGACACCGTACTGCATCTTGCGGCAACCGAAGGTTCGGTCGA
AGACCTCGAGCTCGAGGACGTGCTCAAAGTGGGTTACAGAGGCATCAAGTGCGTGGAGTCCGGTGGCCCA
GAGCCGGGCGTCGGCTGCGCCGGACGCGGCGTTATCACCTCGATCAACTTCCTGGAAGAGAACGGCGCTT
ACAACGATGTCGATTACGTCTCATACGACGTGCTAGGGGACGTAGTATGCGGCGGCTTTGCGATGCCTAT
TCGCGAAAACAAGGCTCAGGAAATCTACATCGTCATGTCCGGTGAGATGATGGCGCTCTATGCCGCCAAC
AACATCGCGAAGGGTATCCTGAAGTACGCCCATGCGGGCGGCGTGCGGCTGGGGGGGTTGATTTGCAACG
AGCGCCAGACCGATCGGGAGCTCGACCTCGCCGAGGCACTTGCCGCCCGCCTCAATTCCAAGCTCATCCA
CTTCGTGCCGCGCGACAATATCGTTCAGCACGCAGAGCTCAGAAAGATGACAGTGATCCAATATGCGCCG
AACTCTAAGCAAGCCGGGGAATATCGCGCCCTGGCTGAAAAGATCCATGCAAATTCCGGCCGAGGCACCG
TCCCTACACCGATCACTATGGAGGAACTGGAGGACATGCTGCTCGACTTTGGAATCATGAAGAGCGACGA
GCAGATGCTTGCCGAACTCCACGCCAAGGAAGCCAAGGTAATAGCCCCCCACTG
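
One thing to check in the FASTA output above: the sequence begins with a G ahead of the ATG start codon and ends in ...CACTG rather than on a stop codon, which suggests the ChrStart and ChrStop values in the docsum are 0-based while -seq_start and -seq_stop expect 1-based positions. The sketch below shifts both coordinates by one before they reach efetch. The helper name to_one_based is made up for illustration, and the handling of minus-strand genes (start greater than stop, emitted as strand 2) is an assumption about how the docsum reports them, so verify it against a gene you know.

```shell
# Hypothetical helper: convert 0-based "accession start stop" triples into
# 1-based "accession start stop strand" fields for efetch.
to_one_based() {
  awk '{
    start = $2 + 1; stop = $3 + 1
    if (start <= stop) print $1, start, stop, 1   # plus strand
    else               print $1, stop, start, 2   # assumed minus strand: swap ends
  }'
}

# Offline check with the coordinates from the example above.
printf 'NC_003037.1\t453555\t454448\n' | to_one_based
# prints: NC_003037.1 453556 454449 1
```

In the full pipeline this would sit between xtract and xargs, with xargs switched to -n 4 and the fourth field passed to efetch as -strand "$3".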