Question: E-utilities to obtain gene sequences from the Gene database
dllopezr wrote (20 months ago):

Hi everyone

I need to download all gene sequences for a query gene from the NCBI Gene database, using E-utilities on the Linux command line. The following command (adapted from an NCBI example) works for gene to protein:

esearch -db gene -query "nifH AND Sinorhizobium meliloti 1021 [orgn]" | elink -target protein -name gene_protein_refseq | efetch -format fasta

But when I try this:

esearch -db gene -query "nifH AND Sinorhizobium meliloti 1021 [orgn]" | elink -target nuccore -name gene_nuccore_refseqgene | efetch -format fasta

I get this error message:

QueryKey value not found in fetch input

As a note: I could not find any NCBI examples of this kind of search, so I am wondering whether it is possible to link the Gene database to the Nucleotide database, or to retrieve gene sequences from the Gene database at all.

Thank you for your suggestions

genomax wrote (20 months ago):

Extending the query you were trying:

esearch -db gene -query "nifH AND Sinorhizobium meliloti 1021 [orgn]" | efetch -format docsum | xtract -pattern GenomicInfoType -element ChrAccVer -element ChrStart -element ChrStop |xargs -n 3 sh -c 'efetch -db nuccore -id "$0" -seq_start "$1" -seq_stop "$2" -format fasta'

This should give you:

>NC_003037.1:453555-454448 Sinorhizobium meliloti 1021 plasmid pSymA, complete sequence
GATGGCAGCTCTGCGTCAGATCGCGTTCTACGGTAAGGGGGGTATCGGCAAGTCCACGACCTCCCAAAAT
ACACTCGCCGCGCTTGTCGACCTGGGGCAAAAGATCCTTATTGTCGGCTGCGATCCGAAAGCGGACTCCA
CGCGCCTCATCCTGAACGCAAAGGCACAGGACACCGTACTGCATCTTGCGGCAACCGAAGGTTCGGTCGA
AGACCTCGAGCTCGAGGACGTGCTCAAAGTGGGTTACAGAGGCATCAAGTGCGTGGAGTCCGGTGGCCCA
GAGCCGGGCGTCGGCTGCGCCGGACGCGGCGTTATCACCTCGATCAACTTCCTGGAAGAGAACGGCGCTT
ACAACGATGTCGATTACGTCTCATACGACGTGCTAGGGGACGTAGTATGCGGCGGCTTTGCGATGCCTAT
TCGCGAAAACAAGGCTCAGGAAATCTACATCGTCATGTCCGGTGAGATGATGGCGCTCTATGCCGCCAAC
AACATCGCGAAGGGTATCCTGAAGTACGCCCATGCGGGCGGCGTGCGGCTGGGGGGGTTGATTTGCAACG
AGCGCCAGACCGATCGGGAGCTCGACCTCGCCGAGGCACTTGCCGCCCGCCTCAATTCCAAGCTCATCCA
CTTCGTGCCGCGCGACAATATCGTTCAGCACGCAGAGCTCAGAAAGATGACAGTGATCCAATATGCGCCG
AACTCTAAGCAAGCCGGGGAATATCGCGCCCTGGCTGAAAAGATCCATGCAAATTCCGGCCGAGGCACCG
TCCCTACACCGATCACTATGGAGGAACTGGAGGACATGCTGCTCGACTTTGGAATCATGAAGAGCGACGA
GCAGATGCTTGCCGAACTCCACGCCAAGGAAGCCAAGGTAATAGCCCCCCACTG
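
One thing to check in the output above: the sequence starts with a G before the ATG and ends with TG rather than a complete stop codon, which suggests the ChrStart/ChrStop values in the Gene docsum are 0-based while -seq_start/-seq_stop expect 1-based positions. The sketch below rests on that assumption. The helper name to_one_based is made up for illustration; it adds 1 to both coordinates, puts them in ascending order, and emits a strand flag (1 for plus, 2 for minus, on the assumption that start > stop marks a minus-strand gene). Recent EDirect versions may also offer -chr_start/-chr_stop for 0-based values; efetch -help should confirm what your install supports.

```shell
# Hypothetical helper: convert Gene docsum coordinates (assumed 0-based) into
# 1-based, ascending start/stop plus a strand flag.
# Input:  tab-separated "accession<TAB>ChrStart<TAB>ChrStop" lines on stdin.
# Output: tab-separated "accession<TAB>start<TAB>stop<TAB>strand" lines.
to_one_based() {
  awk -F '\t' 'BEGIN { OFS = "\t" }
    NF >= 3 {
      start = $2 + 1
      stop  = $3 + 1
      if (start <= stop) print $1, start, stop, 1   # plus strand
      else               print $1, stop, start, 2   # minus strand (assumed)
    }'
}

# Offline check with the coordinates from the FASTA header above:
printf 'NC_003037.1\t453555\t454448\n' | to_one_based

# Sketch of how it could slot into the pipeline (needs network, not run here).
# "< /dev/null" stops efetch from consuming the loop's stdin.
# esearch -db gene -query "nifH AND Sinorhizobium meliloti 1021 [orgn]" |
#   efetch -format docsum |
#   xtract -pattern GenomicInfoType -element ChrAccVer ChrStart ChrStop |
#   to_one_based |
#   while IFS="$(printf '\t')" read -r acc start stop strand; do
#     efetch -db nuccore -id "$acc" -seq_start "$start" -seq_stop "$stop" \
#       -strand "$strand" -format fasta < /dev/null
#   done
```

If the assumption holds, the corrected range for nifH would be 453556-454449, and the returned sequence should begin with ATG and end with a stop codon.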