Using NCBI's EDirect to download mammalian refseq_rna database
8.1 years ago
age00 • 0

I want to download the mammalian refseq_rna database using NCBI's EDirect on the command line so that I can run a blastn query with 150,000 sequences (evidently too many for -remote).

This is what I have done:

esearch -db refseq_rna -query "mammalia [ORGN]" | efetch -format fasta  > mammalrefseq_rna.fsa

But I get this error:

500 Can't connect to eutils.ncbi.nlm.nih.gov:443 (connect: Network is unreachable)
No do_post output returned from 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=refseq_rna&term=mammalia%20%5BORGN%5D&retmax=0&usehistory=y&edirect=7.20&tool=edirect&email=age87@lg-1r17-n04.guillimin.clumeq.ca'
Result of do_post http request is
$VAR1 = bless( {
                 '_content' => '500 Can\'t connect to eutils.ncbi.nlm.nih.gov:443 (connect: Network is unreachable)
',
                 '_rc' => 500,
                 '_headers' => bless( {
                                        'client-warning' => 'Internal response',
                                        'client-date' => 'Tue, 29 Aug 2017 18:41:05 GMT',
                                        'content-type' => 'text/plain'
                                      }, 'HTTP::Headers' ),
                 '_msg' => 'Can\'t connect to eutils.ncbi.nlm.nih.gov:443 (connect: Network is unreachable)',
                 '_request' => bless( {
                                        '_content' => 'db=refseq_rna&term=mammalia%20%5BORGN%5D&retmax=0&usehistory=y&edirect=7.20&tool=edirect&email=age87@lg-1r17-n04.guillimin.clumeq.ca',
                                        '_uri' => bless( do{\(my $o = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi')}, 'URI::https' ),
                                        '_headers' => bless( {
                                                               'user-agent' => 'libwww-perl/5.833',
                                                               'content-type' => 'application/x-www-form-urlencoded'
                                                             }, 'HTTP::Headers' ),
                                        '_method' => 'POST'
                                      }, 'HTTP::Request' )
               }, 'HTTP::Response' );

WebEnv value not found in search output - WebEnv1

Any help is greatly appreciated!

blast genome • 3.7k views
8.1 years ago

Your command fails because of a network error, so the query never gets submitted.

In addition, refseq_rna is not a valid database for esearch, so the command would not work even without the network error. See:

einfo --dbs

for valid databases.
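
To check a single name against that list, something like the following might do. This is only a sketch: is_valid_db is a hypothetical helper (not part of EDirect), and it assumes the list arrives with one database name per line.

```shell
# Hypothetical helper: succeed if the database name given as $1 appears
# as a whole line in the list read from stdin.
is_valid_db() {
    grep -qxF -- "$1"
}

# Intended use (requires EDirect and network access):
#   einfo -dbs | is_valid_db nuccore && echo "nuccore is searchable"
```

With the list of valid databases piped in, refseq_rna should fail this check and nuccore should pass.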

I would recommend looking at RNAcentral and getting the data from there:

http://rnacentral.org/


If you didn't have the network problem, this would be the error message:

ERROR in search output: Invalid db name specified: refseq_rna URL:
db=refseq_rna&term=mammalia%20%5BORGN%5D&retmax=0&usehistory=y

ERROR in fetch input: Invalid db name specified: refseq_rna
8.1 years ago
GenoMax 154k

The following seems to work:

esearch -db nuccore -query "mammalia [ORGN] AND srcdb refseq validated [PROP]" | efetch -format fasta > mammalian.fa
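
Before fetching a set this large, it may be worth checking how many records the query matches. A rough sketch, assuming esearch prints its usual ENTREZ_DIRECT XML summary containing a <Count> element; hit_count is a hypothetical helper, not an EDirect command.

```shell
# Hypothetical helper: print the number inside the first <Count> element
# of the XML read from stdin (assumed esearch summary format).
hit_count() {
    sed -n 's:.*<Count>\([0-9][0-9]*\)</Count>.*:\1:p' | head -n 1
}

# Intended use (requires EDirect and network access):
#   esearch -db nuccore -query "mammalia [ORGN] AND srcdb refseq validated [PROP]" | hit_count
```

If the count looks reasonable, pipe the same esearch into efetch as above.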