I am trying to retrieve the FASTA files of contigs/scaffolds from NCBI using EDirect (esearch). I have done the following:
1. Install prerequisites: EDirect requires perl and ncbi-blast+ packages. Install them using:
sudo apt install perl ncbi-blast+ -y
2. Install EDirect
cd ~
wget https://ftp.ncbi.nlm.nih.gov/entrez/entrezdirect/edirect.tar.gz
tar -xzf edirect.tar.gz
This put the EDirect scripts, including esearch, elink and efetch, in ~/edirect under my home directory.
I was advised to also add this directory to the PATH, so I ran the following line from the directory that contains my bash script:
export PATH=~/edirect:$PATH
Then I tried to fetch the nucleotide sequences using:
esearch -db biosample -query "SAMN32016926" | \ elink -target nuccore | \ efetch -format fasta > contigs.fasta
An error popped up saying:
Command ' elink' not found, did you mean:
command 'elink' from deb ncbi-entrez-direct (12.0.20190816+ds-1ubuntu0.2)
Try: sudo apt install <deb name>
Command ' efetch' not found, did you mean:
command 'efetch' from deb ncbi-entrez-direct (12.0.20190816+ds-1ubuntu0.2)
command 'efetch' from deb acedb-other (4.9.39+dfsg.02-4build1)
Try: sudo apt install <deb name>
curl: (77) error setting certificate verify locations: CAfile: /home/msamir/edirect/cacert.pem CApath: none
ERROR: curl command failed ( Tue 25 Jun 01:40:50 BST 2024 ) with: 77
-X POST https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi -d retmax=0&usehistory=y&db=biosample&term=SAMN32016926&tool=edirect&edirect=22.2&edirect_os=Linux&email=msamir%40Msamir
WARNING: FAILURE ( Tue 25 Jun 01:40:49 BST 2024 )
nquire -url https://eutils.ncbi.nlm.nih.gov/entrez/eutils/ esearch.fcgi -retmax 0 -usehistory y -db biosample -term SAMN32016926 -tool edirect -edirect 22.2 -edirect_os Linux -email msamir@Msamir
EMPTY RESULT
SECOND ATTEMPT
curl: (77) error setting certificate verify locations: CAfile: /home/msamir/edirect/cacert.pem CApath: none
ERROR: curl command failed ( Tue 25 Jun 01:40:52 BST 2024 ) with: 77
-X POST https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi -d retmax=0&usehistory=y&db=biosample&term=SAMN32016926&tool=edirect&edirect=22.2&edirect_os=Linux&email=msamir%40Msamir
WARNING: FAILURE ( Tue 25 Jun 01:40:51 BST 2024 )
nquire -url https://eutils.ncbi.nlm.nih.gov/entrez/eutils/ esearch.fcgi -retmax 0 -usehistory y -db biosample -term SAMN32016926 -tool edirect -edirect 22.2 -edirect_os Linux -email msamir@Msamir
EMPTY RESULT
LAST ATTEMPT
curl: (77) error setting certificate verify locations: CAfile: /home/msamir/edirect/cacert.pem CApath: none
ERROR: curl command failed ( Tue 25 Jun 01:40:54 BST 2024 ) with: 77
-X POST https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi -d retmax=0&usehistory=y&db=biosample&term=SAMN32016926&tool=edirect&edirect=22.2&edirect_os=Linux&email=msamir%40Msamir
ERROR: FAILURE ( Tue 25 Jun 01:40:53 BST 2024 )
nquire -url https://eutils.ncbi.nlm.nih.gov/entrez/eutils/ esearch.fcgi -retmax 0 -usehistory y -db biosample -term SAMN32016926 -tool edirect -edirect 22.2 -edirect_os Linux -email msamir@Msamir
EMPTY RESULT
QUERY FAILURE
curl: (77) error setting certificate verify locations: CAfile: /home/msamir/edirect/cacert.pem CApath: none
ERROR: curl command failed ( Tue 25 Jun 01:40:56 BST 2024 ) with: 77
-X POST https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi -d retmax=0&usehistory=y&db=biosample&term=SAMN32016926&tool=edirect&edirect=22.2&edirect_os=Linux&email=msamir%40Msamir
I am not sure what the problem is. I also installed the "ncbi-entrez-direct" package by running
sudo apt install ncbi-entrez-direct
but it did not change anything!
Any advice?
Thanks
You can't get sequence data from a BioSample using Entrez Direct here. There is no assembly associated with this sample.
You will need to download the SRA data and work on assembling it yourself.
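As a rough sketch of how to check this yourself, the two helper functions below (hypothetical names, not part of EDirect) ask NCBI which SRA runs and how many assembly records are linked to a BioSample. It assumes esearch, elink, efetch and xtract from EDirect are on your PATH and that the certificate error in your log has been sorted out. It also shows the pipeline with the backslash as the last character of each continued line: in your one-line command, "\ elink" is read by bash as a command whose name starts with a space, which is why the log says Command ' elink' not found.

```shell
#!/usr/bin/env bash
# Sketch only: assumes EDirect (esearch, elink, efetch, xtract) is on PATH.
# The function names are made up for this example.

# Print the run table (CSV) of SRA runs linked to a BioSample accession.
# Each backslash must be the LAST character on its line; a backslash
# followed by a space escapes the space instead of continuing the line.
biosample_runs() {
    esearch -db biosample -query "$1" \
        | elink -target sra \
        | efetch -format runinfo
}

# Print how many assembly records are linked to the same accession.
biosample_assembly_count() {
    esearch -db biosample -query "$1" \
        | elink -target assembly \
        | xtract -pattern ENTREZ_DIRECT -element Count
}
```

For example, biosample_runs SAMN32016926 > runinfo.csv should give you the run accessions, if any are linked, which you can then download (for instance with the SRA Toolkit) and assemble.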