efetch problem on ARM Mac
2.2 years ago
frabomba6 ▴ 40

Hi, I'm currently working through the Biostar Handbook and I'm encountering a problem with the efetch command in the bioinfo conda environment.

If I run the command:

esearch -db protein -query PRJNA257197 | efetch -format fasta > ref/prots_2014.fa

I get various errors like:

curl: (22) The requested URL returned error: 400
 ERROR:  curl command failed ( Fri Apr  7 16:01:44 CEST 2023 ) with: 22
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi -d db=protein&id=Unable%2Cto%2Clocate%2Cxtract%2Cexecutable.%2CPlease%2Cexecute%2Cthe%2Cfollowing%2Cnquire%2Cdwn%2Cftp.ncbi.nlm.nih.gov%2Centrez%2Centrezdirect%2Cxtract.Silicon.gz%2Cgunzip%2Cf%2Cxtract.Silicon.gz%2Cchmod%2Cx%2Cxtract.Silicon&rettype=fasta&retmode=text&tool=edirect&edirect=16.2&edirect_os=Darwin&email=bomba%40MBPfrabomba6.local
HTTP/1.1 400 Bad Request

What I've done so far to correct this behaviour:

  • upgraded/updated the environment
  • removed and reinstalled entrez-direct

I've also installed entrez-direct outside of conda using the command provided in the official instructions. In that case I don't get any errors and it works flawlessly.

efetch entrez-direct conda • 2.4k views

You need the xtract executable, like the error mentions. What does which xtract give you within the conda env?

~/miniconda3/envs/bioinfo/bin/xtract

But your comment gave me an idea about the problem: it seems that I'm missing a specific version of xtract, namely xtract.Silicon (which should be the version for ARM Macs).
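
To see which xtract variants are actually installed next to the wrapper, something like the following can help. It is only a sketch: list_xtract_variants is a hypothetical helper name, not an EDirect command, and it assumes the platform-specific binaries live in the same directory as xtract.

```shell
# Hypothetical helper: list every file whose name starts with "xtract"
# in the directory given as $1.
list_xtract_variants() {
    ls -1 "$1" | grep '^xtract'
}

# Locate the xtract wrapper on PATH, if there is one.
xtract_path="$(command -v xtract || true)"

if [ -n "$xtract_path" ]; then
    list_xtract_variants "$(dirname "$xtract_path")"
else
    echo "xtract not found on PATH"
fi
```

If the list shows xtract and xtract.Darwin but no xtract.Silicon, that matches the situation described here.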

2.2 years ago
frabomba6 ▴ 40

I'm posting an answer as the final solution to my problem.

Description of the problem

After Ram's comment I looked into xtract. In my conda env bin folder, ~/miniconda3/envs/bioinfo/bin/, xtract was present in two "versions": xtract and xtract.Darwin.

On a non-ARM Mac the first command should point to the second.

Running xtract on an ARM Mac makes the command look for xtract.Silicon (the name used for ARM, i.e. Apple Silicon, Macs), which does not seem to be installed along with the rest of the environment.

Fix

To fix this problem I ran the following commands:

# Move to bioinfo bin folder
cd ~/miniconda3/envs/bioinfo/bin/

# Download xtract.Silicon
nquire -dwn ftp.ncbi.nlm.nih.gov entrez/entrezdirect xtract.Silicon.gz

# Extract the executable
gunzip -f xtract.Silicon.gz

# Make it executable
chmod +x xtract.Silicon

This downloads and installs xtract.Silicon, which the default xtract command already points to.
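
To confirm the fix took effect, a quick check along these lines may be useful. It is a sketch under assumptions: check_xtract_binary is a hypothetical helper, the path is the one used in this thread, and the XML passed to xtract is made-up sample input.

```shell
# Hypothetical helper: report whether the file given as $1 exists
# and has its executable bit set.
check_xtract_binary() {
    if [ -x "$1" ]; then
        echo "ok: $1 is executable"
    else
        echo "missing or not executable: $1"
        return 1
    fi
}

# Path taken from this thread; adjust it to your own environment.
bin_dir="$HOME/miniconda3/envs/bioinfo/bin"
check_xtract_binary "$bin_dir/xtract.Silicon" || true

# Smoke test: if xtract is on PATH, feed it a tiny made-up XML record.
# A working xtract prints the extracted element instead of an error.
if command -v xtract >/dev/null 2>&1; then
    echo '<Rec><Id>42</Id></Rec>' | xtract -pattern Rec -element Id
fi
```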


Thank you for adding your own answer - this will be super helpful to users that switch to ARM machines. It may also help if you look into the conda recipe to see if any changes need to be made - and alert the maintainer.

Also, on a personal note, TIL about nquire, so thank you for that


Ram, I spotted a problem in the script http://data.biostarhandbook.com/install.sh

Can you point me to the maintainer, or to the right place to report the problem?


The Biostar Handbook tracks issues here: https://github.com/biostars/biostar-handbook/issues

I looked into bioconda::entrez-direct, but unfortunately there is no maintainer listed there, so it might be worth searching GitHub for the right repository.


The problem is not in entrez-direct itself: on Mac, miniconda3 is always installed in its x86_64 build, even on ARM machines (ARM != x86_64).

ARM Macs can run it through a piece of software called Rosetta 2, but then all the libraries are installed for that same x86_64 architecture, and this causes the wrong version of xtract to be installed.

I'm opening an issue on GitHub so hopefully they are gonna fix it soon.

Sorry for the long comment but I wanted to be complete


Yeah, the install script does not check for ARM architecture but this is not a problem for most software so changing the architecture for the miniconda being installed might not be the solution. This needs to be addressed at the recipe level.

I dug deeper into the conda recipe and it looks like the recipe uses the repo https://github.com/biostars/conda-ready-entrez-direct so that might be the place to open an issue. However, it looks like prepare.sh and install.sh already address this issue - prepare downloads all xtracts and install uses uname -m and uname -s to pick the right xtract. It does not depend on miniconda's platform at all.
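
The selection logic described above can be sketched roughly as follows. This is not the actual code from prepare.sh or install.sh: pick_xtract is a hypothetical function, and only the two macOS names mentioned in this thread (xtract.Darwin and xtract.Silicon) are covered.

```shell
# Hypothetical sketch of choosing a platform-specific xtract binary.
pick_xtract() {
    # $1: kernel name as printed by `uname -s`
    # $2: machine type as printed by `uname -m`
    case "$1-$2" in
        Darwin-arm64)  echo "xtract.Silicon" ;;
        Darwin-x86_64) echo "xtract.Darwin" ;;
        *)             echo "unknown" ;;
    esac
}

# Apply it to the machine (and shell) this is run on.
pick_xtract "$(uname -s)" "$(uname -m)"
```

Note that a check like this depends on what uname -m reports in the current shell. In a shell running under Rosetta 2, uname -m reports x86_64, so the sketch would choose xtract.Darwin even on an ARM Mac.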

What version of entrez-direct are you installing and from which channel?


entrez-direct was installed using the install.sh provided with the Biostar Handbook, which at some point runs the ftp://ftp.ncbi.nlm.nih.gov/entrez/entrezdirect/install-edirect.sh script. As you said, that script should check both the OS and the architecture of the machine.

But on Mac, if for some reason a line of the script causes bash to run in emulation mode, then uname -m will return x86_64. In the image I manually triggered this behaviour.
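
For anyone who wants to check whether their current shell is being emulated, here is a minimal sketch. It assumes macOS's sysctl.proc_translated key, which reports 1 for a process translated by Rosetta 2; describe_arch is a hypothetical helper name.

```shell
# Hypothetical helper: turn the two raw values into a readable summary.
describe_arch() {
    # $1: machine type as printed by `uname -m`
    # $2: value of sysctl.proc_translated (1 means translated by Rosetta 2)
    if [ "$2" = "1" ]; then
        echo "$1 under Rosetta 2 emulation"
    else
        echo "native $1"
    fi
}

# The key is missing on systems without Rosetta 2, so fall back to 0.
translated="$(sysctl -n sysctl.proc_translated 2>/dev/null || echo 0)"

describe_arch "$(uname -m)" "$translated"
```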

In any case I'm opening an issue on https://github.com/biostars/biostar-handbook/issues, since adding a few lines of code to the script would make it take ARM architectures into account as well.

[Image: Emulation]


Update

The previous comments are outdated, since the problem turned out to be a different one.

More info here https://github.com/biostars/biostar-handbook/issues/260
