Hi all...
Following the psipred installation instructions, I installed the blast+ executables on the G:\ drive, along with the impala utilities from legacy blast (the makemat and copymat binaries) in the bin folder, and set the environment variable to the path where they were installed. I then installed and untarred psipred on the same drive.
When I started converting the runpsipred script to Python so that it works on a Windows 7 machine, I got stuck trying to understand what each Linux command in the script does. I spent a lot of time trying to grasp certain commands, but had no luck.
I wrote the following code. It is not complete yet because I'm getting errors, but my main problem is that the code does not see the psipred executables, although they are there along with the required database:
import os
import sys
print(os.getcwd())
os.chdir('G:\\psipred\\')
print(os.getcwd())
os.system('set dbname=`uniref90.fasta`')
os.system('set ncbidir=`G:\blast-2.7.1+\bin`')
os.system('set execdir=`G:\psipred\bin`')
os.system('set datadir=`G:\psipred\data`')
os.system('set basename=`test_seq`')
os.system('set rootname=`test_seq`')
os.system('set hostid=`hostid`')
print(os.system('set hostid =`hostid`'))
print(os.system('set tmproot=psitmp$$$hostid'))
os.system('copy -f test_seq.fasta $tmproot.fasta')
os.system('ncbidir/psiblast -b 0 -j 3 -h 0.001 -v 5000 -d dbname -i tmproot.fasta -C tmproot.chk tmproot.blast')
This gives the following errors:
C:\Users\Al-Hammad\Desktop\SQP-IRS
G:\psipred
0
0
The system cannot find the file specified.
'ncbidir' is not recognized as an internal or external command,
operable program or batch file.
but when I do this:
import os
import sys
os.system('blastdbcmd -db uniref90 -entry nm_000122 -outfmt "%f" -out test_query.txt')
os.system('blastn -query test_query.txt -db uniref90 -out output.txt')
print ("Done !!")
things work perfectly, which means the blast+ executables are there and working!
Can you please give me hints on how to convert these Linux commands into cmd commands for Windows? I'm not familiar with Linux at all and really need to get this working on my machine. Also, how can I make my Python script see the psipred bin and data folders globally without having to modify the environment variables set at installation?
I would be so grateful for any help.
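One possible way to point a script at specific tool folders without changing any system-wide environment variable is to resolve each executable to a full path inside Python. The sketch below uses shutil.which with an explicit search path. The helper name find_tool is hypothetical, and the demo looks up the running Python interpreter only so that the example runs anywhere; for the setup in this post the folders would be something like r"G:\psipred\bin" and r"G:\blast-2.7.1+\bin".

```python
import os
import shutil
import sys

def find_tool(name, dirs):
    """Return the full path of `name` if it is in one of `dirs`, else None.

    Only the given folders are searched; the system PATH is left untouched.
    """
    search_path = os.pathsep.join(dirs)
    return shutil.which(name, path=search_path)

# Demo with a program that certainly exists: the current Python interpreter.
python_dir = os.path.dirname(sys.executable)
python_name = os.path.basename(sys.executable)

found = find_tool(python_name, [python_dir])
missing = find_tool("definitely-not-a-real-tool", [python_dir])

print("found:", found)
print("missing:", missing)
```

The returned full path can then be used as the first element of the command passed to subprocess.run, which avoids relying on PATH lookups altogether.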
I'm no bioinformatics-on-windows pro, but basically your script makes a bunch of environment variables (which it then appears not to use), and makes some system calls, but the variable expansions are almost certainly wrong.
For example, the ncbidir line uses DOS backslashes for the file path, but when you make a system call via os.system, there is a forward slash (ncbidir/psiblast). I think there are a number of errors here, but that's one of the major ones.
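A small sketch of the path issue, under a few assumptions: in an ordinary Python string, \b is the backspace escape, so a path such as G:\blast-2.7.1+\bin is silently altered unless it is written as a raw string. In addition, every os.system call starts its own shell, so a variable created with set in one call no longer exists in the next one; keeping the values in Python variables sidesteps that. The file name psiblast.exe is an assumption about the Windows blast+ install, and ntpath is used only so the example behaves the same on any OS.

```python
import ntpath  # Windows path rules, importable on any operating system

# Raw string: backslashes are kept exactly as typed.
ncbidir = r"G:\blast-2.7.1+\bin"

# Ordinary string: each "\b" is turned into a backspace character (chr(8)).
broken = "G:\blast-2.7.1+\bin"

# Build the executable path in Python instead of relying on shell variables.
psiblast = ntpath.join(ncbidir, "psiblast.exe")  # file name is an assumption

print(psiblast)
print("backspace in broken path:", "\x08" in broken)
```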
If blastn is available on the command line without needing to specify a file path, as this os.system('blastn -query test_query.txt -db uniref90 -out output.txt') suggests, then you don't need the ncbidir variable at all AFAICT.
Ok... I've tried performing psiblast on a relatively small database just to check if it's working as expected:
but it gives the following error:
Now it's NOT recognizing my files. What am I doing wrong here?!
Does the file exist in the file system? Maybe the previous command generated an output with a different name/errored out?
Yes, the file is there in the path specified by the error message, but for some reason the command does not see it!
I've searched for the output that the command is supposed to generate but found nothing ... why on earth is psipred not available for Windows? That's really disappointing.
Maybe the file was not ready when the command executed? Or maybe the error message is wonky and the problem is that you need to index the file before you can use it?
I tried providing the full path for my executables and database, but it's still not working!
I mean, step-2 creates files and step-3 uses those files. Are you sure step-2 is done processing and the output files are ready before step-3 is executed?
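A minimal sketch of how one might check this from Python. Both os.system and subprocess.run wait for the command to exit before returning, so a missing file more likely means the earlier step failed than that it was still running. The helper name run_step is hypothetical, and a tiny Python child process stands in for the real step so the example runs without blast+ installed.

```python
import os
import subprocess
import sys
import tempfile

def run_step(cmd, expected_output):
    """Run one step, wait for it to exit, and confirm its output file exists."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"step failed with exit code {result.returncode}: {result.stderr.strip()}"
        )
    if not os.path.isfile(expected_output):
        raise FileNotFoundError(
            f"step finished but did not create {expected_output}"
        )
    return result

# Stand-in for a real pipeline step: a child process that writes one file.
workdir = tempfile.mkdtemp()
out_path = os.path.join(workdir, "step2.out")
writer = "import sys; open(sys.argv[1], 'w').write('ok')"

step = run_step([sys.executable, "-c", writer, out_path], out_path)
print("step 2 exit code:", step.returncode)
print("output ready:", os.path.isfile(out_path))
```

Checking the exit code and the captured stderr after each step usually shows which command actually failed, rather than the later one that only reports a missing file.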
Are you running the commands whilst being in the same directory as them?
Yes, all my files are in the same directory!
You could declare the variables in Python, then build the command as a string in Python and use just one os.system() call, no? Or create a config file that you read the variable values from, so you don't need to change the script each time you run it.
In its current form, this script abstracts nothing and accomplishes just adding another programming paradigm (Python) into the mix without leveraging any of what Python has to offer.
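A rough sketch of that idea, assuming a small INI-style config: the values are read with configparser and the whole psiblast command is assembled in Python as a list, ready to be handed to a single subprocess.run call. The section name, the keys, and the helper build_psiblast_cmd are hypothetical. The option names (-db, -query, -num_iterations, -inclusion_ethresh, -out_pssm, -out) are blast+ psiblast options as far as I know, which differ from the legacy blastpgp-style flags in the original script; it is worth confirming them with psiblast -help. The command is only built and printed here, not executed.

```python
import configparser
import ntpath  # Windows path rules, importable on any operating system

# Hypothetical config; in practice this would live in its own file.
CONFIG_TEXT = r"""
[psipred]
ncbidir = G:\blast-2.7.1+\bin
dbname = uniref90
iterations = 3
inclusion_ethresh = 0.001
"""

def build_psiblast_cmd(settings, query_fasta, rootname):
    """Assemble the psiblast command as a list of arguments."""
    exe = ntpath.join(settings["ncbidir"], "psiblast")
    return [
        exe,
        "-db", settings["dbname"],
        "-query", query_fasta,
        "-num_iterations", settings["iterations"],
        "-inclusion_ethresh", settings["inclusion_ethresh"],
        "-out_pssm", rootname + ".chk",
        "-out", rootname + ".blast",
    ]

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

cmd = build_psiblast_cmd(parser["psipred"], "test_seq.fasta", "test_seq")
print(cmd)
```

Passing the command as a list avoids shell quoting problems, and changing the database or the install folder then only means editing the config, not the script.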
Yes you're right, it's not making use of any python facilities yet ... I'm working on obtaining psipred results in order to be further plotted for specific purposes using some python modules. Sorry if that causes any inconvenience.
Why would it cause me any inconvenience? It's making your life difficult, that's all.
Can't you install the Windows Subsystem for Linux? Then you could follow the regular linux install instructions, without having to port the runpsipred script, and without having to leave windows.
That's a good solution, though I must write a script to run psipred from a Windows OS... why is it not supported, why?! 😭
Why? If someone insists, find out why they insist on this, because running bioinformatics on vanilla Windows is not something anyone that knows bioinformatics would insist on.
You are absolutely right regarding this ... I thought it would be an easy task to do so but apparently it seems impossible with the world heading towards open source OS. I guess I'm going to break the contract and surrender 👐
Simplistic explanation:
Because, historically, big servers run on UNIX and its variants, which are POSIX-compliant. Thus, to a certain extent, it is easy to port between them. Then Linux, which is also POSIX-compliant, came along and took over. Being free, it can be installed at no or very low cost on any computer, from laptops to servers to clusters. That, in turn, makes it easy to develop and test new software on a small computer and be quite sure it will run the same way on a big server. This leads to lots of people developing on Linux and Mac OS X (as Mac OS X is POSIX-compliant).
Windows never wanted any of this; it wanted to be easy and different from other platforms, so as to lock in users. Which means, if you want a Windows PsiPred, you will have to bite the bullet and port it yourself.
It is now macOS :-)
Thank you for the detailed explanation, that was really convincing ... I guess they succeeded in locking users like me in for nearly 10 years, but it's time to move to Linux and the open source OS community, as I LOVE bioinformatics and am pursuing my PhD in this interesting field. So, rather than bite the bullet, I must concentrate my efforts on conducting useful and meaningful research instead of reinventing the wheel. Goodbye Windows 👋
Welcome to the world of Linux. It is impressively powerful, but the cliche stands - with great power comes great responsibility.