PSIPRED, how to install and make it work?
3.8 years ago
ac.research ▴ 10

Dear All,

I am trying to set up PSIPRED to run a secondary structure prediction, but the README.md file is not detailed enough for a beginner like me. I am trying to use it with BLAST+.

I am on Ubuntu 17.04, and below are the steps I took to try to set up PSIPRED. Some programs appear to be missing, and I am not sure how to install them or where to find them; some commands are missing as well:

sudo apt install ncbi-blast+                                                        #OK:        Install BLAST+
tar xzvf psipred.4.01.tar.gz                                                        #OK:        Extract the latest PSIPRED
cd psipred/src                                                                      #OK:        Go to source code directory
make                                                                                #OK:        Compile PSIPRED binaries
make install                                                                        #OK:        Install the compiled PSIPRED binaries
cd ..                                                                               #OK:        Go back to PSIPRED top level directory
gunzip -v uniref90.fasta.gz                                                         #OK:        Uncompress the latest uniref90 database
bin/pfilt uniref90.fasta > uniref90filt                                             #ERROR HERE:    pfilt not found. Cannot find where the pfilt program is and cannot find from where to install it from
formatdb -t uniref90filt -i uniref90filt                                            #ERROR HERE:    formatdb not found. Cannot find where the formatdb program is and cannot find from where to install it from
makeblastdb -dbtype prot -in uniref90filt -out uniref90filt                         #OK:        Not sure what this does but it works fine
./BLAST+/runpsipredplus example/example.fasta                                       #ERROR HERE:    /usr/local/bin/psiblast: Command not found. FATAL: Error whilst running blastpgp - script terminated!


How can I get PSIPRED to work? What am I doing wrong? Where can I get all the external programs PSIPRED needs? What am I missing?
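
A note on the last error in the listing: the message shows the script looking for psiblast under /usr/local/bin, whereas a package-manager install may put it elsewhere. The sketch below only reports where psiblast actually lives. `find_tool_dir` is a hypothetical helper name, not part of PSIPRED or BLAST+. If the directory differs, the path setting inside BLAST+/runpsipredplus would presumably need to match it; I have not checked how that setting is named in 4.01, so treat that as an assumption.

```shell
#!/bin/sh
# Hypothetical helper: print the directory that holds an executable found on PATH.
# Returns non-zero if the tool cannot be found.
find_tool_dir() {
    tool_path=$(command -v "$1") || return 1
    dirname "$tool_path"
}

# Report where psiblast is installed, if it is installed at all.
if dir=$(find_tool_dir psiblast); then
    echo "psiblast found in: $dir"
else
    echo "psiblast is not on PATH; check that ncbi-blast+ is installed" >&2
fi
```

If this prints a directory other than /usr/local/bin, that mismatch would explain the "Command not found" message.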

psipred blast software error • 4.5k views

I followed these instructions, but they are 4 years old and it seems there have been major changes to PSIPRED as well as to other programs, so the steps in that forum entry no longer work.


As a side note: installation troubleshooting that comes down to dependencies and standard errors is better handled on Stack Overflow! You might find the help you need here, but this forum is about problems with data interpretation and program usage :)

3.8 years ago
fishgolden ▴ 450

I found the following passage in the README:

"As of PSIPRED V4.0 onwards, we no longer believe it is necessary for the sequence data banks used with PSI-BLAST to be filtered to remove low-complexity regions, transmembrane regions, and coiled-coil segments. The search data bank can therefore be any large non-redundant protein sequence data bank, with UNIREF90 (http://www.uniprot.org/help/uniref) being the recommended one."

So I think the pfilt step is not needed.


I think the problem I am facing is with the database setup. These are the steps I follow to set up the UniRef90 database (slightly changed from the ones at the top, and without pfilt, as you advised):

wget ftp://ftp.uniprot.org/pub/databases/uniprot/uniref/uniref90/uniref90.fasta.gz
gunzip -v uniref90.fasta.gz
makeblastdb -in uniref90.fasta -dbtype prot -out uniref90.fasta -hash_index


But the setup does not complete and I get the following error:

Building a new DB, current time: 09/08/2017 10:54:39
New DB name:   /home/acresearch/Desktop/db/uniref90.fasta
New DB title:  uniref90.fasta
Sequence type: Protein
Keep MBits: T
Maximum file size: 1000000000B

volume: /home/acresearch/Desktop/db/uniref90.fasta.00
volume: /home/acresearch/Desktop/db/uniref90.fasta.01
volume: /home/acresearch/Desktop/db/uniref90.fasta.02

file: /home/acresearch/Desktop/db/uniref90.fasta.00.pin
file: /home/acresearch/Desktop/db/uniref90.fasta.00.phr
file: /home/acresearch/Desktop/db/uniref90.fasta.00.psq
file: /home/acresearch/Desktop/db/uniref90.fasta.00.psi
file: /home/acresearch/Desktop/db/uniref90.fasta.00.psd
file: /home/acresearch/Desktop/db/uniref90.fasta.00.phi
file: /home/acresearch/Desktop/db/uniref90.fasta.00.phd
file: /home/acresearch/Desktop/db/uniref90.fasta.00.pog
file: /home/acresearch/Desktop/db/uniref90.fasta.01.pin
file: /home/acresearch/Desktop/db/uniref90.fasta.01.phr
file: /home/acresearch/Desktop/db/uniref90.fasta.01.psq
file: /home/acresearch/Desktop/db/uniref90.fasta.01.psi
file: /home/acresearch/Desktop/db/uniref90.fasta.01.psd
file: /home/acresearch/Desktop/db/uniref90.fasta.01.phi
file: /home/acresearch/Desktop/db/uniref90.fasta.01.phd
file: /home/acresearch/Desktop/db/uniref90.fasta.01.pog
file: /home/acresearch/Desktop/db/uniref90.fasta.02.pin
file: /home/acresearch/Desktop/db/uniref90.fasta.02.phr
file: /home/acresearch/Desktop/db/uniref90.fasta.02.psq
file: /home/acresearch/Desktop/db/uniref90.fasta.02.psi
file: /home/acresearch/Desktop/db/uniref90.fasta.02.psd
file: /home/acresearch/Desktop/db/uniref90.fasta.02.phi
file: /home/acresearch/Desktop/db/uniref90.fasta.02.phd
file: /home/acresearch/Desktop/db/uniref90.fasta.02.pog
file: /home/acresearch/Desktop/db/uniref90.fasta.pal

BLAST Database creation error: Error: Duplicate seq_ids are found:
GNL|BL_ORD_ID:2267708


I get this same error whether I use pfilt or I don't. I am not sure what to do in this situation.
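
One thing that can be ruled out quickly is a repeated identifier in the FASTA headers themselves. The sketch below lists any identifier (the first word after `>`) that occurs more than once. `fasta_duplicate_ids` is a hypothetical helper name, and this is only a diagnostic: the thread does not establish what caused the error, and the ID in the message looks like one assigned by BLAST rather than one taken from the file.

```shell
#!/bin/sh
# Hypothetical helper: print every FASTA identifier that appears more than once.
# The identifier is taken to be the first whitespace-delimited word after '>'.
fasta_duplicate_ids() {
    grep '^>' "$1" | awk '{ print substr($1, 2) }' | sort | uniq -d
}

# Show the first few duplicates, if the database file is present.
if [ -f uniref90.fasta ]; then
    fasta_duplicate_ids uniref90.fasta | head
fi
```

No output means no header identifier is repeated. On a file the size of UniRef90 the sort step needs a fair amount of time and temporary disk space.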


I changed the commands to the following:

wget ftp://ftp.uniprot.org/pub/databases/uniprot/uniref/uniref90/uniref90.fasta.gz
gunzip -v uniref90.fasta.gz
makeblastdb -in uniref90.fasta -dbtype prot -input_type fasta -out uniref90.fasta


And the database seems to set up OK (no errors printed). But when I test whether I can use it with the following command:

blastp -db ./uniref90.fasta -query ./structure.fasta


I get the following error:

BLAST Database error: No alias or index file found for protein database [./uniref90.fasta] in search path [/home/acresearch/Desktop/db::]


I am testing the database with blastp because I know it is installed correctly through sudo apt install ncbi-blast+ (so the problem is not with the PSIPRED download and compilation but with the database setup itself; correct me if I am wrong). I do not understand why blastp and PSIPRED cannot find the database even though I am pointing them to the correct path (/home/acresearch/Desktop/db/uniref90.fasta) and running the command from within the directory where the database was set up (hence ./uniref90.fasta). I have searched many forums for this particular problem, and they all seem to use the same commands as mine, but it works for them and not for me. I am not sure what I am doing wrong.


I couldn't find any problems in your commands, so I have no idea what is wrong. Possibly a typo? Check:

ls -l /home/acresearch/Desktop/db/uniref90.fasta.pal
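
The same check can be made a little more general. A multi-volume protein database, like the one in the earlier makeblastdb output, gets an alias file ending in .pal, while a single-volume one has a .pin file directly under the base name. The sketch below tests for either. `blast_protein_db_exists` is a hypothetical helper name, not a BLAST+ command.

```shell
#!/bin/sh
# Hypothetical helper: succeed if BLAST protein database files exist for a base name.
# Looks for the alias file (.pal, multi-volume) or the index file (.pin, single volume).
blast_protein_db_exists() {
    db=$1
    [ -f "$db.pal" ] || [ -f "$db.pin" ]
}

# Base name as passed to -out in makeblastdb and to -db in blastp.
db=/home/acresearch/Desktop/db/uniref90.fasta
if blast_protein_db_exists "$db"; then
    echo "database files found for $db"
else
    echo "no .pal or .pin file for $db; the -out name may not match" >&2
fi
```

If neither file exists, the name given to -out when building the database probably differs from the name given to -db. When the files are there, `blastdbcmd -db uniref90.fasta -info` is another way to confirm that BLAST+ can open the database.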


It works now. I had an issue with the makeblastdb command (a typo, as you said).

Thank you very much.


Hello :) May I know the commands you ended up using to run PSIPRED?