Question: PSIPRED, how to install and make it work?
18 months ago, ac.research (10) wrote:

Dear All,

I am trying to set up PSIPRED to run a secondary structure prediction, but the README.md file is not detailed enough for a beginner like me to set up and run PSIPRED. I am trying to use it with BLAST+.

I am on Ubuntu Linux 17.04, and the following are the steps I took to try to set up PSIPRED. Some programs seem to be missing, and I am not sure where to find them or how to install them; some commands are missing as well:

sudo apt install ncbi-blast+                                                        #OK:        Install BLAST+
wget http://bioinfadmin.cs.ucl.ac.uk/downloads/psipred/psipred.4.01.tar.gz          #OK:        Get the latest PSIPRED
tar xzvf psipred.4.01.tar.gz                                                        #OK:        Extract the latest PSIPRED
rm psipred.4.01.tar.gz                                                              #OK:        Remove the latest PSIPRED .tar.gz download
cd psipred/src                                                                      #OK:        Go to source code directory
make                                                                                #OK:        Compile PSIPRED binaries
make install                                                                        #OK:        Install the compiled PSIPRED binaries
cd ..                                                                               #OK:        Go back to PSIPRED top level directory
wget ftp://ftp.uniprot.org/pub/databases/uniprot/uniref/uniref90/uniref90.fasta.gz  #OK:        Download the latest uniref90 database
gunzip -v uniref90.fasta.gz                                                         #OK:        Uncompress the latest uniref90 database
bin/pfilt uniref90.fasta > uniref90filt                                             #ERROR HERE:    pfilt not found. Cannot find where the pfilt program is and cannot find from where to install it from
formatdb -t uniref90filt -i uniref90filt                                            #ERROR HERE:    formatdb not found. Cannot find where the formatdb program is and cannot find from where to install it from
makeblastdb -dbtype prot -in uniref90filt -out uniref90filt                         #OK:        Not sure what this does but it works fine
./BLAST+/runpsipredplus example/example.fasta                                       #ERROR HERE:    /usr/local/bin/psiblast: Command not found. FATAL: Error whilst running blastpgp - script terminated!

How can I get PSIPRED to work? What am I doing wrong? Where can I get all the external programs needed to make PSIPRED work? What am I missing?
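The "/usr/local/bin/psiblast: Command not found" error suggests that the runpsipredplus script looks for BLAST+ in /usr/local/bin, while a package-manager install usually puts psiblast somewhere else (find out where with `command -v psiblast`). The sketch below assumes the script holds a tcsh line of the form `set ncbidir = /usr/local/bin`; check your copy before relying on it. The helper name `set_ncbidir` is hypothetical, and the demo only patches a throwaway file.

```shell
# Hypothetical helper: rewrite the "set ncbidir = ..." line of a
# runpsipredplus-style script so it points at the given directory.
# ASSUMPTION: the script contains a line starting with "set ncbidir = ".
set_ncbidir() {
    script="$1"
    dir="$2"
    # -i.bak edits in place and keeps a backup copy next to the script
    sed -i.bak "s|^set ncbidir = .*|set ncbidir = ${dir}|" "$script"
}

# Demo on a throwaway file that mimics the two relevant settings
demo=$(mktemp)
printf 'set dbname = uniref90filt\nset ncbidir = /usr/local/bin\n' > "$demo"
set_ncbidir "$demo" /usr/bin
patched_line=$(grep '^set ncbidir' "$demo")
echo "$patched_line"
```

On a real install this would be called as `set_ncbidir BLAST+/runpsipredplus "$(dirname "$(command -v psiblast)")"`. If the script also has a `set dbname = ...` line, it presumably has to match the name given to `makeblastdb -out`.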

blast psipred software error • 1.6k views
modified 18 months ago by fishgolden (360) • written 18 months ago by ac.research (10)

Duplicate of "Trying To Run Psipred But Failing To Use Blastpgp".

written 18 months ago by Pierre Lindenbaum (118k)

I followed these instructions, but they are 4 years old, and it seems there have been major changes to PSIPRED as well as to other programs. Hence the steps in that forum entry do not work anymore.

written 18 months ago by ac.research (10)

As a side note: installation troubleshooting that comes down to dependencies and standard errors is better handled on Stack Overflow! You might find the help you need here, but this forum is about problems with data interpretation and program usage :)

written 18 months ago by Macspider (2.8k)

18 months ago, fishgolden (360) wrote:

pfilt is a program that masks low-complexity regions (and other regions that disturb a PSI-BLAST search) in protein sequences. I think PSIPRED 2 or 3 included it, but I could not find it in the version 4 I just downloaded.

I found the following in the README:

"As of PSIPRED V4.0 onwards, we no longer believe it is necessary for the sequence data banks used with PSI-BLAST to be filtered to remove low-complexity regions, transmembrane regions, and coiled-coil segments. The search data bank can therefore be any large non-redundant protein sequence data bank, with UNIREF90 (http://www.uniprot.org/help/uniref) being the recommended one."

Thus I think the pfilt step is not needed.
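With pfilt dropped, the formatdb line can go as well: formatdb belongs to the legacy NCBI BLAST package, and makeblastdb is its BLAST+ counterpart, so only one of the two is needed. A minimal sketch of the remaining database step follows. The wrapper `build_db` and the `DRY_RUN` switch are hypothetical names for illustration, and the output name `uniref90` is just an example; the makeblastdb options are the ones already used in the question.

```shell
# Hypothetical wrapper around the single database-building step.
# With DRY_RUN=1 (the default here) it only prints the command,
# so the sketch can be tried without BLAST+ or the large download.
build_db() {
    fasta="$1"   # unfiltered FASTA file, e.g. uniref90.fasta
    name="$2"    # database name passed to -out
    cmd="makeblastdb -dbtype prot -in $fasta -out $name"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

plan=$(DRY_RUN=1 build_db uniref90.fasta uniref90)
echo "$plan"
```

Running it with `DRY_RUN=0` executes the command for real, which takes a long time on a file the size of UniRef90.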

modified 18 months ago • written 18 months ago by fishgolden (360)

Thank you for your reply.

I think the problem I am facing is with the database setup. These are the steps I follow to set up the UniRef90 database (slightly changed from the ones above, and without pfilt, as you advised):

wget ftp://ftp.uniprot.org/pub/databases/uniprot/uniref/uniref90/uniref90.fasta.gz
gunzip -v uniref90.fasta.gz
makeblastdb -in uniref90.fasta -dbtype prot -out uniref90.fasta -hash_index

But the setup does not complete and I get the following error:

Building a new DB, current time: 09/08/2017 10:54:39
New DB name:   /home/acresearch/Desktop/db/uniref90.fasta
New DB title:  uniref90.fasta
Sequence type: Protein
Keep MBits: T
Maximum file size: 1000000000B

volume: /home/acresearch/Desktop/db/uniref90.fasta.00
volume: /home/acresearch/Desktop/db/uniref90.fasta.01
volume: /home/acresearch/Desktop/db/uniref90.fasta.02

file: /home/acresearch/Desktop/db/uniref90.fasta.00.pin
file: /home/acresearch/Desktop/db/uniref90.fasta.00.phr
file: /home/acresearch/Desktop/db/uniref90.fasta.00.psq
file: /home/acresearch/Desktop/db/uniref90.fasta.00.psi
file: /home/acresearch/Desktop/db/uniref90.fasta.00.psd
file: /home/acresearch/Desktop/db/uniref90.fasta.00.phi
file: /home/acresearch/Desktop/db/uniref90.fasta.00.phd
file: /home/acresearch/Desktop/db/uniref90.fasta.00.pog
file: /home/acresearch/Desktop/db/uniref90.fasta.01.pin
file: /home/acresearch/Desktop/db/uniref90.fasta.01.phr
file: /home/acresearch/Desktop/db/uniref90.fasta.01.psq
file: /home/acresearch/Desktop/db/uniref90.fasta.01.psi
file: /home/acresearch/Desktop/db/uniref90.fasta.01.psd
file: /home/acresearch/Desktop/db/uniref90.fasta.01.phi
file: /home/acresearch/Desktop/db/uniref90.fasta.01.phd
file: /home/acresearch/Desktop/db/uniref90.fasta.01.pog
file: /home/acresearch/Desktop/db/uniref90.fasta.02.pin
file: /home/acresearch/Desktop/db/uniref90.fasta.02.phr
file: /home/acresearch/Desktop/db/uniref90.fasta.02.psq
file: /home/acresearch/Desktop/db/uniref90.fasta.02.psi
file: /home/acresearch/Desktop/db/uniref90.fasta.02.psd
file: /home/acresearch/Desktop/db/uniref90.fasta.02.phi
file: /home/acresearch/Desktop/db/uniref90.fasta.02.phd
file: /home/acresearch/Desktop/db/uniref90.fasta.02.pog
file: /home/acresearch/Desktop/db/uniref90.fasta.pal

BLAST Database creation error: Error: Duplicate seq_ids are found: 
GNL|BL_ORD_ID:2267708

I get this same error whether I use pfilt or I don't. I am not sure what to do in this situation.
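A note on reading this error: GNL|BL_ORD_ID identifiers are ordinal IDs that makeblastdb assigns by itself when it is not parsing the FASTA headers, so the duplicate reported here may point to the makeblastdb options (this run added -hash_index) rather than to the input file. To rule out genuinely repeated identifiers in the FASTA anyway, a small awk check like the sketch below can help. The function name `duplicate_fasta_ids` is hypothetical, and on a file as large as UniRef90 the in-memory table of IDs will need a lot of RAM.

```shell
# Print every FASTA identifier (first word after '>') that occurs more
# than once; each duplicated identifier is reported a single time.
duplicate_fasta_ids() {
    awk '/^>/ { id = substr($1, 2); if (seen[id]++ == 1) print id }' "$1"
}

# Demo on a tiny throwaway FASTA file in which "A" appears twice
demo_fasta=$(mktemp)
printf '>A first copy\nMKV\n>B\nMLL\n>A second copy\nMKK\n' > "$demo_fasta"
dups=$(duplicate_fasta_ids "$demo_fasta")
echo "$dups"
rm -f "$demo_fasta"
```

An empty result means the headers themselves are unique.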

written 18 months ago by ac.research (10)

I changed the commands to the following:

wget ftp://ftp.uniprot.org/pub/databases/uniprot/uniref/uniref90/uniref90.fasta.gz
gunzip -v uniref90.fasta.gz
makeblastdb -in uniref90.fasta -dbtype prot -input_type fasta -out uniref90.fasta

And the database seems to be set up OK (no errors printed). But when I test whether I can use it with the following command:

blastp -db ./uniref90.fasta -query ./structure.fasta

I get the following error:

BLAST Database error: No alias or index file found for protein database [./uniref90.fasta] in search path [/home/acresearch/Desktop/db::]

I am testing the database with blastp because I know it is installed correctly through sudo apt install ncbi-blast+ (so the problem is not with the PSIPRED download and compilation but with the database setup itself; correct me if I am wrong). I do not understand why blastp and PSIPRED cannot find the database even though I am pointing them to the correct path (/home/acresearch/Desktop/db/uniref90.fasta) and I am running the command from within the directory where the database was set up (hence ./uniref90.fasta). I have searched many forums about this particular problem, and they all seem to use the same commands as I do, but it works for them and not for me. I am not sure what I am doing wrong.

written 18 months ago by ac.research (10)

I couldn't find any problems in your commands, so I have no idea what is wrong. Possibly a typo? Check:

ls -l /home/acresearch/Desktop/db/uniref90.fasta.pal
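The same check can be written so that it covers both cases: a single-volume protein database has a `.pin` index file directly under the name given to -db, while a multi-volume one (like the three-volume build shown earlier in the thread) has a `.pal` alias file. If neither exists under exactly that name, blastp reports "No alias or index file found". This is only a sketch; the function name `protein_db_present` is hypothetical and the demo uses an empty placeholder file instead of a real database.

```shell
# Return success if a protein BLAST database with this name prefix
# appears to exist: either a .pin index or a .pal alias file.
protein_db_present() {
    prefix="$1"
    [ -e "${prefix}.pin" ] || [ -e "${prefix}.pal" ]
}

# Demo in a throwaway directory with a placeholder alias file
demo_dir=$(mktemp -d)
touch "$demo_dir/uniref90.fasta.pal"

if protein_db_present "$demo_dir/uniref90.fasta"; then found=yes; else found=no; fi
# A misspelled name (as a typo in -out or -db would produce) is not found
if protein_db_present "$demo_dir/unirf90.fasta"; then typo=yes; else typo=no; fi
echo "correct name: $found, misspelled name: $typo"
rm -rf "$demo_dir"
```

Against the real setup, the call would be `protein_db_present /home/acresearch/Desktop/db/uniref90.fasta`.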

modified 18 months ago • written 18 months ago by fishgolden (360)

It works now. I had an issue with the makeblastdb command (a typo, as you said).

Thank you very much.

written 18 months ago by ac.research (10)