Running PSI-BLAST to Generate PSSMs from a Large Number of Protein Sequences
7.7 years ago
cjb60 • 0

Hi there,

I'm working on a project that requires me to generate PSSMs from a very large number of protein sequences, all of which are in FASTA format. With the way I'm doing things at the moment, this takes an infeasibly long time. In general, I'm using this command:

psiblast -db nr -query SOME_PROTEIN_SEQUENCE.fasta -out_ascii_pssm PSSM.txt

Recently I tried to speed things up a bit by adding these parameters:

-num_threads 4 -word_size 5

Is there any other way to speed up the process, or am I out of luck? I'm only interested in generating the PSSMs from these protein sequences and nothing else.
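For reference, here is a rough sketch of how the command above could be wrapped so that several queries run at once and finished ones are skipped on a rerun. It is only an illustration: the function names, the `.pssm` output naming, and the `jobs` parameter are assumptions made for the sketch, and whether running several psiblast processes side by side actually helps will depend on the memory and disk available for a database the size of nr.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(fasta, out_dir, db="nr", threads=4, extra=()):
    """Build the psiblast argument list for one query file.

    The output name (<query stem>.pssm) is a convention assumed for this sketch.
    """
    fasta = Path(fasta)
    pssm = Path(out_dir) / (fasta.stem + ".pssm")
    return [
        "psiblast",
        "-db", db,
        "-query", str(fasta),
        "-out_ascii_pssm", str(pssm),
        "-num_threads", str(threads),
        *extra,  # e.g. ("-word_size", "5")
    ]


def pending_queries(fasta_dir, out_dir):
    """Return the .fasta files that do not yet have a PSSM in out_dir."""
    out_dir = Path(out_dir)
    return [
        f for f in sorted(Path(fasta_dir).glob("*.fasta"))
        if not (out_dir / (f.stem + ".pssm")).exists()
    ]


def run_all(fasta_dir, out_dir, jobs=2, **kwargs):
    """Run psiblast on every pending query, `jobs` processes at a time.

    Returns the exit code of each psiblast run, in query order.
    """
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    commands = [
        build_command(f, out_dir, **kwargs)
        for f in pending_queries(fasta_dir, out_dir)
    ]
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # Each worker thread only waits on an external psiblast process.
        return list(pool.map(lambda c: subprocess.run(c).returncode, commands))
```

Something like `run_all("queries", "pssms", jobs=2, threads=4, extra=("-word_size", "5"))` would then process a directory of query files, where `queries` and `pssms` are placeholder directory names.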

Thanks in advance.

7.7 years ago
Pappu ★ 2.0k

You can align the sequences and calculate the PSSM with a Python script. Take a look at Biopython.
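To give an idea of what such a script involves, here is a minimal sketch in plain Python that turns an existing alignment into a log-odds matrix. It does not use Biopython's own API, and the uniform background frequencies and the fixed pseudocount are simplifying assumptions. PSI-BLAST applies its own sequence weighting and pseudocount scheme, so the numbers will not match its output.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pssm_from_alignment(aligned, pseudocount=1.0):
    """Return one {residue: log2-odds score} dict per alignment column.

    `aligned` is a list of equal-length strings taken from an alignment.
    Gaps and non-standard residues are ignored when counting.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    matrix = []
    for column in zip(*aligned):
        counts = Counter(r for r in column if r in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        matrix.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return matrix
```

For example, `pssm_from_alignment(["ACD", "ACE", "AC-"])` returns three columns, and in the first one "A" scores higher than any other residue because it is the only one observed there.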

