Running PSI-BLAST to Generate PSSMs from a Large Number of Protein Sequences
7.7 years ago
cjb60 • 0

Hi there,

I'm working on a project that requires me to generate PSSMs from a very large number of protein sequences (all in .fasta format). However, my current approach takes an infeasibly long time. In general, I'm using this command:

psiblast -db nr -query SOME_PROTEIN_SEQUENCE.fasta -out_ascii_pssm PSSM.txt

And just recently I tried to see if I could speed things up a bit by adding these parameters:

-num_threads 4 -word_size 5

Is there any other way to speed up the process, or am I out of luck? I'm only interested in generating the PSSMs from these protein sequences and nothing else.
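
For reference, here is a minimal sketch of how the command above might be looped over a directory of FASTA files with a few searches running side by side. It only uses the psiblast options already shown in this post (-db, -query, -out_ascii_pssm, -num_threads); the helper names, the directory layout and the ".pssm.txt" output suffix are assumptions made for illustration. This does not make any single search faster, and whether running several at once helps will depend on the available cores, memory and disk.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(fasta, out_dir, db="nr", threads=4):
    """Return the psiblast argument list for one query file (hypothetical helper)."""
    fasta = Path(fasta)
    # Assumed naming scheme: one PSSM file per query, named after the FASTA file.
    pssm = Path(out_dir) / (fasta.stem + ".pssm.txt")
    return [
        "psiblast",
        "-db", db,
        "-query", str(fasta),
        "-out_ascii_pssm", str(pssm),
        "-num_threads", str(threads),
    ]


def run_all(fasta_dir, out_dir, jobs=2, **kwargs):
    """Run psiblast on every .fasta file in fasta_dir, `jobs` searches at a time."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    commands = [build_command(f, out_dir, **kwargs)
                for f in sorted(Path(fasta_dir).glob("*.fasta"))]
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # The worker threads only wait on external processes, so the GIL is not a bottleneck.
        return list(pool.map(lambda c: subprocess.run(c).returncode, commands))
```

Calling run_all("queries", "pssms", jobs=2, threads=4) would start up to two searches at a time and return their exit codes; other options such as -word_size could be appended to the list in build_command.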

Thanks in advance.

pssm • 2.2k views
7.7 years ago
Pappu ★ 2.0k

You can align the sequences yourself and calculate the PSSM from the alignment with a Python script. Take a look at Biopython.
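
As a rough illustration of the idea, here is a minimal sketch that turns an existing alignment into a simple log-odds matrix using only the standard library rather than Biopython. It assumes the sequences are already aligned to equal length, uses a uniform background frequency and a flat pseudocount, and ignores gaps; the function name and parameters are made up for this example. The scores will not match what psiblast writes with -out_ascii_pssm, since this applies no sequence weighting or substitution-matrix-based pseudocounts.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def alignment_pssm(aligned, pseudocount=1.0):
    """Return one {residue: log2-odds score} dict per alignment column."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background frequency
    pssm = []
    for col in range(length):
        # Count only the 20 standard residues; gaps and other symbols are skipped.
        counts = Counter(s[col] for s in aligned if s[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2((counts[aa] + pseudocount) / total / background)
            for aa in AMINO_ACIDS
        })
    return pssm
```

For example, alignment_pssm(["ACD", "ACE", "A-D"]) returns three columns, and in the first one "A" gets a positive score while residues that never occur there get negative scores.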
