Blast a CDD PSSM against a genome
14 months ago
rubic ▴ 240


I'm trying to search an NCBI conserved domain against a large genome.

I downloaded NCBI's CDD PSSM files and indexed the genome both as a nucl dbtype and as a prot dbtype. Now I'm trying to run psi-blast from the command line with one of the PSSM files (CHL00001.smp) against my indexed genome, and I'm getting these warnings:

FastaReader: Hyphens are invalid and will be ignored around line 16147
FASTA-Reader: Ignoring invalid residues at position(s): On line 16147: 1, 3-18, 20-22, 25-26, 28-29
FASTA-Reader: Ignoring invalid residues at position(s): On line 16148: 1, 3-4, 6-8, 10, 12-13

This happens even if I use deltablast, blastp, or tblastn instead.

I'm assuming the PSSM file is not in a format that blast accepts (though that seems weird, since this PSSM file is from NCBI).

Any idea?

pssm CDD blast • 561 views
14 months ago
Mensur Dlakic ★ 19k

It's difficult to know exactly what you have tried without a specific command, but this might work for you.

First, I suggest you format your database as a nucleotide database, since that is what the genome appears to be:

makeblastdb -in your_db_file_name -dbtype nucl

Next, tblastn should be able to read those checkpoint/PSSM files as long as the first couple of lines of that file look like this:

PssmWithParameters ::= {
  pssm {
    isProtein TRUE,
    numRows 28,
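
If you want to check that header before running anything, something like the following might do. It is only a rough sketch: looks_like_pssm is a made-up helper name, not part of BLAST+, and it assumes the file is a text ASN.1 checkpoint whose opening lines match the ones shown above.

```shell
# Hypothetical helper (not a BLAST+ tool): succeeds if the file starts
# like a text ASN.1 PSSM checkpoint, fails otherwise.
looks_like_pssm() {
    # The top of the file should open a PssmWithParameters object ...
    head -n 5 "$1" 2>/dev/null | grep -q 'PssmWithParameters ::= {' || return 1
    # ... and flag the profile as a protein PSSM within the first few lines.
    head -n 5 "$1" 2>/dev/null | grep -q 'isProtein TRUE'
}
```

You would call it as looks_like_pssm CHL00001.smp && echo "header looks OK". It only looks at the first five lines, so it says nothing about whether the rest of the file is intact.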

The command:

tblastn -in_pssm CHL00001.smp -db your_db_file_name -evalue 1e-5 -out tblastn_results.txt
