blastp Error: NCBI C++ Exception: ncbi::CObject::ThrowNullPointerException() - Attempt to access NULL pointer
2.5 years ago
nattzy94 ▴ 50

Hi,

I am trying to BLAST a FASTA file of protein sequences against the non-redundant (nr) database on an HPC. I run the following command:

cat prot/split_fasta/master.dataframe.tide-tandem.protein.part_001.fa | parallel --GNU --block 100k --recstart '>' --pipe '/home/users/nus/e0470749/ncbi-blast-2.8.1+/bin/blastp -query - -db nr -outfmt "6 std slen qlen stitle staxids sscinames" -max_target_seqs 500 -num_threads 12 -evalue 0.001' > seps_nr_out_001.txt

However, the job is terminated with Exit Status: 1. Based on previous posts about the same error, I thought this was a memory issue, so I split my original FASTA file (10,000 sequences) into smaller parts. The current file contains ~100 sequences. I also ran the job with 1 TB of memory, which seems to be sufficient based on the usage report:

Resource Usage on 2021-11-08 11:52:18.892810:

    JobId: 6845745.wlm01
    Project: personal
    Exit Status: 1
    NCPUs Requested: 12                             NCPUs Used: 12
                                                    CPU Time Used: 11:50:04
    Memory Requested: 1tb                           Memory Used: 159785036kb
                                                    Vmem Used: 266577592kb
    Walltime requested: 12:00:00                    Walltime Used: 01:39:10

    Execution Nodes Used: (lmn2609:mem=1073741824kb:ncpus=12)

The Blast database also seems to be normal. Running ~/ncbi-blast-2.8.1+/bin/blastdbcmd -info -db blastdb/nr gives:

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects 436,338,278 sequences; 161,860,501,762 total residues

Is there anything else I can try to solve this? The only other thing I can think of is to downgrade BLAST.


You are using an older version of BLAST+, which may be incompatible with the current nr (I assume you downloaded the pre-formatted indexes, which are now v.5). You can update your BLAST+ package to the latest version and see if that helps.

Can you show us what your fasta headers look like?


Thanks for your help. Yes, I am using the pre-formatted nr database (v. 5). I am using a slightly older BLAST+ (v. 2.8.1) because the HPC server I am working on has an outdated GLIBC. When I run the latest BLAST+ with ./ncbi-blast-2.12.0+/bin/blastp, I get this error:

./ncbi-blast-2.12.0+/bin/blastp: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by ./ncbi-blast-2.12.0+/bin/blastp)
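The GLIBC mismatch in the error above can be confirmed from the shell before downloading another prebuilt binary. This is a minimal sketch, assuming a glibc-based Linux node and a `sort` that supports `-V`; `version_ge` is a hypothetical helper name, and the required version 2.14 is taken from the error message.

```shell
# Hypothetical helper: succeed if version $1 >= version $2.
# Relies on version-aware sorting (sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="2.14"   # from the error message: GLIBC_2.14 not found

# On glibc systems getconf prints something like "glibc 2.12";
# keep only the version number. Empty if the query is unsupported.
installed="$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')"

if [ -z "$installed" ]; then
    echo "could not determine the glibc version on this node"
elif version_ge "$installed" "$required"; then
    echo "glibc $installed should satisfy GLIBC_$required"
else
    echo "glibc $installed is older than $required; this binary will not load"
fi
```

Run this on a compute node as well as the login node, since the two may not have the same system libraries.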

I will try using an older nr database (v. 4) in this case.

These are the first few lines of my FASTA file:

>ENST00000449034_(+2)_26
KMTEPGFPLQPRRCLGQQKEQE
>ENST00000400429_(+3)_5
WNYNNKESGTVMRSLGQNPTEAELQDMINEVDADGNGTVDFPEFLTMMARKMKDTDSEEE
IRDAFCVFDKDGNGYISATELHHVMTNLGENLTDDEVDEMIR
>ENST00000405486_(+1)_31
GAAYAIALDRTLATGRAGLCPMCPVSPLSMCVGVAHVQVCASCRDLGFNVFCWPSPALLW
GVGPQGKGL
>ENST00000570769_(+3)_76
DPVSKIKILLRLHCGGGGKSMDFDFLFAVFYFW

"I will try using an older nr database (v. 4) in this case."

Unless you create a new set of v.4 indexes yourself, keep in mind that the old v.4 nr database you can download from NCBI is frozen as of Feb 2020.

You could try a small subset of your FASTA file and see if you still get the error. If you do, you may want to remove the (+1), (+2), etc. from the FASTA headers and see if that eliminates the error.
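One way to run that test is sketched below. The file names `subset_test.fa` and `subset_clean.fa` are hypothetical, the sample records are copied from the headers posted above, and replacing `(+2)` with `p2` (and `(-1)` with `m1`) is only one possible renaming that keeps the frame information while dropping the parentheses and sign.

```shell
# Small test subset (hypothetical file name), built from records shown in the post.
cat > subset_test.fa <<'EOF'
>ENST00000449034_(+2)_26
KMTEPGFPLQPRRCLGQQKEQE
>ENST00000400429_(+3)_5
WNYNNKESGTVMRSLGQNPTEAELQDMINEVDADGNGTVDFPEFLTMMARKMKDTDSEEE
IRDAFCVFDKDGNGYISATELHHVMTNLGENLTDDEVDEMIR
EOF

# On header lines only: "(+2)" becomes "p2" and "(-1)" becomes "m1".
# Sequence lines are passed through unchanged.
sed -E '/^>/ { s/\(\+([0-9]+)\)/p\1/g; s/\(-([0-9]+)\)/m\1/g; }' \
    subset_test.fa > subset_clean.fa

grep '^>' subset_clean.fa

# Then rerun the search on the cleaned subset, for example (not executed here):
# blastp -query subset_clean.fa -db nr -outfmt 6 -evalue 0.001
```

If the cleaned subset runs without the exception while the original subset fails, the headers are the likely cause; if both fail, the BLAST+ and database version mismatch remains the more probable explanation.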
