Trouble using psiblast: blastdbcmd outputs an empty file. Why?
5.3 years ago
ygxing1 • 0

Hi, everyone!

I use psiblast to search for homologous protein sequences, then use blastdbcmd to extract the hit sequences based on the psiblast output file. But I ran into a strange problem, shown below.

================ The command below outputs one sequence ===========

blastdbcmd -entry 99 -db ./db/test -dbtype prot -outfmt %f -out test.hits.fa

================= The command below produces an empty output file ================

blastdbcmd -entry_batch fmt4 -db ./db/test -dbtype prot -outfmt %f -out test.hits.2.fa

======================================================

The file "fmt4" looks like this:

===============================

PSIBLAST 2.7.1+
Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Stephen F.
Altschul, John C. Wootton, E. Michael Gertz, Richa Agarwala,
Aleksandr Morgulis, Alejandro A. Schaffer, and Yi-Kuo Yu (2005)
"Protein database searches using compositionally adjusted
substitution matrices", FEBS J. 272:5101-5109.
Reference for composition-based statistics starting in round 2:
Alejandro A. Schaffer, L. Aravind, Thomas L. Madden, Sergei
Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and
Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST
protein database searches with composition-based statistics and
other refinements", Nucleic Acids Res. 29:2994-3005.

Database: db
           3 sequences; 498 total letters
Results from round 1
Query= T0658-D1
Length=166
                                                                      Score     E

Sequences producing significant alignments:                          (Bits)  Value
2  unnamed protein product                                            342     5e-127
1  unnamed protein product                                            342     5e-127
0  unnamed protein product                                            342     5e-127

Query_1 1 ADSIYVREQQIPILIDRIDNVLYEMRIPAQKGDVLNEITIQIGDNV  166

2     1   ADSIYVREQQIPILIDRIDNVLYEMRIPAQKGDVLNEITIQI  166

1     1   ADSIYVREQQIPILIDRIDNVLYEMRIPAQKGDVLNEITIQIGDNVDLSDI  166

0     1   ADSIYVREQQIPILIDRIDNVLYEMRIPAQKGDVLNEITIQI  166

Lambda      K        H        a         alpha
   0.316    0.133    0.379    0.792     4.96

=================================================

I am lost here. Could you give me some help? Great thanks in advance!

software error • 1.0k views

Does the fmt4 file have one entry (GI or accession number) per line?


Hi! Yes, it does. But I do not know how to post it clearly. Very sorry.


It looks like your fmt4 file contains actual BLAST results. The -entry_batch option accepts only a file of accession numbers (one per line).
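As a rough sketch of how a one-ID-per-line file could be derived from a report like the one posted: the helper below (extract_hit_ids is a hypothetical name, not part of BLAST+) prints the first column of each line in the "Sequences producing significant alignments:" table. It assumes the single-round layout shown in the question and that the first column is an identifier blastdbcmd can actually look up in this database; reports from later PSI-BLAST rounds use different section headings and would need extra handling.

```shell
# Hypothetical helper: list the unique identifiers found in the hit table
# of a BLAST text report (the first whitespace-separated field of each row).
extract_hit_ids() {
    awk '
        # The hit table starts right after this header line.
        /^Sequences producing significant alignments:/ { in_hits = 1; seen = 0; next }
        # Skip a blank line before the first hit; a blank line after hits ends the table.
        in_hits && NF == 0 { if (seen) in_hits = 0; next }
        # Inside the table: the first field is the hit identifier.
        in_hits { print $1; seen = 1 }
    ' "$1" | sort -u
}

# Possible usage (ids.txt is a placeholder file name):
#   extract_hit_ids fmt4 > ids.txt
#   blastdbcmd -entry_batch ids.txt -db ./db/test -dbtype prot -outfmt %f -out test.hits.2.fa
```

Whether the extracted values resolve depends on how the database was built, so it is worth checking one of them with -entry first.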


Yes, this is a psiblast output file produced with -outfmt 4. Do you know how to extract all the accession numbers from the output file?

There is one strange thing: I can extract the hits using some other fmt4 files, but it fails with others.

Comparing these fmt4 files carefully, I did not find any useful clues.


You should have used a tabular format for the results. It looks like this is a custom database of some sort, correct? It may be easier to re-run the BLAST search; otherwise you will need to parse this file.
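A minimal sketch of the tabular route, under some assumptions: the search is re-run with -outfmt "6 sseqid evalue bitscore" so that the subject ID is the first tab-separated column, and the file names (query.fa, hits.tsv, ids.txt) are placeholders. The helper name unique_subject_ids is hypothetical. It skips blank lines, comment lines, and a possible "Search has CONVERGED!" line, which psiblast may write between or after rounds.

```shell
# Hypothetical helper: print the unique first-column values (subject IDs)
# of a tab-separated BLAST result file.
unique_subject_ids() {
    awk -F '\t' '
        NF == 0 { next }                    # skip blank lines between rounds
        /^#/ { next }                       # skip comment lines (e.g. from -outfmt 7)
        /^Search has CONVERGED/ { next }    # skip the PSI-BLAST convergence notice
        { print $1 }
    ' "$1" | sort -u
}

# Possible workflow (all file names are placeholders):
#   psiblast -query query.fa -db ./db/test -outfmt "6 sseqid evalue bitscore" -out hits.tsv
#   unique_subject_ids hits.tsv > ids.txt
#   blastdbcmd -entry_batch ids.txt -db ./db/test -dbtype prot -outfmt %f -out test.hits.fa
```

Deduplicating with sort -u matters here because the same subject can appear in several PSI-BLAST rounds.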


Thank you very much! Yes, this is a custom database.

After a good night's sleep, I get your point.
