5.3 years ago
ygxing1
Hi, everyone!
I use psiblast to search for homologous protein sequences, then use blastdbcmd to extract the homologous sequences based on the psiblast output file. But I ran into a strange problem, as follows.
================ The command below outputs one sequence ===========
blastdbcmd -entry 99 -db ./db/test -dbtype prot -outfmt %f -out test.hits.fa
================= The command below produces empty output ================
blastdbcmd -entry_batch fmt4 -db ./db/test -dbtype prot -outfmt %f -out test.hits.2.fa
======================================================
The file "fmt4" looks like this:
===============================
PSIBLAST 2.7.1+
Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Stephen F.
Altschul, John C. Wootton, E. Michael Gertz, Richa Agarwala,
Aleksandr Morgulis, Alejandro A. Schaffer, and Yi-Kuo Yu (2005)
"Protein database searches using compositionally adjusted
substitution matrices", FEBS J. 272:5101-5109.
Reference for composition-based statistics starting in round 2:
Alejandro A. Schaffer, L. Aravind, Thomas L. Madden, Sergei
Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and
Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST
protein database searches with composition-based statistics and
other refinements", Nucleic Acids Res. 29:2994-3005.
Database: db
3 sequences; 498 total letters
Results from round 1
Query= T0658-D1
Length=166
Score E
Sequences producing significant alignments: (Bits) Value
2 unnamed protein product 342 5e-127
1 unnamed protein product 342 5e-127
0 unnamed protein product 342 5e-127
Query_1 1 ADSIYVREQQIPILIDRIDNVLYEMRIPAQKGDVLNEITIQIGDNV 166
2 1 ADSIYVREQQIPILIDRIDNVLYEMRIPAQKGDVLNEITIQI 166
1 1 ADSIYVREQQIPILIDRIDNVLYEMRIPAQKGDVLNEITIQIGDNVDLSDI 166
0 1 ADSIYVREQQIPILIDRIDNVLYEMRIPAQKGDVLNEITIQI 166
Lambda K H a alpha
0.316 0.133 0.379 0.792 4.96
=================================================
I am lost here. Could you give me some help? Many thanks in advance!
Does the fmt4 file have one input entry (GI/accession number) per line?
Hi! Yes, it does. But I do not know how to post them clearly. Very sorry.
It looks like your fmt4 file contains actual BLAST results. The -entry_batch directive accepts only accession numbers, one per line.
Yes, this is a psiblast output file with "outfmt" equal to 4. Do you know how to extract all the accession numbers from the output file?
There is a weird problem. I can extract the hits from some fmt4 files, but it fails for others.
I compared these fmt4 files carefully and did not find any useful clues.
You should have used a tabular format for the results. It looks like this is a custom database of some sort, correct? It may be easier to re-run the BLAST search; otherwise you will need to parse this file.
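If re-running the search is not an option, the hit identifiers can be pulled out of the report itself. The following is only a minimal sketch, not an official parser: the function name `extract_hit_ids` is hypothetical, and it assumes that hit lines follow the "Sequences producing significant alignments" header, that each hit line starts with the subject ID and ends with two numeric columns (bit score and E-value), and that the alignment block begins with a line starting with `Query_`, as in the report posted above.

```python
HEADER = "Sequences producing significant alignments"


def _is_number(token):
    """Return True if the token parses as a float (e.g. '342' or '5e-127')."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def extract_hit_ids(report_text):
    """Collect subject IDs from the hit summary of a BLAST text report.

    Assumption: a hit line looks like '<id> <description...> <bits> <evalue>'.
    IDs are returned in order of first appearance, without duplicates, so
    hits repeated across psiblast rounds are listed only once.
    """
    ids = []
    seen = set()
    in_hits = False
    for line in report_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(HEADER):
            in_hits = True          # hit summary starts after this header
            continue
        if not in_hits:
            continue
        if stripped.startswith("Query_") or stripped.startswith("Lambda"):
            in_hits = False         # alignments or statistics footer reached
            continue
        tokens = stripped.split()
        # Keep only lines whose last two columns are numeric (score, E-value).
        if len(tokens) >= 3 and _is_number(tokens[-2]) and _is_number(tokens[-1]):
            if tokens[0] not in seen:
                seen.add(tokens[0])
                ids.append(tokens[0])
    return ids
```

Writing the returned IDs to a file, one per line, gives the kind of input that `-entry_batch` expects. Whether blastdbcmd can resolve those IDs still depends on how the custom database was built, so treat the result as a starting point and check it against a single `-entry` lookup first.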
Thank you very much! Yes, this is a custom database.
After a good night's sleep, I get your point.