Select Sequence From A Psiblast, And Launch A New Iteration From It
11.5 years ago

I would like to launch a PSI-BLAST from a sequence, on my local machine or remotely if that is possible. Then, as on the NCBI website, I want to select my own sequences from the result file and launch another iteration from those sequences.

When I run PSI-BLAST locally, I obtain pairwise alignments in XML or another format. How can I select my sequences and run another iteration from them? If I select the sequences myself, I have to build an MSA, which throws away the BLAST result when the E-value is very high. I would like to make exactly the same kind of selection that PSI-BLAST offers, but in a script (which launches BLAST locally or remotely).

For instance:

blastpgp -i mysequence -j 1 -o mysequence.blast

--> I create a file containing only the sequences that I want to keep (not only the significant ones) -> mynewsequenceTo2d_iteration

blastpgp -i mynewsequenceTo2d_iteration -j 1 -o 2diteration.blast

etc...

Is it possible? I did not find any way to do it. At the moment I build a multiple alignment between iterations, but PSI-BLAST at NCBI does not do that.

I am open to any way of doing it, even with Biopython or CGI.

ncbi blast • 3.6k views
10.9 years ago
Hamish ★ 3.2k

In the EMBL-EBI's PSI-BLAST service the interactive iteration mechanism is implemented with the following workflow (using legacy NCBI BLAST):

  1. Round 0:
    1. Run the initial PSI-BLAST iteration (-j 1) of the query sequence against the search database.
    2. Process the output to select the sequences to use to build the PSSM (i.e. the sequences to use for the next iteration).
  2. Round 1:
    1. Fetch the selected sequences from the previous round from the BLAST database (using 'fastacmd')
    2. Build a BLAST database with the selected sequences and 10000 random sequences (generated with EMBOSS makeprotseq)
    3. Run a PSI-BLAST with the query sequence against the database of selected sequences for 2 iterations (-j 2) with a high E-value threshold (-e 1000), outputting a binary checkpoint (-C chkpnt1 -u 2)
    4. Run a PSI-BLAST with the query sequence and the generated checkpoint (-R chkpnt1 -q 2) against the search database for one iteration (-j 1) and outputting a binary checkpoint (-C chkpnt2 -u 2)
    5. Process the search output to select the sequences to use to build the PSSM (i.e. the sequences to use for the next iteration).
  3. Round 2 onwards:
    1. Fetch the selected sequences from the previous round from the BLAST database (using 'fastacmd')
    2. Build a BLAST database with the selected sequences and 10000 random sequences (generated with EMBOSS makeprotseq)
    3. Run a PSI-BLAST with the query sequence and the checkpoint from the previous round (-R chkpnt2 -q 2) against the database of selected sequences for 2 iterations (-j 2) with a high E-value threshold (-e 1000), outputting a binary checkpoint (-C chkpnt1 -u 2)
    4. Run a PSI-BLAST with the query sequence and the generated checkpoint (-R chkpnt1 -q 2) against the search database for one iteration (-j 1)
    5. Process the search output to select the sequences to use to build the PSSM (i.e. the sequences to use for the next iteration).
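The two blastpgp calls per round can be sketched as a small command builder. This is only an illustration of the steps listed above, not the EMBL-EBI implementation: the function name `round_commands`, the database names and the `roundN.blast` output names are hypothetical, and `-d`/`-o` are the usual legacy blastpgp database and output options rather than something stated in the workflow. The sketch only builds the argument lists; fetching the selected sequences with fastacmd and building the small database are left out.

```python
import shlex


def round_commands(query, search_db, selected_db, round_no):
    """Build the legacy blastpgp command lines for one interactive round.

    Illustrative sketch only: names and output files are hypothetical.
    selected_db is assumed to be an already formatted BLAST database that
    holds the sequences chosen in the previous round plus random sequences.
    """
    if round_no < 0:
        raise ValueError("round_no must be >= 0")

    out = ["-o", "round%d.blast" % round_no]

    # Round 0: a single plain iteration against the search database.
    if round_no == 0:
        return [["blastpgp", "-i", query, "-d", search_db, "-j", "1"] + out]

    # Step 3: build the PSSM from the selected sequences only, with a high
    # E-value threshold, and write it as a binary checkpoint.
    build = ["blastpgp", "-i", query, "-d", selected_db,
             "-j", "2", "-e", "1000", "-C", "chkpnt1", "-u", "2"]
    if round_no >= 2:
        # From round 2 onwards, restart from the previous round's checkpoint.
        build += ["-R", "chkpnt2", "-q", "2"]

    # Step 4: search the real database once with the new PSSM.
    search = ["blastpgp", "-i", query, "-d", search_db,
              "-j", "1", "-R", "chkpnt1", "-q", "2"] + out
    if round_no == 1:
        # The workflow above lists the chkpnt2 output only for round 1.
        search += ["-C", "chkpnt2", "-u", "2"]

    return [build, search]


if __name__ == "__main__":
    for n in range(3):
        for cmd in round_commands("query.fasta", "search_db", "selected_db", n):
            print(shlex.join(cmd))
```

Each returned list can be passed to `subprocess.run` once the selected-sequence database for that round exists.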

The same general principle is used to implement interactive iterations for PSI-Search except it uses SSEARCH (Smith & Waterman) to perform the actual searches instead of PSI-BLAST, but PSI-BLAST is still used to generate the PSSMs.

Thanks to Bill Pearson for figuring out how to do this.

Note that in the EMBL-EBI's services the implementations of PSI-BLAST and PSI-Search are a little more complex than outlined above, due to the addition of support for a Homologous Over-Extension (HOE) prevention method (see http://dx.doi.org/10.1093/nar/gkp1219 and http://dx.doi.org/10.1093/bioinformatics/bts240). So in cases where the databases provided by the EMBL-EBI services are sufficient, you might want to use the Web Services to perform the searches (see http://www.ebi.ac.uk/Tools/webservices/#sequence_similarity_search_sss).
