Question: How To Concatenate Blast Results (M8) Via Setting Threshold Of Distance Between Two Query Hits
6.1 years ago by xiongtl2013 • 40 wrote:

Hi all,

I ran blastn (-m 8) with a query file containing many sequences. For each query sequence, the output contains many significant but fragmentary hits.

However, these hits do not overlap, and interestingly most of the gaps between them are < 300 bp (much shorter than the full length of the query sequence).

So, how can I concatenate closely spaced hits into one by setting a threshold (e.g. 300 bp) when they match different regions of the same subject? This would also reduce the number of output hits per query.

For example:

Are there any scripts or tools for this purpose?

All replies are welcome!

blast • 1.9k views
6.1 years ago by jgibbons1 • 50 wrote:

You can use Biopython to parse the BLAST output and then concatenate the sequences that match your criteria. I would not recommend parsing the tabular output, though; instead, re-run BLAST and get the results in XML format, since that is easier to parse with a script.

In the Biopython tutorial, the chapters you would be interested in are 3, 4, 5, and 7.
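The merging step itself does not depend on the input format, so here is a minimal sketch of it that works directly on the 12-column tabular (-m 8) output the question already has, using only the Python standard library. The function names `read_m8` and `merge_hits` and the `max_gap` parameter are made up for illustration. It assumes the gap is measured on the query coordinates, that hits are only merged when they share the same query, subject, and strand, and that the merged subject range is simply the outer envelope of the pieces; it does not check the subject-side gap or collinearity, and it drops identity, e-value, and bit score.

```python
import csv
from collections import defaultdict


def read_m8(lines):
    """Yield (query, subject, strand, q_lo, q_hi, s_lo, s_hi) per tabular hit.

    Assumes the standard 12-column layout: columns 7-10 hold
    q.start, q.end, s.start, s.end (1-based).
    """
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        q_start, q_end, s_start, s_end = (int(x) for x in row[6:10])
        # blastn reports minus-strand hits with s.start > s.end
        strand = "+" if s_start <= s_end else "-"
        yield (row[0], row[1], strand,
               min(q_start, q_end), max(q_start, q_end),
               min(s_start, s_end), max(s_start, s_end))


def merge_hits(hits, max_gap=300):
    """Merge hits of the same query/subject/strand whose query gap <= max_gap."""
    groups = defaultdict(list)
    for query, subject, strand, q_lo, q_hi, s_lo, s_hi in hits:
        groups[(query, subject, strand)].append((q_lo, q_hi, s_lo, s_hi))

    merged = []
    for key, spans in groups.items():
        spans.sort()  # order by query start
        cur = list(spans[0])
        for q_lo, q_hi, s_lo, s_hi in spans[1:]:
            gap = q_lo - cur[1] - 1  # bases between the two hits on the query
            if gap <= max_gap:
                # extend the current block on both query and subject
                cur[1] = max(cur[1], q_hi)
                cur[2] = min(cur[2], s_lo)
                cur[3] = max(cur[3], s_hi)
            else:
                merged.append(key + tuple(cur))
                cur = [q_lo, q_hi, s_lo, s_hi]
        merged.append(key + tuple(cur))
    return merged
```

To use it on a file, something like `with open("blast.m8") as fh: merged = merge_hits(read_m8(fh), max_gap=300)` should work, where `blast.m8` is a placeholder file name. Each result is a tuple (query, subject, strand, q_lo, q_hi, s_lo, s_hi). If the hits also need to be close on the subject, an extra check on the subject coordinates would have to be added to the `if gap <= max_gap` condition.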

2.4 years ago by Lhl • 730 (United States) wrote:

Have you tried genBlastA/G? See She et al. (2011), "genBlastG: using BLAST searches to build homologous gene models", Bioinformatics.
