Question: How to merge contiguous BLAST HSPs (-m 8 tabular output)
5.8 years ago, xiongtl2013 wrote:

Hi, guys!

I ran blastx (-m 8) with a query file containing many sequences. For each target sequence, the output contains many significant but fragmentary HSPs, which may or may not overlap.

How can I merge closely spaced HSPs into one, using a flanking distance threshold (e.g. <300 bp), when they match different regions of the same subject?

In the following figure, I want to transform the upper results into the lower ones:

http://www.imagebam.com/image/0c13b2256142668

Thanks in advance!

Transform

scaffold16:1661-2239(+)       gi|471236998|ref|YP_007641386.1|  50.00  122  52  3  **225  578**  603  719  2e-53   126                   
scaffold16:1661-2239(+)       gi|471236998|ref|YP_007641386.1|  75.00  76   19  0  **1    228**  528  603  2e-53   108
scaffold16:1661-2239(+)       gi|333951646|gb|AEG25349.1|       52.10  119  54  2  **225  578**  604  720  7e-53   124
scaffold16:1661-2239(+)       gi|333951646|gb|AEG25349.1|       77.63  76   17  0  **1    228**  529  604  7e-53   109
scaffold28:2776872-2777385(-) gi|327335359|gb|AEA49877.1|       70.18  57   17  0  **173  343**  554  610  3e-30   90.5
scaffold28:2776872-2777385(-) gi|327335359|gb|AEA49877.1|       72.22  54   15  0  **1    162**  497  550  3e-30   67.0

To

scaffold16:1661-2239(+)       gi|471236998|ref|YP_007641386.1|   .      .    .  .  **1    578**   .    .   2e-53   **234**
scaffold16:1661-2239(+)       gi|333951646|gb|AEG25349.1|        .      .    .  .  **1    578**   .    .   7e-53   **233**
scaffold28:2776872-2777385(-) gi|327335359|gb|AEA49877.1|        .      .    .  .  **1    343**   .    .   3e-30   **157.5**
merge blast • 2.2k views
modified 5.8 years ago by Istvan Albert • written 5.8 years ago by xiongtl2013
5.8 years ago, brentp (Salt Lake City, UT) wrote:

You can get most of the way there doing this:

awk 'BEGIN{FS=OFS="\t"}{ a=$0; gsub(/\t/, "ZZZ", a); print $1,$7,$8,a }' blast.txt \
         | sort -k1,1 -k2,2n \
         | bedtools merge -nms > out.bed

With your input in blast.txt, the output will contain the merged lines shown above; you'll just have to do a bit of parsing to split on "ZZZ" and put the appropriate start (2nd column) and end (3rd column) into the right places.
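
The "bit of parsing" might look like the minimal Python sketch below. It assumes each line of out.bed has four tab-separated columns: query, merged start, merged end, and the original lines joined by semicolons (which is how -nms reports names). It also assumes no field of the original BLAST lines contains ";" or "ZZZ". The function name parse_merged_line is hypothetical.

```python
def parse_merged_line(line, sep="ZZZ"):
    """Turn one line of out.bed back into a list of 12-field BLAST records.

    Each record gets the merged start/end written into the
    query-start and query-end columns (7th and 8th fields).
    """
    query, start, end, names = line.rstrip("\n").split("\t")[:4]
    records = []
    for name in names.split(";"):      # one entry per original HSP
        fields = name.split(sep)       # undo the tab -> "ZZZ" substitution
        fields[6], fields[7] = start, end
        records.append(fields)
    return records
```

Each returned record can be written back out with "\t".join(fields). Note that this only restores the columns; it does not decide which of the merged records to keep.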

written 5.8 years ago by brentp

What a clever trick! Thank you very much. It does solve the problem, but only to a certain extent, because the script merges all HSPs that share the same query even when their targets differ; that is, it merges all of the first four records listed above. What I want to merge are HSPs with the same query and the same target, but at different positions.

Hoping for some more help! Any ideas are welcome...

modified 5.8 years ago • written 5.8 years ago by xiongtl2013
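
One way to restrict the merge to HSPs that share both query and subject is to group on that pair before merging. The following is only a sketch, not something posted in the thread: it assumes 12-column -m 8 input, merges on query coordinates, treats the flanking value as the maximum allowed gap between consecutive HSPs, keeps the smallest e-value, and sums the bit scores as in the desired output above. The name merge_hsps and the max_gap parameter are hypothetical.

```python
from collections import defaultdict


def merge_hsps(lines, max_gap=300):
    """Merge HSPs per (query, subject) pair when they lie within max_gap.

    Returns tuples of (query, subject, qstart, qend, evalue, bitscore).
    """
    groups = defaultdict(list)
    for line in lines:
        f = line.rstrip("\n").split("\t")
        if len(f) < 12:
            continue  # skip blank or malformed lines
        # Order the query coordinates in case start > end.
        qstart, qend = sorted((int(f[6]), int(f[7])))
        groups[(f[0], f[1])].append((qstart, qend, float(f[10]), float(f[11])))

    merged = []
    for (query, subject), hsps in groups.items():
        hsps.sort()  # by query start, then query end
        cur = list(hsps[0])
        for qstart, qend, evalue, bits in hsps[1:]:
            if qstart - cur[1] <= max_gap:
                # Overlapping or close enough: extend the current block.
                cur[1] = max(cur[1], qend)
                cur[2] = min(cur[2], evalue)
                cur[3] += bits
            else:
                merged.append((query, subject, *cur))
                cur = [qstart, qend, evalue, bits]
        merged.append((query, subject, *cur))
    return merged
```

Passing an open file handle for blast.txt as lines should give one tuple per merged block. Summed bit scores are only a rough indicator, since overlapping HSPs count the shared region twice.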