7.6 years ago
chenmy2007525
I have a tblastn tabular output with millions of hits. P, Q and R were used as queries. Con, Don and Eon are my contigs, which the queries were searched against. The tblastn output has many HSPs for each query-contig pair. For example, when I ran tblastn with P against Con, I got four HSPs, p1, p2, p3 and p4, and the corresponding translated protein sequences in Con are C1, C2, C3 and C4. How can I join C1, C2, C3 and C4 and remove the overlapping sequence? I want to use the joined protein sequences for sequence alignment.
P ------------------ ---- -
p1---- p2 ------ p4---
p3------
C1---- C2 ---- C4---
C3-----
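To make the question concrete, here is a rough Python sketch of the kind of merging I have in mind. It is only an illustration under several assumptions: the tabular output was produced with `-outfmt "6 qseqid sseqid qstart qend sstart send bitscore qseq sseq"` (so the aligned query and subject strings are in the last two columns), all HSPs of one query-contig pair are collinear and on the same strand, and overlap is defined in query coordinates. The function names `parse_hsps`, `merge_hsps` and `merge_by_pair` are made up for this example.

```python
from collections import defaultdict


def parse_hsps(lines):
    """Group HSPs by (query, contig) from tab-separated lines.

    Assumed column order (hypothetical, set by -outfmt):
    qseqid sseqid qstart qend sstart send bitscore qseq sseq
    """
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        groups[(f[0], f[1])].append({
            "qstart": int(f[2]),
            "qend": int(f[3]),
            "bitscore": float(f[6]),
            "qseq": f[7],   # aligned query, may contain '-'
            "sseq": f[8],   # aligned translated subject, may contain '-'
        })
    return groups


def merge_hsps(hsps):
    """Join the subject parts of HSPs, skipping query positions already covered."""
    # Left to right along the query; on ties the higher-scoring HSP goes first.
    ordered = sorted(hsps, key=lambda h: (h["qstart"], -h["bitscore"]))
    covered_to = 0          # last query position already represented
    pieces = []
    for h in ordered:
        pos = h["qstart"] - 1
        for q, s in zip(h["qseq"], h["sseq"]):
            if q != "-":
                pos += 1    # this column consumes one query residue
            # Keep the subject residue only outside the covered region;
            # alignment gaps in the subject are dropped.
            if pos > covered_to and s != "-":
                pieces.append(s)
        covered_to = max(covered_to, h["qend"])
    return "".join(pieces)


def merge_by_pair(lines):
    """Return {(qseqid, sseqid): joined protein sequence}."""
    return {pair: merge_hsps(hsps) for pair, hsps in parse_hsps(lines).items()}


if __name__ == "__main__":
    example = [
        "P\tCon\t1\t5\t10\t24\t50.0\tMKTAY\tMKSAY",
        "P\tCon\t4\t8\t40\t54\t45.0\tAYIAK\tAYLAK",
    ]
    for pair, seq in merge_by_pair(example).items():
        print(pair, seq)
```

In this sketch the overlapping part is always taken from the HSP that starts first on the query, and regions of the query with no HSP are simply skipped, so the pieces are concatenated directly. It does not check contig coordinates, frames or strand, and stop codons (`*`) in the translated subject are kept as they are.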