Remove duplicated entries from a BLAST result file
8.9 years ago
Kumar ▴ 170

How can I remove duplicated entries from a command-line BLAST results file?

Thanks,

blast alignment • 3.7k views

Define "duplicated entries". HSPs? Hits?


Thanks.

Could you describe this in command format?


Sure.

echo "What do you mean by \"duplicated entries\"? Are you referring to HSPs, hits, or something else entirely?"

Yes, I mean duplicate hits, for example duplicate entries such as >gi|10001| and >gi|1000|.


Are these duplicate hits seen for the same query? If they are seen for different queries, they are not duplicates - just asking so we are closer to defining the problem.


Yes, these are duplicate hits generated by the same hits. Is there any way in command-line BLAST to remove all duplicated hits of this type from the result file?


I'm sorry, duplicate hits by same hits? Are you referring to multiple HSPs perhaps? Remember, each query can have multiple hits. Each hit can have multiple HSPs. Can you give us a sample maybe?

I don't think BLAST would give you multiple hits where the subject as well as the query are the same sequence. BLAST would just report them as a single hit with multiple HSPs, unless we are dealing with repeat sequences, in which case RepeatMasker could be a prudent pre-processing step.
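
If the "duplicates" turn out to be several HSPs for the same query and subject, one possible approach is to keep only the best-scoring row per query/subject pair. The following is a minimal sketch, not a definitive solution. It assumes the results are in tabular format (`-outfmt 6`) with the default 12 columns, so that column 1 is the query ID, column 2 the subject ID and column 12 the bit score. The thread does not say which output format was used, so this is an assumption, and the function name `best_hsp_per_pair` and the sample rows are made up for illustration.

```python
import csv
import io


def best_hsp_per_pair(rows):
    """Keep the highest-bit-score row for each (query, subject) pair.

    Assumes default BLAST tabular columns (-outfmt 6):
    row[0] = query ID, row[1] = subject ID, row[11] = bit score.
    """
    best = {}
    for row in rows:
        key = (row[0], row[1])
        score = float(row[11])
        if key not in best or score > float(best[key][11]):
            best[key] = row
    # dicts preserve insertion order, so pairs come out in first-seen order
    return list(best.values())


# Hypothetical sample data: q1 has two HSPs against gi|10001|.
sample = (
    "q1\tgi|10001|\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-50\t180\n"
    "q1\tgi|10001|\t90.0\t50\t5\t0\t120\t169\t300\t349\t1e-10\t60.2\n"
    "q1\tgi|1000|\t95.0\t80\t4\t0\t1\t80\t10\t89\t1e-30\t120\n"
    "q2\tgi|10001|\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-52\t185\n"
)

rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))
kept = best_hsp_per_pair(rows)
for row in kept:
    print("\t".join(row))
```

Note that the same subject appearing under different queries (`q1` and `q2` above) is kept, since those are not duplicates. To run this on a real file, replace `io.StringIO(sample)` with an open file handle.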


Hey RamRS.

I have an HSP table. There are some cases in which queries of different lengths hit regions of the subject that overlap. The table has a column with the coordinates per HSP. I am interested in removing the redundant HSPs. Is there a script available that I can use? Thanks in advance for your reply.
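
One way to approach this is a greedy filter on the subject coordinates: within each subject, go through the HSPs from best to worst score and drop any HSP that mostly overlaps one already kept. The sketch below is only an illustration under stated assumptions. "Redundant" is taken to mean that at least half of an HSP's subject interval is covered by a single higher-scoring kept HSP on the same subject; the 0.5 threshold, the function name `remove_overlapping_hsps` and the dictionary keys (named after the BLAST tabular fields `sseqid`, `sstart`, `send`, `bitscore`) are choices made here, not something given in the thread. Start and end are sorted first because minus-strand hits are reported with `sstart` greater than `send`.

```python
from collections import defaultdict


def remove_overlapping_hsps(hsps, min_overlap=0.5):
    """Greedily drop HSPs that overlap a better-scoring HSP on the same subject.

    An HSP is dropped when a single already-kept HSP covers at least
    `min_overlap` of its subject interval (an assumed definition of redundancy).
    """
    by_subject = defaultdict(list)
    for hsp in hsps:
        by_subject[hsp["sseqid"]].append(hsp)

    kept = []
    for subject_hsps in by_subject.values():
        accepted = []  # (start, end) intervals kept so far for this subject
        ranked = sorted(subject_hsps, key=lambda h: h["bitscore"], reverse=True)
        for hsp in ranked:
            # Normalise coordinates so that lo <= hi, whatever the strand.
            lo, hi = sorted((hsp["sstart"], hsp["send"]))
            length = hi - lo + 1
            redundant = False
            for a_lo, a_hi in accepted:
                shared = min(hi, a_hi) - max(lo, a_lo) + 1
                if shared > 0 and shared / length >= min_overlap:
                    redundant = True
                    break
            if not redundant:
                accepted.append((lo, hi))
                kept.append(hsp)
    return kept


# Hypothetical HSPs: qB lies entirely inside the region hit by qA.
hsps = [
    {"qseqid": "qA", "sseqid": "s1", "sstart": 100, "send": 300, "bitscore": 350.0},
    {"qseqid": "qB", "sseqid": "s1", "sstart": 150, "send": 250, "bitscore": 180.0},
    {"qseqid": "qC", "sseqid": "s1", "sstart": 400, "send": 280, "bitscore": 200.0},
    {"qseqid": "qD", "sseqid": "s2", "sstart": 120, "send": 220, "bitscore": 90.0},
]

kept = remove_overlapping_hsps(hsps)
for hsp in kept:
    print(hsp["qseqid"], hsp["sseqid"], hsp["sstart"], hsp["send"], hsp["bitscore"])
```

Whether overlap should be judged on subject coordinates, query coordinates or both, and how much overlap counts as redundant, depends on the analysis, so the threshold is worth adjusting to your data.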
