How can I remove duplicated entries from a command-line BLAST results file?
Define "duplicated entries". HSPs? Hits?
Could you describe this in command format?
echo "What do you mean by \"duplicated entries\"? Are you referring to HSPs, hits, or something else entirely?"
Yes, I mean duplicate hits. For example, entries like >gi|10001| and >gi|1000| that show up as duplicates.
Are these duplicate hits seen for the same query? If they are seen for different queries, they are not duplicates - just asking so we are closer to defining the problem.
Yes, these are duplicate hits generated by the same hits. Is there any way in command-line BLAST to remove all duplicated hits of this type from the results file?
I'm sorry, duplicate hits by same hits? Are you referring to multiple HSPs perhaps? Remember, each query can have multiple hits. Each hit can have multiple HSPs. Can you give us a sample maybe?
I don't think BLAST would give you multiple hits where the subject and the query are both the same sequence. BLAST would just report them as a single hit with multiple HSPs, unless we are dealing with repeat sequences, in which case RepeatMasker could be a prudent pre-processing step.
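If the "duplicates" turn out to be several HSP rows for the same query/subject pair, they can be collapsed after the fact. Below is a minimal sketch, assuming the results were written in tabular format (-outfmt 6) with the default twelve columns, so that column 1 is the query ID, column 2 the subject ID, and column 12 the bit score. The function names and file paths are made up for illustration, and keeping the highest-scoring row is only one possible definition of "not a duplicate".

```python
import csv

def best_hit_per_pair(rows):
    """Keep the highest-bitscore row for each (query, subject) pair.

    Assumes default -outfmt 6 column order: row[0] = qseqid,
    row[1] = sseqid, row[11] = bitscore.
    """
    best = {}
    for row in rows:
        key = (row[0], row[1])
        score = float(row[11])
        if key not in best or score > float(best[key][11]):
            best[key] = row
    # dicts keep insertion order, so pairs come out in first-seen order
    return list(best.values())

def dedupe_file(in_path, out_path):
    """Read a tab-separated BLAST table and write the collapsed version."""
    with open(in_path, newline="") as fin, open(out_path, "w", newline="") as fout:
        reader = csv.reader(fin, delimiter="\t")
        # skip blank lines and the '#' comment lines that -outfmt 7 adds
        rows = [r for r in reader if r and not r[0].startswith("#")]
        writer = csv.writer(fout, delimiter="\t", lineterminator="\n")
        writer.writerows(best_hit_per_pair(rows))
```

Usage would be something like dedupe_file("blast_results.tsv", "blast_results.dedup.tsv"), where both file names are placeholders. If the table was produced with a custom column list, the indices above need adjusting.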
I have an HSP table. There are some cases in which queries of different lengths hit overlapping regions of the subject. The table has a column with the coordinates of each HSP. I am interested in removing the redundant HSPs. Is there a script available that I can use? Thanks in advance for your reply.
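I'm not aware of a standard script for this, but one common approach is a greedy filter: sort the HSPs by bit score, then keep an HSP only if it does not overlap too much with a better HSP already kept on the same subject. Here is a rough sketch under some assumptions: each HSP is a dict with the hypothetical keys 'subject', 'sstart', 'send' and 'bitscore', coordinates are 1-based and inclusive, and "redundant" means the overlap covers more than half of the shorter HSP. The 0.5 threshold is arbitrary, so tune it to your data.

```python
from collections import defaultdict

def overlap_fraction(a, b):
    """Overlap length divided by the length of the shorter interval.

    a and b are (low, high) tuples with inclusive coordinates.
    """
    inter = min(a[1], b[1]) - max(a[0], b[0]) + 1
    if inter <= 0:
        return 0.0
    shorter = min(a[1] - a[0], b[1] - b[0]) + 1
    return inter / shorter

def remove_redundant_hsps(hsps, max_overlap=0.5):
    """Greedy filter: best-scoring HSPs first, drop later ones that overlap.

    The dict keys used here are illustrative names, not a BLAST format.
    """
    kept = defaultdict(list)  # subject ID -> intervals already kept
    result = []
    for h in sorted(hsps, key=lambda h: h["bitscore"], reverse=True):
        # minus-strand HSPs are reported with sstart > send, so normalize
        lo, hi = sorted((h["sstart"], h["send"]))
        intervals = kept[h["subject"]]
        if all(overlap_fraction((lo, hi), k) <= max_overlap for k in intervals):
            intervals.append((lo, hi))
            result.append(h)
    return result
```

Note that this compares HSPs across all queries hitting the same subject, which matches the situation described above. If you only want to compare HSPs from the same query, group by (query, subject) instead of by subject alone.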