Question: remove duplicated entries from BLAST result file
3.7 years ago, Manoj (Canada) wrote:

How can I remove duplicated entries from a command-line BLAST results file?

Thanks,

blast alignment • 1.8k views

Define "duplicated entries". HSPs? Hits?

written 3.7 years ago by Pierre Lindenbaum

Thanks.

Could you describe this in command format?

written 3.7 years ago by Manoj

Sure.

echo "What do you mean by \"duplicated entries\"? Are you referring to HSPs, hits, or something else entirely?"
modified 3.7 years ago • written 3.7 years ago by RamRS

Yes, I mean duplicate hits. For example, duplicate entries such as >gi|10001|, >gi|1000|.

written 3.7 years ago by Manoj

Are these duplicate hits seen for the same query? If they are seen for different queries, they are not duplicates - just asking so we are closer to defining the problem.

written 3.7 years ago by RamRS

Yes, these are duplicate hits generated by the same hits. Is there any way with command-line BLAST to remove all duplicated hits of this type from the results file?

written 3.7 years ago by Manoj

I'm sorry, duplicate hits by same hits? Are you referring to multiple HSPs perhaps? Remember, each query can have multiple hits. Each hit can have multiple HSPs. Can you give us a sample maybe?

I don't think BLAST would give you multiple hits where the subject as well as the query are the same sequence. BLAST would just report them as a single hit with multiple HSPs, unless we are dealing with repeat sequences, in which case RepeatMasker could be a prudent pre-processing step.
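
If the "duplicates" do turn out to be multiple HSPs for the same query/subject pair, one possible approach is to post-process the results and keep only the best-scoring row per pair. The sketch below is only an illustration under assumptions not stated in this thread: it assumes the results were written in BLAST+ tabular format (-outfmt 6) with the default 12 columns (query id first, subject id second, bit score last). The function name best_hsp_per_pair and the sample rows are made up for the example.

```python
import csv
import io


def best_hsp_per_pair(lines):
    """Keep the highest-bit-score row for each (query, subject) pair.

    Assumes BLAST+ tabular output (-outfmt 6) with the default columns:
    column 1 = query id, column 2 = subject id, column 12 = bit score.
    """
    best = {}
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines and comment lines (e.g. from -outfmt 7).
        if not row or row[0].startswith("#"):
            continue
        key = (row[0], row[1])
        score = float(row[11])
        if key not in best or score > float(best[key][11]):
            best[key] = row
    # Pairs come back in the order they were first seen.
    return list(best.values())


# Hypothetical sample rows, invented purely for illustration.
sample = (
    "q1\tgi|10001|\t98.5\t200\t3\t0\t1\t200\t5\t204\t1e-100\t350\n"
    "q1\tgi|10001|\t90.0\t50\t5\t0\t210\t259\t300\t349\t1e-10\t60.2\n"
    "q1\tgi|1000|\t85.0\t180\t27\t0\t1\t180\t10\t189\t1e-50\t200\n"
    "q2\tgi|10001|\t99.0\t150\t1\t0\t1\t150\t1\t150\t1e-80\t280\n"
)

kept = best_hsp_per_pair(io.StringIO(sample))
for row in kept:
    print("\t".join(row))
```

Here the second q1 vs gi|10001| row is dropped because it is a lower-scoring HSP of the same pair, while q2 vs gi|10001| is kept because it belongs to a different query. To run it on a real file, pass an open file handle instead of the io.StringIO sample.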

written 3.7 years ago by RamRS