Are There Local Aligners Like BLAST That Work On Unicode Characters?
13.1 years ago
fubioinf ▴ 30

I want to use a BLAST-like local aligner to find common substrings between ordinary human-readable texts, e.g. social media data. A normal plagiarism finder won't do because the intention is to align gigabytes to terabytes of data.

Hence, does anybody know of an NGS local aligner that works on Unicode?

blast alignment • 2.0k views
13.1 years ago

I think you are looking at the wrong tools and the wrong domain. Text similarity search is a subdomain of its own, with countless tools and techniques that work far better than trying to align sentences. If you have not found a tool that works on terabytes of data, it just means you have not looked exhaustively enough.

You should be looking for a tool that was specifically designed to do this; BLAST will likely not produce results that are meaningful or easy to interpret. Sukhdeep's link already has some pointers.

Try Citeseer as a search engine to find what you are looking for: http://citeseerx.ist.psu.edu/
