Question: String matching algorithms of biological sequences
markic0001 wrote, 10 months ago:

Hi,

I would like to test the accuracy and speed of some basic string matching algorithms on biological sequences. Where can I find a good library (Python, C, C#, ... whatever) with implementations of string matching algorithms, or a web service that offers them? Do you have anything that would help me, any advice, ...?

sequence gene genome • 711 views

What's wrong with strstr or grep?

Reply written 10 months ago by kloetzl

Nothing, but I need more algorithms so that I can take a scientific approach and compare them on different data sets.

Reply written 10 months ago by markic0001

I predict it will be hard to beat strstr or pcmpestri unless you do some precomputation on the haystack (suffix tree, etc.).
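A comparison like this could be set up with a small timing harness. The following is only a minimal Python sketch, assuming the built-in str.find (implemented in C in CPython) stands in for strstr as the baseline. The names find_all_builtin, find_all_naive, random_dna and benchmark are hypothetical, and uniformly random DNA is a stand-in for real data sets.

```python
import random
import timeit


def find_all_builtin(text, pattern):
    """All (possibly overlapping) match positions, using str.find as the baseline."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def find_all_naive(text, pattern):
    """Brute-force matcher: compare the pattern at every alignment."""
    n, m = len(text), len(pattern)
    return [i for i in range(n - m + 1) if text[i:i + m] == pattern]


def random_dna(length, seed=0):
    """Hypothetical test data: uniformly random A/C/G/T, reproducible via seed."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


def benchmark(matchers, text, pattern, repeats=3):
    """Return {name: best wall-clock seconds}; also check accuracy against the baseline."""
    expected = find_all_builtin(text, pattern)
    timings = {}
    for name, matcher in matchers.items():
        # Accuracy check: every matcher must report exactly the baseline positions.
        assert matcher(text, pattern) == expected, f"{name} disagrees with baseline"
        runs = timeit.repeat(lambda: matcher(text, pattern), number=1, repeat=repeats)
        timings[name] = min(runs)
    return timings


if __name__ == "__main__":
    genome = random_dna(200_000)
    motif = genome[1000:1012]  # a 12-mer that is guaranteed to occur
    matchers = {"str.find": find_all_builtin, "naive": find_all_naive}
    for name, seconds in benchmark(matchers, genome, motif).items():
        print(f"{name:10s} {seconds:.4f} s")
```

Any other matcher with the same (text, pattern) signature can be added to the dictionary, and random_dna can be swapped for sequences read from real files.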

Reply written 10 months ago by kloetzl

Do you have a Python library?

Reply written 10 months ago by markic0001
Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote, 10 months ago:

EXACT STRING MATCHING ALGORITHMS / Christian Charras - Thierry Lecroq : http://www-igm.univ-mlv.fr/~lecroq/string/

The page lists these algorithms: Brute Force, Deterministic Finite Automaton, Karp-Rabin, Shift Or, Morris-Pratt, Knuth-Morris-Pratt, Simon, Colussi, Galil-Giancarlo, Apostolico-Crochemore, Not So Naive, Boyer-Moore, Turbo BM, Apostolico-Giancarlo, Reverse Colussi, Horspool, Quick Search, Tuned Boyer-Moore, Zhu-Takaoka, Berry-Ravindran, Smith, Raita, Reverse Factor, Turbo Reverse Factor, Forward Dawg Matching, Backward Nondeterministic Dawg Matching, Backward Oracle Matching, Galil-Seiferas, Two Way, String Matching on Ordered Alphabets, Optimal Mismatch, Maximal Shift, Skip Search, KMP Skip Search, Alpha Skip Search

and their implementations in C...
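To give a feel for one entry on that list, here is a minimal Python sketch of the Horspool algorithm. It is written from the general description of the technique, not ported from the C code on that site, and the function name horspool_search is hypothetical.

```python
def horspool_search(text, pattern):
    """Return all start positions of pattern in text using Horspool's algorithm."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Bad-character table: for every character in pattern[:-1], the distance
    # from its rightmost occurrence to the end of the pattern.
    # Characters that do not occur there shift the window by the full length m.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    positions = []
    i = 0
    while i <= n - m:
        if text[i:i + m] == pattern:
            positions.append(i)
        # Shift according to the text character under the last pattern position.
        i += shift.get(text[i + m - 1], m)
    return positions


if __name__ == "__main__":
    print(horspool_search("GCATCGCAGAGAGTATACAGTACG", "GCAGAGAG"))  # [5]
```

On a four-letter DNA alphabet the shifts tend to be short, which is one reason the relative ranking of these algorithms can differ between DNA and natural-language text, so it is worth measuring on the actual data.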


Thought this was spam (Dawg matching algorithm?) before I saw the embedded link :-)

Reply written 10 months ago by genomax

Do these algorithms work properly?

Reply written 10 months ago by markic0001
sacha (France) wrote, 10 months ago:

Have a look at the SeqAn C++ library: http://seqan.readthedocs.io/en/master/

For instance, it uses 2 bits per nucleotide instead of the 8 bits used by a plain-text sequence.
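The 2-bit idea can be illustrated in a few lines of Python. This is only a sketch of the general technique, not SeqAn's actual memory layout or API. The names CODE, BASES, pack_2bit and unpack_2bit are hypothetical, and the sketch assumes the sequence contains only A, C, G and T (no N or other ambiguity codes).

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed mapping, 2 bits per base
BASES = "ACGT"


def pack_2bit(seq):
    """Pack an A/C/G/T string into bytes, four bases per byte."""
    packed = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq):
        # Base i lives in byte i // 4, at bit offset 2 * (i % 4).
        packed[i // 4] |= CODE[base] << (2 * (i % 4))
    return bytes(packed)


def unpack_2bit(packed, length):
    """Recover the first `length` bases from a packed byte string."""
    return "".join(
        BASES[(packed[i // 4] >> (2 * (i % 4))) & 3] for i in range(length)
    )


if __name__ == "__main__":
    seq = "ACGTACGTA"
    packed = pack_2bit(seq)
    print(len(seq), "bases stored in", len(packed), "bytes")
    assert unpack_2bit(packed, len(seq)) == seq
```

The sequence length has to be stored separately, because the last byte may be only partly filled.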


Thanks. I will try this. It seems good and easy to use.

Reply written 10 months ago by markic0001
chen (OpenGene) wrote, 10 months ago:

You may be interested in taking a look at the source code of MutScan.

MutScan is based on a DNA sequence string matching algorithm, and it can detect and visualize target mutations by scanning FastQ files directly.


Thanks. If you know of a good library, please send it to me.

Reply written 10 months ago by markic0001 (modified 10 months ago)

Sorry, but I don't have a library for that.

Reply written 10 months ago by chen