Question: String matching algorithms of biological sequences
0
markic00010 wrote:

Hi,

I would like to test the accuracy and speed of some basic string matching algorithms on biological sequences. Where can I find a good library (Python, C, C#, ... whatever) with implementations of string matching algorithms, or a web service that offers them? Do you have anything that would help me, any advice, ...?

sequence gene genome • 252 views
modified 25 days ago • written 4 weeks ago by markic00010

What's wrong with strstr or grep?

written 4 weeks ago by kloetzl810

Nothing, but I need more algorithms, with a scientific approach, so that I can compare them on different data sets.

written 26 days ago by markic00010
1

I predict it will be hard to beat strstr or pcmpestri unless you do some precomputation on the haystack (suffix tree etc.).
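To make "precomputation on the haystack" concrete, here is a minimal Python sketch that uses a suffix array (a simpler relative of the suffix tree) rather than a suffix tree itself. The function names `build_suffix_array` and `find_all` are made up for this illustration, and the construction is the naive sort-based one, so it is only suitable for short test sequences, not whole genomes.

```python
def build_suffix_array(text):
    """Precompute the haystack: suffix start positions in lexicographic order.

    Naive construction (sorts full suffixes), so it is slow and memory-hungry
    on long inputs. Real tools use linear-time or induced-sorting builders.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_all(text, sa, pattern):
    """Return all start positions of pattern in text, in increasing order.

    Two binary searches over the suffix array locate the block of suffixes
    whose first len(pattern) characters equal the pattern.
    """
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])
```

The array is built once per haystack; after that each query costs O(m log n) character comparisons, which is where the advantage over a plain scan would come from when many patterns are searched in the same sequence.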

written 25 days ago by kloetzl810

Do you have a Python library?

written 25 days ago by markic00010
2
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum104k wrote:

EXACT STRING MATCHING ALGORITHMS / Christian Charras - Thierry Lecroq: http://www-igm.univ-mlv.fr/~lecroq/string/

"Brute Force algorithm, Deterministic Finite Automaton algorithm, Karp-Rabin algorithm, Shift Or algorithm, Morris-Pratt algorithm, Knuth-Morris-Pratt algorithm, Simon algorithm, Colussi algorithm, Galil-Giancarlo algorithm, Apostolico-Crochemore algorithm, Not So Naive algorithm, Boyer-Moore algorithm, Turbo BM algorithm, Apostolico-Giancarlo algorithm, Reverse Colussi algorithm, Horspool algorithm, Quick Search algorithm, Tuned Boyer-Moore algorithm, Zhu-Takaoka algorithm, Berry-Ravindran algorithm, Smith algorithm, Raita algorithm, Reverse Factor algorithm, Turbo Reverse Factor algorithm, Forward Dawg Matching algorithm, Backward Nondeterministic Dawg Matching algorithm, Backward Oracle Matching algorithm, Galil-Seiferas algorithm, Two Way algorithm, String Matching on Ordered Alphabets algorithm, Optimal Mismatch algorithm, Maximal Shift algorithm, Skip Search algorithm, KMP Skip Search algorithm, Alpha Skip Search algorithm"

and their implementations in C...
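Since a Python version was asked for elsewhere in the thread, here is a rough Python sketch of one algorithm from the list, Horspool. It is written from the general description of the algorithm and is not a port of the C code on the linked site. `horspool_search` and `reference_search` are names chosen for this example. The second function is a plain `str.find` loop that can serve as a baseline for checking that results agree.

```python
def horspool_search(text, pattern):
    """Return all start positions of pattern in text (overlapping hits included)."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    # Bad-character shift table built from every pattern position except the
    # last one. Later positions overwrite earlier ones, so each character maps
    # to the distance from its rightmost such occurrence to the pattern end.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}

    hits = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Shift by the table entry for the text character aligned with the
        # last pattern position; characters not in the table shift by m.
        pos += shift.get(text[pos + m - 1], m)
    return hits


def reference_search(text, pattern):
    """Baseline built on str.find, used only to cross-check other functions."""
    if not pattern:
        return []
    hits = []
    pos = text.find(pattern)
    while pos != -1:
        hits.append(pos)
        pos = text.find(pattern, pos + 1)
    return hits
```

Comparing the two functions on random A/C/G/T strings is one simple way to test accuracy before timing anything. For speed comparisons in pure Python, keep in mind that interpreter overhead will likely dominate, so the ranking may differ from what the C implementations show.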

written 27 days ago by Pierre Lindenbaum104k

Thought this was spam (Dawg matching algorithm?) before I saw the embedded link :-)

written 27 days ago by genomax42k

Do these algorithms work properly?

written 26 days ago by markic00010
1
France
sacha1.0k wrote:

Have a look at the SeqAn C++ library: http://seqan.readthedocs.io/en/master/

For instance, it uses 2 bits per nucleotide instead of the 8 bits used by a plain-text sequence.
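The 2-bit idea can be illustrated with a small Python sketch. This is not SeqAn's API or its actual memory layout. The `pack` and `unpack` names, the A=0, C=1, G=2, T=3 coding and the bit order within a byte are all assumptions made for the example, and ambiguous bases such as N are not handled (they raise a `KeyError`).

```python
# Assumed coding for this sketch: two bits per base.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq):
    """Pack an A/C/G/T string into bytes, four bases per byte."""
    out = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq):
        # Base i occupies bits 2*(i % 4) and 2*(i % 4) + 1 of byte i // 4.
        out[i // 4] |= CODE[base] << (2 * (i % 4))
    return bytes(out)


def unpack(packed, length):
    """Recover the first `length` bases from a packed byte string."""
    return "".join(
        BASES[(packed[i // 4] >> (2 * (i % 4))) & 3] for i in range(length)
    )
```

The sequence length has to be stored separately, because the last byte may be only partly filled. A 16-base sequence takes 4 bytes packed versus 16 bytes as plain text.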

written 28 days ago by sacha1.0k

Thanks. I will try this. It seems good and easy to use.

written 26 days ago by markic00010
1
OpenGene
chen1.5k wrote:

You may be interested in taking a look at the source code of MutScan.

MutScan is based on a DNA sequence string matching algorithm, and it can detect and visualize target mutations by scanning FASTQ files directly.

written 27 days ago by chen1.5k

Thanks. If you know of any good library, please send it to me.

modified 26 days ago • written 26 days ago by markic00010

Sorry, but I don't have a library for that.

written 25 days ago by chen1.5k