Tool: CBioInfCpp.h, a C++ lib containing some functions for bioinformatics
2
10 months ago by
Russian Federation
chernouhov sergey40 wrote:

Dear Sirs.

Though I am not a professional programmer, bioinformatics is a very interesting interdisciplinary field for me.

As I see it, Python is the "standard language" in this field.

But when I solved problems at rosalind.info, I used C++. As a result, a "lib of some functions" was born.

The lib contains 3 groups of functions. The first is input-output functions (to read and write vectors, matrices and graphs from or to a file with a single command, as in Python).

The second group is "Working with strings". It contains functions ranging from computing GC-content and Edit Distance to finding all mutated strings in a given one.
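For readers who have not met these two tasks, here is a minimal C++ sketch of GC-content and Edit Distance. It is not taken from CBioInfCpp.h: the names gcContent, editDistance and checkExamples are hypothetical, and unit costs for insertion, deletion and substitution are assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Fraction of G and C characters in s (0.0 for an empty string).
double gcContent(const std::string& s) {
    if (s.empty()) return 0.0;
    std::size_t gc = 0;
    for (char c : s) {
        if (c == 'G' || c == 'C' || c == 'g' || c == 'c') ++gc;
    }
    return static_cast<double>(gc) / static_cast<double>(s.size());
}

// Levenshtein edit distance with unit costs, keeping only two rows of the table.
std::size_t editDistance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0u : 1u;
            cur[j] = std::min({prev[j - 1] + cost, prev[j] + 1, cur[j - 1] + 1});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// Quick self-check of both helpers on tiny inputs.
void checkExamples() {
    assert(gcContent("ATGC") == 0.5);
    assert(editDistance("kitten", "sitting") == 3);
}
```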

The third is "Working with graphs". A data structure called "Adjacency vector" is suggested. In the general case, vertices may be labelled with negative integers, and graphs may have multiple loops and edges. Some functions such as Eulerian Cycle, Path finding, topological sorting etc. are implemented.
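To make the graph part more concrete, here is a small sketch of a topological sort over a flat list of edges. The layout {from0, to0, from1, to1, ...} is only an assumption made for this illustration and may differ from the library's real "Adjacency vector"; the name topoSort is hypothetical. Vertex labels are looked up through std::map, so negative labels and repeated edges are fine.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <vector>

// Assumed flat layout: {from0, to0, from1, to1, ...}.
// Returns a topological order (Kahn's algorithm), or an empty vector
// if the graph contains a cycle (or has no vertices at all).
std::vector<int> topoSort(const std::vector<int>& edges) {
    assert(edges.size() % 2 == 0);
    std::map<int, std::vector<int>> next;  // successors of each vertex
    std::map<int, int> indegree;           // number of incoming edges
    for (std::size_t i = 0; i + 1 < edges.size(); i += 2) {
        next[edges[i]].push_back(edges[i + 1]);
        indegree[edges[i]];        // registers the source vertex with 0 if new
        ++indegree[edges[i + 1]];
    }
    std::queue<int> ready;
    for (const auto& entry : indegree) {
        if (entry.second == 0) ready.push(entry.first);
    }
    std::vector<int> order;
    while (!ready.empty()) {
        const int v = ready.front();
        ready.pop();
        order.push_back(v);
        for (int w : next[v]) {
            if (--indegree[w] == 0) ready.push(w);
        }
    }
    if (order.size() != indegree.size()) order.clear();  // a cycle was left over
    return order;
}
```

With the edges {-3, 0, 0, 5, -3, 5, -3, 5} the function returns -3, 0, 5; the repeated edge -3 -> 5 is simply counted twice in the in-degree.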

Could it be useful for some tasks?

By the way, which algorithmic functions and problems should be included, or maybe solved, here?

I understand that this lib lacks a great many features. For example, it cannot yet work with bioinformatics databases, but I cannot implement that by myself alone.

The freely distributed source code and info are here: https://drive.google.com/open?id=1FQwsQm2kG_nTO45ab0yj52xtp6_B4IB2

(This is a link to a directory, not to a file; it contains the source code file and readme files.)

My profile at rosalind.info: http://rosalind.info/users/chernouhov/

Best regards, Chernouhov Sergey

Modified on 03 May 2019:

Added to GitHub, as users asked for it: https://github.com/chernouhov/CBioInfCpp-0-

But:

  • GitHub is a new experience for me, so I probably make some mistakes there.
  • Why is GitHub the only trusted place? We may be free, but only there?

I declare that I do NOT clearly understand everything about GitHub, so nowadays I use it only as file hosting, since it is such a popular place.

Best regards, Chernouhov Sergey

tool c++ • 994 views
modified 6 weeks ago • written 10 months ago by chernouhov sergey40
3

Hi, maybe you're interested in contributing to the c++ SeqAn library?

https://www.seqan.de/

written 10 months ago by colindaven2.0k

Hi.

Thanks. It's a great idea. Why not?

By the way, do you use the SeqAn library, or maybe you take part in its development?

modified 10 months ago • written 10 months ago by chernouhov sergey40

Added to GitHub, as users asked for it: https://github.com/chernouhov/CBioInfCpp-0-

But:

  • GitHub is a new experience for me, so I probably make some mistakes there.
  • Why is GitHub the only trusted place? We may be free, but only there?

I declare that I do NOT clearly understand everything about GitHub, so nowadays I use it only as file hosting, since it is such a popular place.

modified 10 months ago • written 10 months ago by chernouhov sergey40
2

It’s not the only trusted place, it's just become the most common/most well known. You can also see the source code directly, versus having to trust a file blindly.

Bitbucket, SourceForge, GitLab etc. are all still used, just to varying and lesser extents.

written 10 months ago by Joe16k

Well, we DID talk about GitHub but we DID NOT talk about the lib itself. Nowadays it is hosted on GitHub. But that is not its key feature, as is true for every item, both good and bad, isn't it?

written 9 months ago by chernouhov sergey40
2

There is no need to post the same comment multiple times (I know these threads can get a little disorganised over time, but once is enough).

I'm not sure I understand your question? There's nothing more to say regarding the lib or github as far as I can see? You've uploaded it in a nice, visible place. If people want to use it, they'll use it.

written 9 months ago by Joe16k

GitHub is not just a file hosting site, it hosts and helps manage Git projects. By Git projects, I mean Git repositories with issues, Pull Requests, etc. Git repositories are essentially version-controlled code directories allowing for concurrent development and change tracking along with a host of other amazing features. If you're new to Git, you should definitely learn it as it will better your approach to software development.

written 10 months ago by RamRS25k
2

I would consider putting the code on Github, rather than distributing it as a google link. People are often wary of downloading code from behind random links without first being able to inspect the source.

written 10 months ago by Joe16k

Hi. Thanks. It is a good idea and I plan to do it a little later (as I haven't used GitHub yet).

But I must confess: as the lib CBioInfCpp currently consists of only one header file (as free source code), is it not so bad to use Google Drive too? There are also 2 files, pdf and rtf, that contain the same description of the functions of CBioInfCpp in different formats. One may use either of them, depending on the preferred format.

written 10 months ago by chernouhov sergey40
3

Well, look at it this way: I haven't clicked on your link yet, even though I trust you, because I don't know if it will take me to a page, or will start a download immediately. If it starts a download immediately, I don't know if I'm getting a zip file, naked source code, or something masquerading as either.

If you want to contribute to projects, or have people contribute to improving your code, github (and its friends) is absolutely the way to go. To get started with github you need only three commands really: git pull, git commit, git push. Everything else is a bonus ;)

There are plenty of good youtube tutorials etc to get you going.

written 10 months ago by Joe16k

I'll look into it.

But there are no zip files or immediate downloads; it is a link to a directory.

modified 10 months ago • written 10 months ago by chernouhov sergey40

Sure, but it's hard to tell that from the link alone, so people are unlikely to click it.

written 10 months ago by Joe16k
1

I also don't think Google Drive is appropriate. Software in the days of Google Code, SourceForge etc. was (and still in some cases is) far more poorly documented, opaque, and unversioned. As a developer I think you'll enjoy GitHub very much.

written 10 months ago by colindaven2.0k
1

What jrj.healey said was the first thought to cross my mind. I'm not clicking on a google drive link. I really want to look at the code, the code structure and a README before I decide if something is worth a download.

written 10 months ago by RamRS25k

As I see it, why not try to implement any tool, at least CBioInfCpp.h?

Maybe there are some interesting problems for strings, graphs, etc.?

As well, why not use it for input/output when solving other tasks?

written 9 months ago by chernouhov sergey40

Please don't add answers unless you are responding to the opening post. This is just a comment so I have moved it.

That said, I don't really understand your comment - what are you asking?

As I said before, you've already uploaded your code, if people find it, and want to use it, they will - there's nothing more to be done...

written 9 months ago by Joe16k

That is my language trouble, I see.

I mean there may be some problems to solve, and it is interesting for me to solve such problems, whether using this lib or not.

modified 9 months ago • written 9 months ago by chernouhov sergey40
1
10 months ago by
Russian Federation
chernouhov sergey40 wrote:

Added to GitHub, as users asked for it: https://github.com/chernouhov/CBioInfCpp-0-

But:

  • GitHub is a new experience for me, so I probably make some mistakes there.
  • Why is GitHub the only trusted place? We may be free, but only there?

I declare that I do NOT clearly understand everything about GitHub, so nowadays I use it only as file hosting, since it is such a popular place.

modified 10 months ago • written 10 months ago by chernouhov sergey40

GitHub is now owned by Microsoft. I'd prefer it not to be owned by a big tech company, but that's life. There are alternatives such as GitLab. Nice one for making your (first?) steps into git, I doubt you'll regret it.

written 9 months ago by colindaven2.0k
0
8 months ago by
Russian Federation
chernouhov sergey40 wrote:

23/06/2019 update:

  • The group of functions "FindIn" has been updated.
  • Functions PairVectorCout and PairVectorFout have been updated.
  • The groups of functions "GraphCout" and "GraphFout" have been added. So now one may "cout/fout" a graph given as an Adjacency vector to the screen or to a file line by line: one edge per line.
  • Function "StrToCircular" has been added for finding the circular string of minimal length for the given one.
  • The group of functions "MaxFlowGraph" has been added to help find the maximal flow, the paths of the maximal flow network and the max-flow min-cut in a graph.
  • A data structure "Adjacency map" (a modification of the "Adjacency vector" data structure for storing graphs) has been added. Adjacency map allows quicker access to an edge’s weight, but it can’t work with multiple edges.
  • Functions AdjVectorToAdjMap and AdjMapToAdjVector for converting Adjacency vector to Adjacency map and back have been added. Note that multiple edges will be joined together.
  • Function TandemRepeatsFinding has been added. It is intended for finding tandem repeats in a given string, which may be useful for solving problems related to Microsatellite Instability etc.
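As an illustration of the vector-to-map conversion and the "one edge per line" output, here is a rough sketch. It is not the library's code: the type EdgeMap and the functions toEdgeMap and edgesToText are hypothetical, the flat layout {from, to, weight, ...} is assumed, and joining parallel edges by summing their weights is only one possible reading of "joined together".

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical map form: (from, to) -> weight. It cannot hold parallel edges.
using EdgeMap = std::map<std::pair<int, int>, int>;

// Assumed flat layout of the input: {from0, to0, weight0, from1, to1, weight1, ...}.
EdgeMap toEdgeMap(const std::vector<int>& triples) {
    assert(triples.size() % 3 == 0);
    EdgeMap result;
    for (std::size_t i = 0; i + 2 < triples.size(); i += 3) {
        // Assumption: parallel edges are joined by summing their weights.
        result[std::make_pair(triples[i], triples[i + 1])] += triples[i + 2];
    }
    return result;
}

// Renders the graph as text, one edge per line: "from to weight".
std::string edgesToText(const EdgeMap& edges) {
    std::ostringstream out;
    for (const auto& entry : edges) {
        out << entry.first.first << ' ' << entry.first.second << ' '
            << entry.second << '\n';
    }
    return out.str();
}
```

The map gives logarithmic-time lookup of an edge's weight by its endpoints, which is the trade-off described above: faster access, but only one entry per pair of vertices.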
written 8 months ago by chernouhov sergey40
0
7 months ago by
Russian Federation
chernouhov sergey40 wrote:

14.07.2019 update:

  • Function CIGAR1 has been added.
  • The groups of functions "GraphCout" and "GraphFout" have been updated (so now one may "cout/fout" a graph given as either an Adjacency vector or an Adjacency map to the screen or to a file line by line: one edge per line).
  • Function EditDistA, an extended version of the function EditDist, has been added (it returns not only the value of the Edit Distance between 2 strings but also one possible version of the alignment itself).
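The idea of returning an alignment together with the distance can be sketched as follows. This is not the EditDistA implementation: the struct Alignment and the function alignByEditDistance are hypothetical names, unit costs are assumed, and '-' is used as the gap symbol. The full table is kept so that one optimal path can be traced back from the bottom-right cell.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Alignment {
    std::size_t distance;  // edit distance between the two inputs
    std::string first;     // first string with '-' inserted at gaps
    std::string second;    // second string with '-' inserted at gaps
};

Alignment alignByEditDistance(const std::string& a, const std::string& b) {
    const std::size_t n = a.size(), m = b.size();
    std::vector<std::vector<std::size_t>> d(n + 1, std::vector<std::size_t>(m + 1, 0));
    for (std::size_t i = 0; i <= n; ++i) d[i][0] = i;
    for (std::size_t j = 0; j <= m; ++j) d[0][j] = j;
    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0u : 1u;
            d[i][j] = std::min({d[i - 1][j - 1] + cost, d[i - 1][j] + 1, d[i][j - 1] + 1});
        }
    }
    Alignment out{d[n][m], "", ""};
    std::size_t i = n, j = m;
    while (i > 0 || j > 0) {  // trace one optimal path back to the origin
        const std::size_t cost = (i > 0 && j > 0 && a[i - 1] == b[j - 1]) ? 0u : 1u;
        if (i > 0 && j > 0 && d[i][j] == d[i - 1][j - 1] + cost) {
            out.first += a[i - 1]; out.second += b[j - 1]; --i; --j;  // match or substitution
        } else if (i > 0 && d[i][j] == d[i - 1][j] + 1) {
            out.first += a[i - 1]; out.second += '-'; --i;            // gap in the second string
        } else {
            out.first += '-'; out.second += b[j - 1]; --j;            // gap in the first string
        }
    }
    std::reverse(out.first.begin(), out.first.end());
    std::reverse(out.second.begin(), out.second.end());
    assert(out.first.size() == out.second.size());
    return out;
}
```

The number of columns in which the two returned strings differ equals the reported distance, which is a convenient sanity check.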
written 7 months ago by chernouhov sergey40
0
6 months ago by
Russian Federation
chernouhov sergey40 wrote:

09.08.2019 update:

  • The group of functions "NBPaths" (for finding maximal non-branching paths in a graph, weighted or not, directed or not) has been added.
  • Functions ConsStringQ1 and ConsStringQ2 for building a consensus string from a given collection of strings according to their quality have been added. Note that, because there was little data for testing, errors may be found here (please notify me if you find any).
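For orientation, maximal non-branching paths can be found roughly as in the sketch below. It covers only the directed, unweighted case and is not the NBPaths code: the function name nonBranchingPaths and the flat edge layout {from0, to0, from1, to1, ...} are assumptions made for this example. A path starts at every vertex that is not "one in, one out" and is extended while the next vertex is; isolated cycles are collected in a second pass.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

std::vector<std::vector<int>> nonBranchingPaths(const std::vector<int>& edges) {
    assert(edges.size() % 2 == 0);
    std::map<int, std::vector<int>> out;  // successors of each vertex
    std::map<int, int> in;                // in-degree of each vertex
    for (std::size_t i = 0; i + 1 < edges.size(); i += 2) {
        out[edges[i]].push_back(edges[i + 1]);
        out[edges[i + 1]];   // make sure every vertex has an entry
        in[edges[i]];
        ++in[edges[i + 1]];
    }
    auto oneInOneOut = [&](int v) { return in.at(v) == 1 && out.at(v).size() == 1; };

    std::vector<std::vector<int>> paths;
    std::set<int> used;  // one-in-one-out vertices already placed on a path
    for (const auto& entry : out) {
        const int v = entry.first;
        if (oneInOneOut(v)) continue;  // paths start only at branching vertices
        for (int w : entry.second) {
            std::vector<int> path{v, w};
            while (oneInOneOut(w)) {
                used.insert(w);
                w = out.at(w).front();
                path.push_back(w);
            }
            paths.push_back(path);
        }
    }
    for (const auto& entry : out) {  // what is left are isolated cycles
        const int start = entry.first;
        if (!oneInOneOut(start) || used.count(start)) continue;
        std::vector<int> cycle{start};
        used.insert(start);
        for (int w = out.at(start).front(); w != start; w = out.at(w).front()) {
            used.insert(w);
            cycle.push_back(w);
        }
        cycle.push_back(start);
        paths.push_back(cycle);
    }
    return paths;
}
```

For the edges 1->2, 2->3, 3->4, 3->5, 6->7, 7->6 this yields the paths 1 2 3, 3 4, 3 5 and the cycle 6 7 6.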
modified 6 months ago • written 6 months ago by chernouhov sergey40
0
5 months ago by
Russian Federation
chernouhov sergey40 wrote:

31.08.2019 update:

  • Function GenRandomUWGraph, which generates a random unweighted graph (as its "Adjacency vector"), has been added.
  • A group of functions has been added for finding the collection of vertices of each strongly connected component of a directed graph and of each connected component of an undirected graph.
  • A group of functions for counting the edge multiplicities of a graph given as an Adjacency vector has been added.
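Two of these tasks are easy to sketch for the undirected, unweighted case. The code below is only an illustration under assumptions, not the library's implementation: the names connectedComponents and edgeMultiplicity are hypothetical, and the graph is assumed to be a flat list {u0, v0, u1, v1, ...}. Strongly connected components of directed graphs are not covered here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <set>
#include <utility>
#include <vector>

// Vertices of each connected component of an undirected graph (breadth-first search).
std::vector<std::vector<int>> connectedComponents(const std::vector<int>& edges) {
    assert(edges.size() % 2 == 0);
    std::map<int, std::vector<int>> adj;
    for (std::size_t i = 0; i + 1 < edges.size(); i += 2) {
        adj[edges[i]].push_back(edges[i + 1]);
        adj[edges[i + 1]].push_back(edges[i]);
    }
    std::set<int> seen;
    std::vector<std::vector<int>> components;
    for (const auto& entry : adj) {
        if (!seen.insert(entry.first).second) continue;  // already in some component
        std::vector<int> component;
        std::queue<int> frontier;
        frontier.push(entry.first);
        while (!frontier.empty()) {
            const int v = frontier.front();
            frontier.pop();
            component.push_back(v);
            for (int w : adj.at(v)) {
                if (seen.insert(w).second) frontier.push(w);
            }
        }
        std::sort(component.begin(), component.end());
        components.push_back(component);
    }
    return components;
}

// How many times each (from, to) pair occurs in the flat edge list, as written.
std::map<std::pair<int, int>, int> edgeMultiplicity(const std::vector<int>& edges) {
    assert(edges.size() % 2 == 0);
    std::map<std::pair<int, int>, int> counts;
    for (std::size_t i = 0; i + 1 < edges.size(); i += 2) {
        ++counts[std::make_pair(edges[i], edges[i + 1])];
    }
    return counts;
}
```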
modified 5 months ago • written 5 months ago by chernouhov sergey40
0
4 months ago by
Russian Federation
chernouhov sergey40 wrote:

19.10.2019:

  • Added the group of functions AdjVectorToAdjMegaMap and AdjMegaMapToAdjVector to convert an Adjacency vector to and from an Adjacency mega-map (i.e. an extended version of the Adjacency map that can contain graphs with different multiple edges).

  • Updated the groups of functions GraphCout and GraphFout to deal with mega-maps.

written 4 months ago by chernouhov sergey40
0
3 months ago by
Russian Federation
chernouhov sergey40 wrote:

03.11.2019

  • The group of functions Num has been updated.
  • Function ScoreStringMatrix, which counts the score (i.e. the total number of mismatches) for a vector of strings, has been added.
  • Function GPPM, which generates a position probability matrix (PPM), has been added. Note that pseudocounts may be used (the formula (Ns+z)/(N+2*z) is implemented).
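To show what the score and the PPM formula compute, here is a small sketch. It is not the code of ScoreStringMatrix or GPPM: the names columnCounts, mismatchScore and positionProbabilityMatrix are hypothetical, the alphabet is assumed to be ACGT, and the score is assumed to count mismatches against the most frequent symbol of each column. The formula (Ns+z)/(N+2*z) is used exactly as quoted above, with Ns the count of a symbol in a column, N the number of strings and z the pseudocount.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Counts = std::vector<std::array<std::size_t, 4>>;
using Ppm = std::vector<std::array<double, 4>>;

// Per-column symbol counts; index 0..3 corresponds to A, C, G, T.
Counts columnCounts(const std::vector<std::string>& motifs) {
    assert(!motifs.empty());
    const std::string alphabet = "ACGT";
    const std::size_t len = motifs.front().size();
    Counts counts(len, std::array<std::size_t, 4>{});
    for (const std::string& m : motifs) {
        assert(m.size() == len);
        for (std::size_t j = 0; j < len; ++j) {
            const std::size_t s = alphabet.find(m[j]);
            assert(s != std::string::npos);
            ++counts[j][s];
        }
    }
    return counts;
}

// Total number of symbols that differ from the most frequent symbol of their column.
std::size_t mismatchScore(const std::vector<std::string>& motifs) {
    std::size_t score = 0;
    for (const auto& column : columnCounts(motifs)) {
        score += motifs.size() - *std::max_element(column.begin(), column.end());
    }
    return score;
}

// PPM with pseudocount z, using (Ns + z) / (N + 2*z) as quoted in the post.
// Note: with four symbols a column then sums to (N + 4*z) / (N + 2*z).
Ppm positionProbabilityMatrix(const std::vector<std::string>& motifs, double z) {
    const Counts counts = columnCounts(motifs);
    const double n = static_cast<double>(motifs.size());
    Ppm ppm(counts.size());
    for (std::size_t j = 0; j < counts.size(); ++j) {
        for (std::size_t s = 0; s < 4; ++s) {
            ppm[j][s] = (static_cast<double>(counts[j][s]) + z) / (n + 2.0 * z);
        }
    }
    return ppm;
}
```

For the strings ACGT, ACGA and ATGT the score is 2 (one mismatch in the second column and one in the fourth).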
written 3 months ago by chernouhov sergey40
0
3 months ago by
Russian Federation
chernouhov sergey40 wrote:

26.11.2019

  • For the functions ConsStringQ1 and ConsStringQ2 (intended for finding a consensus string, with or without taking quality into consideration), the default method is now set to 1.
  • Function JoinOverlapStrings for joining overlapping strings has been added (again, quality may or may not be taken into consideration). So if we need to join the collection 0->ACGT, 1->TGTA, 1->TT, 10->TT, 11->TCA in any way without any additional info, we should set NoQuality = true, Aggregate = false, and get the result: 0->ATGTA, 10->TTC.
  • Function ProfileProbableMer, which finds all most probable j-mers in a given string for a given position probability matrix (PPM), has been added.
  • Function CycleToPath has been added.
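The profile-most-probable search can be sketched as below. It is not the ProfileProbableMer code: the name mostProbablePositions is hypothetical, the alphabet is assumed to be ACGT, and the PPM is assumed to be stored column by column, so that j equals the number of columns. The function returns the start positions of all windows that reach the highest probability.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One column per motif position; index 0..3 corresponds to A, C, G, T.
using Ppm = std::vector<std::array<double, 4>>;

std::vector<std::size_t> mostProbablePositions(const std::string& text, const Ppm& ppm) {
    const std::string alphabet = "ACGT";
    const std::size_t k = ppm.size();
    std::vector<std::size_t> best;
    if (k == 0 || text.size() < k) return best;
    double bestP = -1.0;  // below any real probability, so the first window always wins
    for (std::size_t i = 0; i + k <= text.size(); ++i) {
        double p = 1.0;
        for (std::size_t j = 0; j < k; ++j) {
            const std::size_t s = alphabet.find(text[i + j]);
            assert(s != std::string::npos);
            p *= ppm[j][s];  // probability of this symbol at motif position j
        }
        if (p > bestP) {
            bestP = p;
            best.assign(1, i);   // new maximum: forget earlier positions
        } else if (p == bestP) {
            best.push_back(i);   // tie: keep every position reaching the maximum
        }
    }
    return best;
}
```

Ties are detected with an exact comparison, which works when identical windows are multiplied in the same order; a tolerance may be preferable for real data.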
modified 3 months ago • written 3 months ago by chernouhov sergey40
0
6 weeks ago by
Russian Federation
chernouhov sergey40 wrote:

11.01.2020

  • Added the groups of functions UPGMA_UndirectedGraph and NeighborJoiningUndirectedGraph for generating a tree (as an undirected graph) from a given distance matrix.
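A compact sketch of UPGMA may help readers who have not seen the method. It is not the UPGMA_UndirectedGraph code: the struct TreeEdge and the function upgma are hypothetical, the input is assumed to be a symmetric square matrix, leaves are numbered 0..n-1 and internal nodes get the next free numbers. Neighbor joining is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct TreeEdge {
    int parent;     // id of the newly created internal node
    int child;      // id of a leaf or of an earlier internal node
    double length;  // branch length (difference of node ages)
};

std::vector<TreeEdge> upgma(std::vector<std::vector<double>> d) {
    const std::size_t n = d.size();
    for (const auto& row : d) assert(row.size() == n);
    std::vector<int> id(n);                  // node id of each active cluster
    std::vector<std::size_t> leaves(n, 1);   // number of leaves in each cluster
    std::vector<double> age(n, 0.0);         // height of each cluster's root
    for (std::size_t i = 0; i < n; ++i) id[i] = static_cast<int>(i);
    int nextId = static_cast<int>(n);
    std::vector<TreeEdge> edges;
    while (d.size() > 1) {
        std::size_t a = 0, b = 1;  // closest pair of clusters, a < b
        for (std::size_t i = 0; i < d.size(); ++i)
            for (std::size_t j = i + 1; j < d.size(); ++j)
                if (d[i][j] < d[a][b]) { a = i; b = j; }
        const double newAge = d[a][b] / 2.0;
        edges.push_back({nextId, id[a], newAge - age[a]});
        edges.push_back({nextId, id[b], newAge - age[b]});
        const double total = static_cast<double>(leaves[a] + leaves[b]);
        for (std::size_t k = 0; k < d.size(); ++k) {
            if (k == a || k == b) continue;
            // Size-weighted average of the distances from the two merged clusters.
            const double merged = (d[a][k] * leaves[a] + d[b][k] * leaves[b]) / total;
            d[a][k] = merged;
            d[k][a] = merged;
        }
        id[a] = nextId++;
        leaves[a] += leaves[b];
        age[a] = newAge;
        const auto off = static_cast<std::ptrdiff_t>(b);  // drop cluster b everywhere
        d.erase(d.begin() + off);
        for (auto& row : d) row.erase(row.begin() + off);
        id.erase(id.begin() + off);
        leaves.erase(leaves.begin() + off);
        age.erase(age.begin() + off);
    }
    return edges;
}
```

Each returned edge can be read as an undirected weighted edge of the tree; ties between equally close pairs are broken by taking the first pair found.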
written 6 weeks ago by chernouhov sergey40