Forum: Rust or C++, what to learn after Go for high-performance bioinformatics tools?
shenwei356 (5.5k, China) wrote 9 days ago:

I have been writing Golang code for 8 years. Golang makes it easy and fast to write high-performance applications, and I can tune performance further with the help of its great profiling tools. But it can't directly use SIMD instructions, so it soon hits a performance ceiling. The last resort is writing Golang assembly, which is very hard to learn. Besides, I have to reinvent many basic wheels due to the lack of bioinformatics packages/libraries; biogo is good but not enough, and the community is small.

C/C++ and Rust are both suitable for writing high-performance tools, and both have steep learning curves. Lots of great bioinformatics software is written in C++, and some tools written in Rust (sourmash, et al.) have emerged recently. Both languages have comprehensive bioinformatics libraries (SeqAn, htslib; rust-bio, rust-htslib), and using other existing APIs is convenient and easy. Rust is the so-called "safer C++" and is kind of the next trend, but C++ has more standard and third-party libraries to draw on.

I learned basic C and Java as an undergraduate, and then Perl/Python and Golang in graduate school. I'm a postdoc now and time is limited, so learning a new programming language is actually not the best choice at present.

Any suggestions and views are welcome.

library forum bioinformatics • 1.1k views
modified 2 days ago by werewxcvv0 • written 9 days ago by shenwei356 (5.5k)

Golang assembly is annoying, but it's not that big a step over SIMD intrinsics. My employer has open-sourced a bunch of examples at https://github.com/grailbio/base/tree/master/simd and https://github.com/grailbio/bio/tree/master/biosimd .
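
For comparison with the Go assembly route, here is a rough sketch of what the intrinsics route looks like in Rust via `std::arch`. It counts uppercase G and C bytes 16 at a time with SSE2 and falls back to plain scalar code on other architectures. The function names (`count_gc`, `count_gc_scalar`, `count_gc_sse2`) are made up for illustration and are not taken from any library mentioned in this thread.

```rust
/// Scalar reference implementation: counts uppercase 'G' and 'C' bytes.
pub fn count_gc_scalar(seq: &[u8]) -> usize {
    seq.iter().filter(|&&b| b == b'G' || b == b'C').count()
}

/// SSE2 version: compares 16 bytes per iteration against 'G' and 'C'.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse2")]
unsafe fn count_gc_sse2(seq: &[u8]) -> usize {
    use std::arch::x86_64::*;

    let g = _mm_set1_epi8(b'G' as i8);
    let c = _mm_set1_epi8(b'C' as i8);
    let mut count = 0usize;

    let mut chunks = seq.chunks_exact(16);
    for chunk in &mut chunks {
        // Unaligned load of 16 bytes from the slice.
        let v = _mm_loadu_si128(chunk.as_ptr() as *const __m128i);
        // Each matching byte becomes 0xFF in `hit`.
        let hit = _mm_or_si128(_mm_cmpeq_epi8(v, g), _mm_cmpeq_epi8(v, c));
        // movemask packs the top bit of each byte into a 16-bit mask.
        count += (_mm_movemask_epi8(hit) as u32).count_ones() as usize;
    }
    // Handle the trailing bytes that did not fill a 16-byte chunk.
    count + count_gc_scalar(chunks.remainder())
}

/// Dispatches to the SSE2 version when available, otherwise to scalar code.
pub fn count_gc(seq: &[u8]) -> usize {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("sse2") {
            return unsafe { count_gc_sse2(seq) };
        }
    }
    count_gc_scalar(seq)
}
```

Calling `count_gc(b"ACGT")` goes through the same public function on every platform; only the inner loop differs. A real tool would likely add wider AVX2 paths and handle lowercase bases, which this sketch leaves out.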

written 9 days ago by chrchang523 (7.3k)

Cool packages. Thanks for sharing.

written 9 days ago by shenwei356 (5.5k)

Certain folks on the Biostar Slack are fans of Rust (@Rob Patro, @Wouter). C++ will make you generally marketable (though you probably have enough skills not to need that).

written 9 days ago by genomax (91k)
Rob (4.5k, United States) wrote 8 days ago:

The forum only supports answers up to 5k characters, so this answer is broken into 2 parts (1/2):

This is an interesting question, and I think the answer depends on what your primary goal is. Istvan makes good points in favor of straightforward C (integrated with Python) for building tools that are easy for others, without a ton of experience in programming languages, to modify. However, if your primary goal is to make efficient and robust tools for others to _use_, then let me offer a somewhat different perspective.

The language in which one develops a project has important implications beyond just the speed and memory profiles that will result. Both C++ and Rust will allow you to write very fast tools, making use of zero-cost abstractions where feasible, with predictable and frugal memory profiles (assuming you choose the right algorithms and are careful in designing the programs). However, two important areas where I see these languages diverging are safety and maintainability. By safety, I mean minimizing the types of nefarious memory and correctness bugs that can easily slip into "bare-metal" high-performance code, and by maintainability I mean having code that is easy to update, fix and change in ways that are unlikely to cause other parts of the program to break in unpredictable or subtle ways. Having quite a lot of experience with C++ (and focusing recently on C++11/14/17) and a respectable but growing amount of experience in Rust, it is my opinion that Rust delivers strong improvements over C++ in terms of safety and maintainability. In general, initial implementations that I do in Rust are completed as quickly as those I do in C++, but they tend to have fewer subtle bugs and I am more immediately confident in the implementation. Personally, I find that it is easier to write maintainable Rust code because the compiler verifies so much more for you at compile time. Many breaking changes that would result in a runtime error or memory error in C++ will instead result in a compile-time error in Rust. I much prefer the latter.
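
As one small illustration of the compile-time checking described above, here is a sketch built around an exhaustive `match`. The names `Base`, `complement` and `reverse_complement` are hypothetical and chosen only for this example. If a fifth variant (say `N`) were later added to the enum, `complement` would stop compiling until the new case was handled, rather than misbehaving at runtime.

```rust
/// A nucleotide restricted to the four canonical bases.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Base {
    A,
    C,
    G,
    T,
}

/// Watson-Crick complement. The match has no wildcard arm, so the
/// compiler rejects this function if a variant is ever left unhandled.
pub fn complement(b: Base) -> Base {
    match b {
        Base::A => Base::T,
        Base::C => Base::G,
        Base::G => Base::C,
        Base::T => Base::A,
    }
}

/// Reverse complement of a sequence. The input is only borrowed, so the
/// caller keeps ownership and the result is a fresh, independent Vec.
pub fn reverse_complement(seq: &[Base]) -> Vec<Base> {
    seq.iter().rev().map(|&b| complement(b)).collect()
}
```

The design choice that matters here is leaving out a catch-all `_ =>` arm: that is what turns a later change to the enum into a compile-time error at every place that needs updating.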

The other main benefit I'd claim for Rust is that the development ecosystem is just so much better. This may feel especially important coming from a language like Go. In Rust, depending on a package is as simple as adding a line to your Cargo file (a TOML format file describing build settings and dependencies). Even with modern tools like CMake, building larger projects in C++ inevitably becomes a pain. Rust, via cargo, also has _standard_ tools for things like formatting code (cargo fmt) and linting (clippy). These things may seem minor, but in the day-to-day of developing software and writing code, I find them to be powerful and important tools to improve both code quality and my life as a developer.

modified 8 days ago • written 8 days ago by Rob (4.5k)
Rob (4.5k, United States) wrote 8 days ago:

Part (2/2):

Finally, I would be remiss if I didn't mention that there clearly _are_ some benefits to (modern) C++. If you choose C++, I would set a minimum version no earlier than C++17, which adds a lot of important and nice new features that make writing clean, expressive and efficient code easier. First, C++ wins over almost all other languages in terms of library availability. If a library exists to do something, then there is almost certainly a C++ library for it. Of course, the burden is on you to vet the quality and dependability of that library, but you can, with care, avoid a lot of wheel re-inventing. Second, C++ is very widespread. If you pick an arbitrary developer from the population, the chances are _much_ higher that they will know some C++ (though perhaps not modern C++) than that they will know any Rust. This also means the community is bigger, so you'll find more resources accumulated over years of others learning and struggling with the language, and there will be extensive material online for even the darkest and most obscure crevices of the language (and of common build systems like CMake). Third, though both Rust and C++ are proper "low-level" languages that give you tremendous control over exactly how your program does what it does and how it interacts with the hardware, C++ has accumulated a few more features over its life than Rust has, and there are some capabilities that are mostly unique to C++ at this point. For example, while developing software in Rust, one feature that we recently missed from C++ is the ability to templatize a function or class on a non-type template parameter (e.g. templatizing on an integral constant). This is useful when (automatically) instantiating specialized implementations of a function for, e.g., k-mers of a specific length. This type of genericization based on values rather than types is an artifact of the (largely unexpected) expressiveness of the C++ template system.
It's worth noting that features like this, which are certainly useful (even if quite niche), have a tendency to eventually make their way into languages like Rust that are open to them. For example, there is a Rust issue tracking the development of const generics, which would provide such functionality. They've made a lot of progress and the feature is very likely to make it into the language soon. However, by virtue of the sheer size of C++, there are likely to be a number of such features that already exist in C++ but may not yet be (or ever be?) in Rust.
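
To make the k-mer case concrete, here is a minimal sketch of a type parameterized on an integral constant. It assumes a toolchain of Rust 1.51 or later, where a first form of const generics was stabilized. The `Kmer` type, its 2-bit packing, and the method names are hypothetical, invented for this example rather than taken from rust-bio or any other library.

```rust
/// A k-mer packed 2 bits per base into a u64. The length K is a
/// compile-time constant, so Kmer<21> and Kmer<31> are distinct types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Kmer<const K: usize> {
    bits: u64,
}

impl<const K: usize> Kmer<K> {
    /// Encodes the first K bases of `seq`. Returns None if K does not fit
    /// in 64 bits, the input is too short, or a base is not A/C/G/T.
    pub fn from_bytes(seq: &[u8]) -> Option<Self> {
        if K == 0 || K > 32 || seq.len() < K {
            return None;
        }
        let mut bits = 0u64;
        for &b in &seq[..K] {
            let code = match b {
                b'A' | b'a' => 0,
                b'C' | b'c' => 1,
                b'G' | b'g' => 2,
                b'T' | b't' => 3,
                _ => return None,
            };
            bits = (bits << 2) | code;
        }
        Some(Self { bits })
    }

    /// Decodes the packed bits back into an uppercase string of length K.
    pub fn decode(&self) -> String {
        (0..K)
            .rev()
            .map(|i| match (self.bits >> (2 * i)) & 3 {
                0 => 'A',
                1 => 'C',
                2 => 'G',
                _ => 'T',
            })
            .collect()
    }
}
```

A caller picks the length in the type, for example `Kmer::<5>::from_bytes(b"ACGTAC")`, and the compiler generates a specialized copy of each method per `K`, much like instantiating a C++ template on an integral constant.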

At the end of the day, it would obviously be great to learn both languages. There's almost never a downside to learning a new language, as it helps to improve your programming ability and the tools you have to think about different problems. The issue is that learning new things takes time, and so there is a cost associated with learning a new language. If your primary goal is to integrate with the libraries / tools of others, then C++ is a more common language and is likely to let you plug in, more immediately, to a large number of existing projects. However, if your primary goal is to build and maintain your own tools, then it's my opinion that, in 2020, Rust provides a much better developer experience and you don't have to give up on the efficiency that you get from a language like C++.


Thank you for such a comprehensive and valuable comment!

written 8 days ago by shenwei356 (5.5k)
Istvan Albert ♦♦ (85k, University Park, USA) wrote 9 days ago:

I do think that it is both productive and fun to learn new languages. It opens one's horizons and leads to a deeper understanding of computing.

With Rust, however, I feel that the language leaves behind the approachability that even C has. It has many concepts that don't translate down and that will keep a large number of people out. I see it more like Common Lisp or OCaml: both amazing environments, yet too "alien" for most people.

A "for loop" and "if branch" have mechanistic representations and that makes them easier to comprehend. Even pointers are conceptually simple, the challenge is that they lend themselves to misuse.

Rust seems to me so radically different that I have trouble believing we would ever be able to teach non-technical people, whose jobs are not primarily programming, to use Rust. That of course also means that whatever we program in Rust will tend to remain isolated and an "island" of sorts. This is just an opinion of course.

I personally believe that writing simple, high-performance C that can also be integrated with Python is what makes the world move forward faster. Time and again I find myself writing shim programs just to get data produced by one program into another: custom little scripts with an untold number of limitations, none of which would be necessary if that beautifully crafted library could be connected and called directly from Python.

modified 8 days ago • written 9 days ago by Istvan Albert ♦♦ (85k)

C/C++/Rust core + Python is the perfect combination!

written 8 days ago by shenwei356 (5.5k)
Kamil (2.0k, Baltimore) wrote 8 days ago:

If you're interested in performance, benchmarks, and language comparisons, you might be interested to read this post by Heng Li: "Fast high-level programming languages".

The post has a few benchmarks comparing popular languages: Rust, C, Crystal, Nim, Julia, Python, JavaScript, LuaJIT, PyPy.

Another interesting language that recently caught my eye is V: https://vlang.io

Alex Reynolds (31k, Seattle, WA USA) wrote 9 days ago:

If you want performance, learn C. If you want data structures and OOP, learn C++.

colindaven (2.4k, Hannover Medical School) wrote 9 days ago:

I'm a big fan of Rust. The way it pulls in crates and produces a binary after compilation (great for cluster environments) is very impressive. It's a lot more modern than C++ in my estimation but, as you said, not easy to learn.
