Question

Forum: Asking for opinions about the Bioinformatics Data Skills book

2

9.9 years ago

Eric Normandeau 11k

I'd like to know if any of you have had a chance to get/read/evaluate the Bioinformatics Data Skills book and would like to share your opinion.

Is it globally a good book?
Who is the target audience?
Would it be helpful to biology students starting out in NGS projects?

Here is a link to the book on Amazon.com and a picture of its cover.

book • 5.5k views

written 9.9 years ago by Eric Normandeau 11k • updated 2.3 years ago by Ram 45k

2

More reviews: http://shop.oreilly.com/product/0636920030157.do

If you buy it from O'Reilly (as a member), I think you get another ebook for free; at least I always do. Sign up first and look around the O'Reilly member page.

written 9.9 years ago by Endre Bakken Stovner ▴ 970 • updated 2.7 years ago by Ram 45k

1

I'll let others answer your question so you get honest feedback, but I'm happy to answer any questions about the book. Do note that Bioinformatics Data Skills is an intermediate book - it assumes readers have already learned a scripting language. This prerequisite allows me to dive into computational and data topics without teaching beginning programming or how to use a text editor. I wrote it this way because (1) there are lots of great books that introduce computing to biologists and (2) few that cover the topics I need day to day.

written 9.9 years ago by Vince Buffalo ▴ 470 • updated 2.7 years ago by Ram 45k

1

I just wanted to take the pulse, but I am now convinced we need to buy your book for the lab. It would be the first bioinfo book I have had this opinion about :)

written 9.9 years ago by Eric Normandeau 11k

Ram · Answer 1 · 2015-08-06

I'll just post my Goodreads "review" (really just a jumble of thoughts):

5/5 Very good book.

And 85% of it will be just as relevant in ten years, which is an achievement.

Now I finally have something to recommend to all fledgling bioinformaticians; it contains all the stuff I had to learn the hard way during my first year or so: Unix tools, git, R, ggplot, tests, file formats, pipelines, etc.

There is nothing about how different bioinformatics algorithms are implemented, nor about how to run various current but surely ephemeral bioinformatics software packages. This is a book for bioinformaticians who want to learn how to get stuff done with timeless tools.

Some things I would have included if I had written the book:

Perl isn't mentioned, which makes sense, since you should not use it to write longer programs. As a command line tool, however, it is indispensable. Even though all Unixes ship the tools sort, sed, awk, etc., the options they support vary between systems. If you use sort with the -V flag, grep with word boundaries, or sed with ";" to separate commands (and so on...), it will not work on all *nixes. This is a problem if you use these options in a script and distribute it to others; the script might not work, might produce the wrong result, or might have a completely different time complexity. This is where perl comes in: perl is standard everywhere, so by replacing awk, sed, cut, etc. with perl one-liners, your scripts will work the same everywhere, which makes them more robust. Furthermore, the common command line tools are pretty anemic, while perl can do just about everything in one line. To learn command line perl, get Minimal Perl by Tim Maher.

As an alternative to bash, the equally archaic zsh is mentioned, but fish is not. Sure, you can tweak the Flintstones' car endlessly, just like zsh, but it won't ever be as good as a Porsche. Get fish today: http://fishshell.com/ (Note that you can still use bash scripts by calling them with bash, or with sh if they are plain POSIX scripts.)

Furthermore, it seems all the "recommended reading" requires at least one more SD of brainpower to understand than this book, so that list will probably not be useful for most.

If I were to add anything, it would be that it lacks stuff on Python. With pandas, Python is probably nicer for data science than R; it is just that R has many more packages.

I think the book would be very good for people with CS/maths backgrounds like me that know some biology, but do not have much experience getting stuff done in bioinformatics. I think it would get them up to speed very quickly.

I think it would be harder for people with biology backgrounds. At least they can rest assured that all the stuff in this book will help them immensely once they do understand it and get some practice. There is nothing in here that is not "must-know" for someone who hopes to spend their life doing bioinformatics.

I think it was "globally" a good book: logically structured and well written, without needless jargon. (I am amongst the initiated who know this stuff pretty well, though - dunno if that alters my perspective greatly.)

score 3 · Answer 2 · 2015-08-07

I think it IS a good book. But definitely not for biologists who are just starting out with no experience. One of the earlier chapters is on git, for instance. I feel like it's a good book for new bioinformaticians to show them some best practices, stuff like that. Developing Bioinformatics Computer Skills is older, but maybe gentler for people new to all this stuff.

I also don't really agree with the statement that 85% of it will be just as relevant in ten years. That's just pretty hard to say, I think.