News: The Biostar Handbook. A bioinformatics e-book for beginners.
133
5 months ago by
Istvan Albert ♦♦ 71k
University Park, USA
Istvan Albert ♦♦ 71k wrote:

Feb 6, 2017: Issue Tracking: There is a GitHub repository that tracks issues, problems and suggestions for the book. If you experience any type of technical problem please visit the site below to create a new issue.

https://github.com/biostars/biostar-handbook-issues/issues


Announced almost 18 months ago, the Biostar Handbook has now been published. It delivers simple, concise, and relevant information for those looking to understand the field of bioinformatics as a data science.

It is a comprehensive, practical handbook that aims to cover (though it is not quite there yet) all major application areas of bioinformatics.

Special thanks go to Biostar users genomax2, shenwei356 and Jeremy Leipzig, who have contributed entire pages or sections to the book.

Only now that the book is released - as I am looking at 713 pages - do I start to realize just how big bioinformatics has gotten in the past few years. And we're still missing entire subdomains of it: Metagenomics, Assembly, ChIP-Seq. But fear not, we'll handle those too in this coming year.

Spread the word, let others know - I think there is no other resource like it. I like to call it data analysis with attitude, where reproducibility means not following letter by letter, but doing it better, faster and simpler.

Let me invite anyone who wishes to contribute to do so. It is easy and simple, Markdown-based publishing. And there is so much more that could be done and will be done. Be a part of it! We are independent, self-published, self-supported. Chart your own course, bring your own ideas and goals to fruition or just enjoy being a part of a creative process.

Edit:

See also (closed now): How to get Biostar Handbook for free. And participate in building a better educational platform.

handbook training news tutorial • 9.3k views
ADD COMMENTlink modified 5 hours ago by kenanh0 • written 5 months ago by Istvan Albert ♦♦ 71k
11

Please join me in congratulating @Istvan on this special day!

This will hopefully be THE bioinformatics information resource for both students and instructors (and anyone else who may be just curious). Looking forward to continued collaboration.

ADD REPLYlink modified 5 months ago • written 5 months ago by genomax27k
2

Congratulations @Istvan. About to buy the handbook. Looking forward to having this on my desktop!

ADD REPLYlink written 5 months ago by avismirabilis30

Thanks everyone for the encouragement. Proofreading is my bane - the hardest part for me to do for this book. So there are plenty of awkward sentences - but just about every day I push out more fixes to wording.

ADD REPLYlink written 5 months ago by Istvan Albert ♦♦ 71k
2

One thing I forgot to mention - everything in this book works on Windows as well! The new Bash for Windows subsystem worked amazingly well. Only fastq-dump had problems connecting.

I had been in contact with the Microsoft Genomics team and they reached out to the Bash team - sure enough, they put in a fix just to make fastq-dump work for us. This will be pushed out in one of the updates to the OS.

I was very impressed by how well it all worked out on Windows as well.

ADD REPLYlink written 5 months ago by Istvan Albert ♦♦ 71k

May I know how to contact the Ubuntu/Microsoft people? I am aware of some problems with other tools, such as fastx-toolkit.

ADD REPLYlink written 5 months ago by Antonio R. Franco3.3k

I have a personal contact that put me in touch, so I don't feel comfortable sharing that information.

As for the fastx-toolkit I would recommend not using it - it is not the right tool anymore.

ADD REPLYlink written 5 months ago by Istvan Albert ♦♦ 71k

I am not a fan of fastx-toolkit. It cannot work with more than one thread and it is slow.

But the point is that some error there prevents this program, and maybe some other programs, from working nicely with Windows-Linux. I haven't tried the latest version, though.

ADD REPLYlink written 5 months ago by Antonio R. Franco3.3k
1

Congratulations! So nice for you to make money from open source community efforts.

ADD REPLYlink written 5 months ago by Zaag530
3

I understand the sentiment and why you would post this. I thought of this myself as well. The book even has a FAQ titled: Why isn't the book free? - in a nutshell, creative process is fun, cleaning up after other people's creative processes is hard work.

ADD REPLYlink modified 5 months ago • written 5 months ago by Istvan Albert ♦♦ 71k
1

I understand the sentiment and why you would write such a FAQ, but I just do not believe you make the world a better place by selling information. The Internet is full of people who create things for free, no matter how much work they put in. However, it is not about the money you ask for, it's about using the name of an open source community (where a lot of people are willing to spend their precious time on helping others) to sell your product.

ADD REPLYlink modified 5 months ago • written 5 months ago by Zaag530
1

The content of the book is mainly based on a graduate course @Istvan has taught at Penn State for the last few years. AFAIK the content of the book is freely accessible over the net (you only need to create an account). Only those wishing to obtain a PDF/eBook copy are expected to make a purchase. The rationale for that decision is described in the FAQ.

Disclosure: I did contribute some content to the book under the licensing described here without any expectation of financial gain.

ADD REPLYlink modified 5 months ago • written 5 months ago by genomax27k
1

So why call it the BioStars handbook and not the @Istvan Penn State course handbook? My guess is to make more money.

And I tried to make an account just now but I need to pay for the book to get one.

ADD REPLYlink written 5 months ago by Zaag530

My apologies. I thought the online version was "free to access" but I was wrong. I have made a correction to my post above.

ADD REPLYlink written 5 months ago by genomax27k
1

No sir, information will cost you $$$.

ADD REPLYlink written 5 months ago by Zaag530
4

I am contributing to the initiative - both time and money. I've paid for the book and I'm going to be working on the content. To me, it's not about the money. $30 for 2 years' access is peanuts. I've always admired Istvan's efforts in building this community and a site that multiple communities could tailor to their use case.

I have a philosophy that I follow for this free/paid thing, especially for my favorite websites, YouTube channels and webcomics. If someone gave me access to their work for free, and they're selling an offshoot, I'll pay for it gladly. I want them to make a profit with their skills, I want to benefit from their skills - I don't want them to stop because it gave them no return.

If Istvan had asked for $10 a year to help maintain the site, I'd have paid it gladly.

ADD REPLYlink modified 5 months ago • written 5 months ago by Ram11k
1

Congratulations, stellar book with great content and elegant design. Will be ordering a print version - hope I can get a signed copy :)

ADD REPLYlink modified 5 months ago • written 5 months ago by Khader Shameer17k
1

No print version as of yet though, but come to think of it I can sign PDFs!

ADD REPLYlink written 5 months ago by Istvan Albert ♦♦ 71k

Now I can recommend my biology friends a one-stop solution for getting started with bioinformatics. Thanks a lot Istvan (and others) for the efforts behind this book and ideas. I wish this had been published when I started bioinformatics. I am eager to read this book and learn more bioinformatics.

ADD REPLYlink written 5 months ago by venu3.6k

Wonderful, congrats!

ADD REPLYlink written 5 months ago by Anima Mundi2.1k

Great resource, bought a copy and have been reading through it all day, lots of useful tips and explanations. Well done everyone involved.

ADD REPLYlink written 5 months ago by James Ashmore1.5k
1

Thanks James Ashmore we will keep at it, we will expand it - I predict doubling the content by Summer and always "keeping it real".

ADD REPLYlink written 5 months ago by Istvan Albert ♦♦ 71k

Hello, I've been using Biostars for a year, so I bought the book. I would like to ask whether it is possible to add a chapter about small RNA and long non-coding RNA and how to analyze their data. Also, is it possible to add tips on how to deal with non-model organisms and how to annotate a genome based on a transcriptome? Congratulations on this nice book.

ADD REPLYlink written 5 months ago by najibveto20

Early next year we will send out a questionnaire to the readers to ask them what they want covered. And we will try to add, if not all, then most of the topics people want covered.

The Biostar Handbook is not the typical book where we put it out there, then walk away. That's really why I am doing this - I want a "living" book in a different kind of publishing system, one that works like software itself. It adapts and changes with time; you can upgrade or use the prior versions.

ADD REPLYlink written 5 months ago by Istvan Albert ♦♦ 71k

I have opened a new thread to channel discussions on the book there and keep this more focused on the announcement itself.

How to get Biostar Handbook for free. And participate in building a better educational platform.

ADD REPLYlink written 5 months ago by Istvan Albert ♦♦ 71k

Thanks! This is an excellent and relevant resource.

I was wondering what your plan is next. Will you add more software examples like PLINK or GEMMA? For example, more info on how to approach a basic molecular ecology problem using GWAS?

ADD REPLYlink written 4 months ago by beausoleilmo70
1

The current plan is to cover various domains of applications over the next year. ChIP-Seq, small RNA and so on. We're interested in covering other software like PLINK as well.

ADD REPLYlink written 4 months ago by Istvan Albert ♦♦ 71k

Many congratulations on publishing the book, yet another feat you (and the other contributors) have achieved after Biostars.org itself. I have already recommended the book to many people who ask me on a daily basis how to start in bioinformatics: it is clear, concise and to the point. Would love to contribute to the future episodes. xx

ADD REPLYlink written 4 months ago by Sukhdeep Singh8.8k
3
5 months ago by
jiwpark00130
jiwpark00130 wrote:

Congratulations! I've been using Biostars for the past 6+ months so I just bought the book. I'm intermediate-savvy with bioinformatics (I can usually figure out what to do if I search on Biostars or Google but I don't know all the code off the top of my head) so this book will be great for getting my skills up to date.

I already like that you included a section on Unix commands - that's something a lot of bioinformatics workshops don't cover in enough depth.

What would be the best way of contributing? Does GitBook work like GitHub? I think it would be great if there were a section on WGCNA (or at least a mention of it). Maybe this is not applicable to everyone, but the majority of people I've talked to (and many are wet-lab biologists) really want to be able to take RNA-seq data and put it into WGCNA, but the tutorials available are pretty abstract (the UCLA tutorial requires some experience, for instance).

Overall an amazing tool! It's like an early Christmas gift :-)

ADD COMMENTlink written 5 months ago by jiwpark00130

I am planning to do a tutorial and vignette of WGCNA later on, after refactoring the code a little. I am adding tests and updating the package in this GitHub repository: https://github.com/llrs/WGCNA.

ADD REPLYlink written 5 months ago by Lluís R.440
2
5 months ago by
Sentinel15670
Melbourne, Australia
Sentinel15670 wrote:

I've been following this for the last couple of months and am really glad to see it released! It was an instant buy for me as a wet-lab PhD student interested in bioinformatics. I was wondering if it is possible to supply an epub version alongside mobi for those without Kindle e-readers? Also, is there a plan to perhaps have an online repository on GitHub or similar for code examples and workflows from the book? This could be similar to Vince Buffalo's Bioinformatics Data Skills book. Overall I can't wait to get stuck into reading it. I'm sure it will save me and many others hours of time searching for help online, so congratulations and thanks again to all involved for all of the hard work putting this together!

ADD COMMENTlink written 5 months ago by Sentinel15670
2

Great idea on the GitHub repo. We'll do it - though it might take until next year to deploy.

ADD REPLYlink written 5 months ago by Istvan Albert ♦♦ 71k
2

The epub versions have now been added. See also the update page:

https://read.biostarhandbook.com/public/updates.html

ADD REPLYlink written 5 months ago by Istvan Albert ♦♦ 71k
1

Oh indeed, my mistake - I thought mobi would work on both. I can and will generate epub formats as well.

I am traveling ATM so I might not be able to do it until later tomorrow.

ADD REPLYlink written 5 months ago by Istvan Albert ♦♦ 71k
2
3 months ago by
Istvan Albert ♦♦ 71k
University Park, USA
Istvan Albert ♦♦ 71k wrote:

There is a GitHub repository that tracks issues, problems and suggestions for the book.

If you experience any type of technical problem please visit the site below to create a new issue.

https://github.com/biostars/biostar-handbook-issues/issues

ADD COMMENTlink written 3 months ago by Istvan Albert ♦♦ 71k
1
5 months ago by
Daniel3.3k
Cardiff University
Daniel3.3k wrote:

This looks great, impressed to see chapters dedicated to things that you can only usually get a handle on from hours or days of google-fu (was looking forward to checking out sratools, entrez and bioawk).

EDIT: I originally had a comment on empty sections, but I'm throwing it out because it's totally misleading. I was totally wrong, and as Istvan pointed out I've skipped straight over the section on analysis and inadvertently found myself in the tools section.

ADD COMMENTlink modified 5 months ago • written 5 months ago by Daniel3.3k
1

I am not sure I follow. RNA-Seq has detailed sections in the book. There are 12 sections on RNA-Seq with very detailed instructions on how to perform the analysis. There are five different fully worked out examples, from evaluating control samples to two ways of doing the Zika analysis. It all starts here:

https://read.biostarhandbook.com/rnaseq/rnaseq-intro.html

The tool chapters are just about how to install the tools and perhaps how to test them.

ADD REPLYlink modified 5 months ago • written 5 months ago by Istvan Albert ♦♦ 71k

I'm really sorry, as you've pointed out, I've scrolled through to the tool pages in the pdf without realising that I've passed over the instructional and data analysis sections. I'm going to edit my comment now.

A couple of suggestions would be maybe to rename "Using Tophat2", "Using Kalisto" to "Installing Tophat2", "installing Kalisto" etc, and maybe introduce a hard page break into the contents pages (Maybe number them appendices?) as I didn't realise that the sections had changed when scrolling through 10 contents pages.

It's my fault for skimming quickly though. Significantly more than I saw on first pass.

ADD REPLYlink written 5 months ago by Daniel3.3k

This is a good point - that naming is confusing. I will change it and push the new version.

ADD REPLYlink written 5 months ago by Istvan Albert ♦♦ 71k
1

The naming has been changed to "Install" - new pdfs are available as well.

ADD REPLYlink written 5 months ago by Istvan Albert ♦♦ 71k

The same applies for every tool. The book is not tool centric - it is task centric.

We never want to just use seqtk; we always have problems that can be solved with seqtk.

One would visit the dedicated seqtk page only when installing it for the first time or to verify some limitation or problem that this tool might have. That is why each tool has a separate page - and sometimes that page is somewhat empty when the tool does not have many issues affecting it. It is still worth having it on its own page as it makes locating it very easy via search. Just type seqtk in the search box and you'll get the pages.

ADD REPLYlink modified 5 months ago • written 5 months ago by Istvan Albert ♦♦ 71k
1
3 months ago by
june.shuer.deng10 wrote:

I am the first person in my lab to attempt bioinformatics analysis using the command line (with zero training). This book has been extremely helpful, covering some important basic concepts before moving on to complex and practical analysis.

The most brilliant thing about this book is that it is unbelievably fun to read! People in the office caught me laughing out loud while 'studying' bioinformatics several times. The author has a very interesting character! He is passionate about what he does, while occasionally poking fun at the irrational reality. It really feels like having a tutor telling me things beyond the textbook. Thank you for making my clueless attempt surprisingly enjoyable.

ADD COMMENTlink written 3 months ago by june.shuer.deng10
1

Thanks for the feedback. Much appreciated! I was hoping to make learning more enjoyable and fun.

And, as you point out, it was indeed one of my goals to get people to see some of the idiosyncratic aspects of bioinformatics. Without a doubt the field is home to geniuses when it comes to software development - yet at the same time a suspension of disbelief is necessary to deal with the emerging complexity caused by going forward too fast.

Laughing at it (or with it) is how we deal in our group.

ADD REPLYlink written 3 months ago by Istvan Albert ♦♦ 71k
1
3 months ago by
germelcar20
Mexico/Ensenada/CICESE
germelcar20 wrote:

Is it appropriate to cite the handbook? If yes, how should I cite it?

BTW, what about de novo transcriptome assembly? It would be great to include a section on that.

For fastq-dump, I see that the Handbook uses version 2.5.2. I remember a notice on SRA-Toolkit's GitHub page saying that one should upgrade because of the https update on the NCBI platform. I see also that the "--split-files" option is used for paired-end data; what about the "--split-3" option? I have noticed some subtle differences (and better results when using Trinity) when I use the "--split-3" option instead of the "--split-files" option.

Thanks in advance.

ADD COMMENTlink written 3 months ago by germelcar20
1

Create an "issue" for the fastq-dump question here.

ADD REPLYlink written 3 months ago by genomax27k

Thanks for the reply.

ADD REPLYlink written 3 months ago by germelcar20
0
4 months ago by
Korea/Daejeon/KAIST
ahramkim11280 wrote:

On the https://read.biostarhandbook.com/ontology/sequence-ontology.html page, the following commands don't work:

    URL=https://raw.githubusercontent.com/The-Sequence-Ontology/SO-Ontologies/master/so-xp-simple.obo
    curl $URL > so.obo

There is no file at that URL. Please fix this in the next update. :)

ADD COMMENTlink written 4 months ago by ahramkim11280

Please try this URL instead. We will correct it in the book.

ADD REPLYlink modified 4 months ago • written 4 months ago by genomax27k
1

Goes to show just how links "rot" within a month or two. Starting with next year we will do nightly builds of all code in the book - I now realize that the book itself needs to be treated as if it were software - needs to be re-built every day (or perhaps every hour)...

The link has now been fixed in the web version and will make it into the PDF versions once these are updated.
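
The kind of nightly link check described above could start from something as small as the sketch below. It is only an illustration, not the checker used for the book: the function names `extract_urls` and `check_links` are made up for this example, and it assumes `bash`, `grep`, `sed`, `sort` and `curl` are available. It pulls every http(s) link out of the text it is given and asks each server whether the link still resolves. Some servers reject HEAD requests, so a link reported as broken is a candidate for a manual look rather than a verdict.

```shell
#!/usr/bin/env bash

# Print each unique http(s) URL found in the text read from standard input.
# Trailing sentence punctuation (. , ; :) is stripped from every match.
extract_urls() {
    grep -Eo 'https?://[^[:space:])>"]+' \
        | sed -E 's/[.,;:]+$//' \
        | LC_ALL=C sort -u
}

# Read URLs from standard input, one per line, and report whether each one
# answers a HEAD request. Returns non-zero if any link looks broken.
check_links() {
    local failed=0 url
    while IFS= read -r url; do
        if curl --silent --head --fail --location --max-time 20 \
                --output /dev/null "$url"; then
            echo "OK      $url"
        else
            echo "BROKEN  $url"
            failed=1
        fi
    done
    return "$failed"
}

# Example (hypothetical file layout): check every link in the Markdown sources.
#   cat chapters/*.md | extract_urls | check_links
```

Because `check_links` returns a non-zero status when any link fails, the pipeline can be dropped into a scheduled job that flags the build on the first broken link.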

ADD REPLYlink written 4 months ago by Istvan Albert ♦♦ 71k

wonderdump doesn't work. I did

  1. curl http://data.biostarhandbook.com/scripts/wonderdump.sh > ~/bin/wonderdump
  2. chmod +x ~/bin/wonderdump
  3. wonderdump SRR1972739 -X 10000 --split-files

and I saw "wonderdump: no command for wonderdump."

In addition, there is no content about installing wonderdump.

ADD REPLYlink modified 4 months ago • written 4 months ago by ahramkim11280
1

This is not the place to troubleshoot or support the code in the book. Please open a new question or send an email to the contact email.

ADD REPLYlink written 4 months ago by Istvan Albert ♦♦ 71k

Can you post the output for ls -l ~/bin/wonderdump and which wonderdump?

First two commands you have above should "install" the wonderdump script.
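
If `which wonderdump` prints nothing even though the file exists and is executable, one common cause (an assumption here, not something confirmed in this thread) is that `~/bin` is not listed in `PATH`. A minimal sketch of that check follows; `dir_on_path` is a hypothetical helper name, not part of the book's scripts.

```shell
#!/usr/bin/env bash

# Succeed when directory $1 is one of the colon-separated entries in $2.
# When $2 is omitted, the current PATH is inspected.
dir_on_path() {
    local dir=$1
    local path_list=${2:-$PATH}
    case ":$path_list:" in
        *":$dir:"*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Example: report whether ~/bin would be searched for commands.
if dir_on_path "$HOME/bin"; then
    echo "$HOME/bin is on PATH"
else
    echo "$HOME/bin is not on PATH; try: export PATH=\"\$HOME/bin:\$PATH\""
fi
```

Adding the suggested `export` line to `~/.bashrc` would make the change persist across new interactive shells.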

ADD REPLYlink modified 4 months ago • written 4 months ago by genomax27k

The output of ls -l ~/bin/wonderdump is

-rwxrwxrwx 1 AR AR 1057 Jan 5 11:15 /home/AR/bin/wonderdump

Is there anything wrong?

ADD REPLYlink modified 4 months ago • written 4 months ago by ahramkim11280
0
4 months ago by
mcc70
PVD, USA
mcc70 wrote:

Has Biostar now become a sales and advertising platform for 'for-profit' books and guides?

Is Biostar the proper forum for a book that, by the sound of it, is based on the free content that is found without charge on this educational platform? Although I applaud the efforts of the writers and editors of this book to aggregate the material, I am questioning the idea of having a booklet that is a pet project for some being advertised in such an open source environment.

ADD COMMENTlink modified 4 months ago • written 4 months ago by mcc70
6

nah, it is not books and guides - just one book - the real deal, the Biostar Handbook. :-)

and it is not "for profit" either - it is about "loss reduction" mainly ... running this site over the past seven years has generated costs that go well over five digits (and in US dollars, that is), or even six if you count people's time. I am thrilled to have come up with a solution that does several things at once: supports the site, imparts valuable knowledge, helps the people that need this knowledge and does that really cheap.

What you seem to complain about, that unregistered users may be shown ads a few times, is a very small inconvenience.

Finally if you think that it is easy to write this book - you are welcome to try it yourself, then when done release under any licensing of your choice.

ADD REPLYlink modified 4 months ago • written 4 months ago by Istvan Albert ♦♦ 71k
1

Um, you're welcome to continue using the site and its content gratis, so it's unclear what your beef is...

ADD REPLYlink written 4 months ago by harold.smith.tarheel3.5k
0
4 months ago by
Diego40
Diego40 wrote:

This is amazing! Coming from software engineering, I'm pretty new to this "bio" part of bioinformatics. I'm going to buy it as soon as I can.

ADD COMMENTlink written 4 months ago by Diego40
0
3 months ago by
reza.jabal230
United Kingdom
reza.jabal230 wrote:

Well done Istvan for this great job. It's absolutely a valuable resource for current and future students! But why not deliver a course on Coursera or edX based on this book?

ADD COMMENTlink written 3 months ago by reza.jabal230
2

The book is used to support an existing course offered in residence, and based on student evaluations it was successful at that.

An online course based on the book is indeed a more natural and logical step - we'll see what the future holds. I am not particularly fond of the Coursera platform - I find it counterintuitive as far as the interface goes and difficult to learn with, and it feels limiting with respect to course design.

ADD REPLYlink written 3 months ago by Istvan Albert ♦♦ 71k
0
3 months ago by
fanicesiza0
fanicesiza0 wrote:

In the Gene Ontology section of this book, there is no file at the location: http://geneontology.org/gene-associations/gene_association.goa_human.gz Please check, thanks.

ADD COMMENTlink written 3 months ago by fanicesiza0

Thanks for the note. It turns out that on January 8th, 2017 the Gene Ontology changed not just the file naming but even the formats. Creating a link and results checker for the book is more urgent than ever. This will be the priority this month.

I always knew that data and links are brittle in bioinformatics, but the magnitude and frequency of the problems surprised me. Just two months after the book's release, a whole bunch of links have already had to be changed.

The URL in question has been corrected and the examples will execute but I will need to revisit that chapter and investigate the matter a little deeper as I believe that there are more profound changes there.

By the way, I also expect a whole slew of GO enrichment tools to become broken as a result of this change.

A new site for issue tracking for the book will be added soon. Details to follow.

ADD REPLYlink modified 3 months ago • written 3 months ago by Istvan Albert ♦♦ 71k
0
3 months ago by
Lila M 230
UK
Lila M 230 wrote:

The book is great, but I would like to have some information related to ChIP-seq as well.

ADD COMMENTlink written 3 months ago by Lila M 230
3

The ChIP-Seq chapter is scheduled for the end of this month (early March latest).

ADD REPLYlink written 3 months ago by Istvan Albert ♦♦ 71k

Looking forward to it!

ADD REPLYlink written 3 months ago by Lila M 230

I am still waiting for it! :D

ADD REPLYlink written 8 weeks ago by Lila M 230
2

It will come a bit later - hopefully this month.

We hit some issues - mostly that ChIP-Seq studies appear to be even more subjective and sensitive to parameters than initially thought. Hence we are contacting authors and trying to work with them to figure out how they got to the conclusions.

ADD REPLYlink written 8 weeks ago by Istvan Albert ♦♦ 71k
2

The ChIP-Seq chapter is now out.

ADD REPLYlink written 6 weeks ago by Istvan Albert ♦♦ 71k

I'm reading it! Thank you very much :)

ADD REPLYlink written 6 weeks ago by Lila M 230
0
3 months ago by
juliuspasion19940 wrote:

I just purchased the book, but I have Windows 7. Can anyone help me download the right programs needed to work through the book that are available on Windows 7?

ADD COMMENTlink written 3 months ago by juliuspasion19940
1

This thread is not the right forum for this question. Please read the first paragraph in the first post.

ADD REPLYlink written 3 months ago by Istvan Albert ♦♦ 71k
0
9 weeks ago by
kenanh0
kenanh0 wrote:

Great book!

In the section How to visualize genomic variation (What would realistic and good data look like?) it says that the script simulate-experimental-data.sh will generate a file called results.bam. It actually generates a file called align.bam.

In the section Variant effect prediction (How do I use snpEff?) the link http://data.biostarhandbook.com/variant/find-ebola-variants.sh results in a file not found error. Please check, thanks.

ADD COMMENTlink written 9 weeks ago by kenanh0

Please submit issues to:

https://github.com/biostars/biostar-handbook-issues/issues

In this case I have opened an issue for it:

https://github.com/biostars/biostar-handbook-issues/issues/10

Edit: the issue has now been resolved.

ADD REPLYlink modified 9 weeks ago • written 9 weeks ago by Istvan Albert ♦♦ 71k