News: The Biostar Handbook. A bioinformatics e-book for beginners.
### April 2022: Biostar Workflows

A new book, Biostar Workflows, has been released in the Biostar Handbook series.

The book presents bioinformatics automation in the context of modular makefiles. Several published analyses have been documented and solved via the modular structure.

The Biostar Handbook collection now includes five volumes of straight-up, no-nonsense, data analysis in the trenches!

### March, 2020: Coronavirus Genome Analysis

A new book, Coronavirus Genome Analysis, has been released in the Biostar Handbook series.

The book introduces readers to the practical aspects of investigating data from a viral outbreak.

### January, 2020: RNA-Seq by Example

A new book, RNA-Seq by Example, has been released in the Biostar Handbook series.

A step-by-step guide through the process of performing an RNA-Seq data analysis.

As always, all new content is included with the subscription.

### November, 2019: The Art of Bioinformatics Scripting

The Biostar Handbook has grown huge :-) it is now close to 1000 pages! To manage this complexity we have started reworking the various chapters into independent books.

The reorganization will allow readers to more easily locate the information that they need. It will also allow us to design training plans that are customized to specific needs. The first book section that has been reworked covers Unix scripting and is titled The Art of Bioinformatics Scripting.

As always, all new content is included with your subscription.

### January 2019: Biostar Handbook 2nd Edition, New Online Course

A little more than two years after the launch, the Biostar Handbook gets a rewrite. The 2nd Edition is a complete rework: every section, chapter and page will be edited, expanded and modernized. A new course, Learn Bioinformatics the Right Way, has also been launched as part of the book.

The Learn Bioinformatics the Right Way course is included with the book at no extra charge. The course is well-suited as an introduction to this field of science, or to brush up on topics you have learned before.

### December 1st, 2016: Original Announcement

Announced almost 18 months ago, the Biostar Handbook has now been published. It delivers simple, concise, and relevant information for those looking to understand the field of bioinformatics as a data science.

It is a comprehensive, practical handbook that aims to cover (though it is not quite there yet) all major application areas of bioinformatics.

Special thanks go to Biostar users GenoMax, shenwei356 and Jeremy Leipzig who have contributed entire pages or sections to the book.

Only now that the book is released, as I am looking at 713 pages, do I start to realize just how big bioinformatics has gotten in the past few years. And we're still missing entire subdomains of it: Metagenomics, Assembly, ChIP-Seq. But fear not, we'll handle those too in this coming year.

Spread the word, let others know - I think there is no other resource like it. I like to call it data analysis with attitude, where reproducibility means not following letter by letter, but doing it better, faster and simpler.

Let me invite anyone who wishes to contribute to do so. It is easy and simple Markdown-based publishing. And there is so much more that could be done and will be done. Be a part of it! We are independent, self-published, self-supported. Chart your own course, bring your own ideas and goals to fruition or just enjoy being a part of a creative process.

Please join me in congratulating @Istvan on this special day!

This will hopefully be THE bioinformatics information resource for both students and instructors (and anyone else who may be just curious). Looking forward to continued collaboration.

One thing I forgot to mention - everything in this book works on Windows as well! The new Bash for Windows subsystem worked amazingly well. Only fastq-dump had problems connecting.

I had been in contact with the Microsoft Genomics team and they reached out to the Bash team - sure enough they put in a fix just to make fastq-dump work for us. This will be pushed out in one of the upcoming updates to the OS.

I was very impressed how well it all worked out on Windows as well.

May I know how to contact the Ubuntu Microsoft people? I am aware of some problems with other tools, such as fastx-toolkit.

I have a personal contact that put me in touch, so I don't feel comfortable sharing that information.

As for the fastx-toolkit I would recommend not using it - it is not the right tool anymore.

I am not a fan of fastx-toolkit. It cannot work with more than one thread and it is slow.

But the point is that some error is preventing this program, and maybe some other programs, from working nicely with Windows-Linux. I haven't tried the latest version, though.

What problems are you having with fastq-dump? It works well on WSL in the Insider Preview version of Windows.

Congratulations! So nice for you to make money from open source community efforts.

I understand the sentiment and why you would post this. I thought of this myself as well. The book even has a FAQ titled: Why isn't the book free? - in a nutshell, creative process is fun, cleaning up after other people's creative processes is hard work.

I understand the sentiment and why you would write such a FAQ, but I just do not believe you make the world a better place by selling information. The Internet is full of people who create things for free, no matter how much work they put in. However, it is not about the money you ask, it's about using the name of an open source community (where a lot of people are willing to spend their precious time on helping others) to sell your product.

Content of the book is mainly based on a graduate course @Istvan has taught at Penn State for the last few years. AFAIK content of the book is freely accessible over the net (you only need to create an account). Only those wishing to obtain a PDF/eBook copy are expected to make a purchase. Rationale for that decision is described in the FAQ.

Disclosure: I did contribute some content to the book under the licensing described here without any expectation of financial gain.

So why call it the BioStars handbook and not the @Istvan Penn State course handbook? My guess is to make more money.

And I tried to make an account just now but I need to pay for the book to get one.

My apologies. I thought the online version was "free to access" but I was wrong. I have made a correction to my post above.

No sir, information will cost you $.

I am contributing to the initiative - both time and money. I've paid for the book and I'm going to be working on the content. To me, it's not about the money. $30 for 2 years' access is peanuts. I've always admired Istvan's efforts in building this community and a site that multiple communities could tailor to their use case.

I have a philosophy that I follow for this free/paid thing, especially for my favorite websites, YouTube channels and webcomics. If someone gave me access to their work for free, and they're selling an offshoot, I'll pay for it gladly. I want them to make a profit with their skills, I want to benefit from their skills - I don't want them to stop because it gave them no return.

If Istvan had asked for $10 a year to help maintain the site, I'd have paid it gladly.

Congratulations @Istvan. About to buy the handbook. Looking forward to having this on my desktop!

Thanks everyone for the encouragement. Proofreading is my bane - the hardest part for me to do for this book. So there are plenty of awkward sentences - but just about every day I push out more fixes to the wording.

Now I can recommend my biology friends a one-stop solution to begin with bioinformatics. Thanks a lot Istvan (and others), for the efforts behind this book and ideas. I wish this had been published when I started bioinformatics. I am eager to read this book and learn more bioinformatics.

Congratulations, stellar book with great content and elegant design. Will be ordering a print version - hope I can get a signed copy :)

No print version as of yet though, but come to think of it I can sign PDFs!

Wonderful, congrats!

Great resource, bought a copy and have been reading through it all day, lots of useful tips and explanations. Well done everyone involved.

Thanks James Ashmore, we will keep at it, we will expand it - I predict doubling the content by summer and always "keeping it real".

Hello, I've been using Biostars for 1 year so I bought the book. I would like to ask if it is possible to add a chapter about small RNA and long non-coding RNA and how to analyze their data. Also, is it possible to add tips on how to deal with non-model organisms and how to annotate a genome based on a transcriptome? Congratulations on this nice book.

Early next year we will send out a questionnaire for the readers to ask them what they want covered.
And we will try to add, if not all, then most of the topics people want covered. The Biostar Handbook is not the typical book where we put it out there, then walk away. That's really why I am doing this - I want a "living" book in a different kind of publishing system, one that works like software itself. It adapts and changes with time; you can upgrade or use the prior versions.

Regarding licensing, would you entertain the option of an institute-level license? For example, we don't really need 5 copies of the book, but a single copy with either an institute-level account or allowing multiple accounts from the same institute would be really nice.

Everything is possible since I chose self-publishing :-) Please send me an email and we can discuss this further that way.

Will do, thanks!

As a bioinformatician with little experience, I recognized myself very well in the short paragraph about stress and anxiety for the only analyst on a project. Just for this part I can say that the book reflects the reality of the practice quite well. I bought it to learn more about domains with which I am not really familiar, but I am pleased to have discovered things about the basics of bioinformatics that will help me in the future (the iteration over a stream opened by cat to build a command). My only gripe is the change to the bash_profile in the "how to set up my computer ?" section, which turned my terminal ugly as hell and made me freak out for a minute ;p You should add a warning about this.

Thanks for the feedback. For what it's worth, I think that BASH prompt is the best that I have found - though I may not have properly explained why you'd want that information. I will be updating that section to include the following:

The bash prompt should give you information on the context that you are working in.
Initially, when you only work on a single system and in a single directory, the information seems unnecessary. As soon as you deploy across different systems it helps you avoid making the biggest mistakes you can make: running a command on the wrong computer or in the wrong directory. You can do some lasting damage that way .... :-) most bioinformaticians have war stories to tell you about that ...

Thus the prompt is set up to continually, even obnoxiously, remind you of the following:

1. What computer and what user you are logged in as
2. What directory you are currently in

In addition, it opens a new line to avoid pushing the prompt too far to the right, so long commands will still fit without wrapping.

Cherish this information :-) one day it will save your bacon.

Totally agree on the "be conscious of the computer, directory and user" point. Oftentimes, you'd be working on a dev and a production server and you'd need to do some major operation on the dev server (such as clean up all logs) that would, if executed on the production server, wreak havoc. Imagine managing an AWS server and having two logins, one with root privileges and the other a regular account. You're in a folder that has a sub-folder named dev and you wish to run rm ./dev. You type the command, but make a typo and miss out the ".". It only takes a small coincidence to delete all the information you have on the server. Even if one were super cautious without this "insurance", they would find themselves running pwd at least every other minute (best case scenario).

There is a warning, although it definitely is not prominent. The line after the curl commands points you to the "Setting the bash profile" page, which shows the exact file being used. Also, one should never trust content from the internet implicitly. Always read the file you're source-ing before you source it.
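
As an illustration only, a prompt along the lines described above (user, computer, current directory, then a fresh line for typing) could be set in bash roughly like this. The colors and layout are assumptions made for this sketch; the exact PS1 shipped with the handbook's bash profile is not shown in this thread.

```shell
# Hypothetical prompt sketch, not the handbook's actual setting.
# \u = user, \h = host, \w = current working directory.
# \[ ... \] wraps non-printing color codes so bash computes line width correctly.
# The final \n moves the cursor to a new line, so long commands do not wrap early.
PS1='\n\[\e[32m\]\u@\h\[\e[0m\] \[\e[33m\]\w\[\e[0m\]\n\$ '
export PS1
```

Placing this in a throwaway terminal session first (rather than directly in the profile file) makes it easy to see the effect and to discard it if it is not to your taste.
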
I don't understand how it turned the terminal "ugly as hell", by the way. The statement that alters the shell is the PS1 setting, and it adds some color to the display. What is your recommendation for warning the user about this change?

There are now 4 separate books in total; however, all the links lead to the index page of The Biostars Handbook so it's impossible to check what's inside each of them. I'd like to know what topics "Coronavirus Genome Analysis" covers. Could you please fix it?

This is a good suggestion. If it's not too much of a hassle, the index on the left should reflect chapter titles and sub-titles (maybe redacted a little) so people know what they're getting.

This particular book needs to be rewritten as NCBI has recently changed the way they distribute the data, which invalidated a section that dealt with getting the data (genomes, sample metadata etc). The book will be rewritten with viral genome analysis in mind - a general topic, how to obtain and analyze viral sequences in general (with the example applications to SARS-CoV-2).

I have opened a new thread to channel discussions on the book there and keep this one more focused on the announcement itself.

How to get Biostar Handbook for free. And participate in building a better educational platform.

Thanks! This is an excellent and relevant resource. I was wondering what your plan is next. Will you add more software examples like PLINK or GEMMA? Or more info on how to approach a basic molecular ecology problem using GWAS?

The current plan is to cover various domains of applications over the next year. ChIP-Seq, small RNA and so on. We're interested in covering other software like PLINK as well.

Many congratulations on publishing the book, yet another feat you (and the other contributors) have achieved after Biostars.org itself. I have already advertised the book to many people who ask me on a daily basis how to start in bioinformatics: clear, concise and to the point. Would love to contribute to the future episodes. xx

Congrats on your book Istvan!!!! Always love it!!!! :)

I think this was the best money I spent in some time. As someone who was totally unfamiliar with the command line, I find that many guides assume too much baseline knowledge about how it works and move straight on to the tools. This book does a great job of introducing you to both at the same time. I might not be an aficionado yet, but I have made a lot of progress in a short time, and am gaining confidence in working outside of the security of the book's instructions. Thank you to all the authors for their continued efforts!

Thank you for sharing your experience. I am always thrilled to hear that the book "works" in practice :-)

Thank you for sharing your experience, but I can't log in to read the book!! Is the Biostar username and password valid?

Did you receive the credentials in your email? I'm assuming you already paid for it.

The book is independent of this website and is not a free resource. This is where you can purchase the book: https://biostar.myshopify.com/

What's the difference in the content of the $25 and $35 handbook? I want to buy one and don't know what the difference is.

It is clearly mentioned in the picture: the $25 gives you access to the book for 6 months; the $35 gives you access for 2 years.

I know all that. I'm talking about the content. What's inside?
I just want to know if the content is the same. That's all.

There is only one version of the book as far as I know. So the content should be the same.

It's a little difficult to find, but the book is web-based and is continually updated with the latest and evolving best practices in the field. That's the reason the sales model behind the book is time-based and not content or copy-number based. The idea is that when you wish to work on/learn a particular technique, you'd learn the current best practice and not outdated tech. It's a continuous effort from the team behind the book, and that's why it's a per-block-of-time cost.

The content is the same at any given time point for both editions. Moreover, readers may keep the last PDF and eBook version they have access to. Thus student edition owners would get access to the version released six months from now as well. What they would not be able to access is the website and the new edition, say, seven months after their purchase. Hence the differences will only appear in time, as new content gets added. For a list of new content that was added: https://www.biostarhandbook.com/public/updates.html

Plus in the past year, an online course was added with another to follow. These are all web-based, so access to these is limited by the subscription terms.

What is the best way to report typos in the handbook? Shall I open a GitHub issue? Any specific format for the same?

Create an issue: https://github.com/biostars/biostar-handbook-issues/issues

If there are multiple typos, include them in a single post.

Looks amazing, I will probably get this to have everything in a single book before starting my PhD. May I ask if you plan to do something on ATAC and Hi-C as well? What about sc Sequencing?
It's already huge, I just hope to see those features in future. Great job and keep going!

Single cell sequencing is a topic I would like to add - but the field feels quite unsettled: for example, which RNA-Seq should we use? ... fifty publications later, well well well maybe we should use the same as regular RNA-Seq... really? hard to believe... All content in the book comes from personal hands-on experiences - hence I need to get involved a little more myself.

@Istvan Albert How can I get this book, please?

The link is in the post :-) https://biostar.myshopify.com/

I bought the course last month and it's fun and informative. Thank you for updating the content.

Hi, I am trying to get my college to buy this book and courses for bioinformatics students. I'd like to know please:

1. The Bioinformatics Data Analysis course mentioned above is not mentioned on the site when buying the books. Is it still included with every purchase?
2. The student pack doesn't mention access to the Python course. Is that not included with the student pack?
3. For further questions in this regard, who can we contact?

Thanks.

While @Istvan will likely respond, you should email contact at biostarhandbook.com with questions.

As genomax points out the right email is contact@biostarhandbook.com. A short answer is that everything is included.

Thanks for your fast replies.

Hi, some small things I found that need to be corrected. In chapter "common data types", the first paragraph ends incompletely. Also, in the same chapter the title "Is there a list of “all” resources?" is repeated from the previous chapter.

Please use the issue tracker for the Biostar Handbook to create a new issue for these corrections.

A nice book. Thank you Istvan!

Is The Art of Bioinformatics Scripting different from the Biostars Handbook?

Yes. It is a new book that is included as a part of your subscription to Biostars Handbook.

Yes, as genomax points out, it is included with the Biostar Handbook. The content in *The Art of Bioinformatics Scripting* started out as a chapter in the Biostar Handbook, but in time it has grown enough to warrant moving it into a separate book.

@Istvan Albert how to get such books please ??

You have to purchase access. It is subscription based. All materials are included in one subscription.

@genomax well, so please share the subscription link

It takes less effort to click on any handbook link in the top-level post and then click the "Get access to the book" green button than it takes to ask others to do the work for you and give you a direct link. The first rule of seeking help anywhere is to invest some effort yourself.

@Istvan many congratulations on this great resource. Wishing you great success.

Hi, I'd like to buy the Biostar Handbook with title "Corona Virus Genome Analysis". However, I do not have the payment methods listed in the link. Is there a way to get around this? Any comments are greatly appreciated.

There is a single "subscription" price for the "suite" of books/training materials. You have to buy that subscription.

Could you kindly let me know how to proceed?

Buy the book at this link.

I tried this earlier and did not buy because I do not have the payment methods...

@Istvan Albert There should be a dedicated part in the upcoming version of the book for visualizing sequencing analysis results, which covers circos, heatmaps, and other visualization methods.

Great! Thanks for the book!

Wonderful, congrats!!

First thing I read was the section on "how not to waste your time". I agree totally wrt Docker and push button type cloud compute environments. The CWL section was hilarious.

3.1 years ago

A huge metagenomics project in my lab failed just because my boss refused to employ a bioinformatician. Of course I didn't know anything in bioinformatics apart from the word itself, so I couldn't save the project. Then I decided, even if it means life or death, I am gonna settle down and learn bioinformatics. Yes, I got started, and day by day, my confidence is growing steadily thanks to "The Biostar Handbook" which I purchased 3 months ago. I think we need to add something like "Teach Yourself Bioinformatics" to the title of this book, because for sure, that's what I am doing with this book. I would highly recommend it for anyone passionate about starting a career in bioinformatics. And I must say that I would even pay more for it because the information in there is worth far more than those peanuts you pay to get the book. It's worth a career!
And now there is also this huge Biostar community with answers to almost every question I as a starter may have, and you get these answers instantly as if there is a robot sitting on the other side with answers ready to respond to your questions. This gives you the feeling that no matter what, you will always get the help you need.

Then there are these guys complaining that the "book should have been for free", and those claiming that the "Biostar community is diminishing the power of Bioinformatics cores". Come on guys, give us a break. You buy a book for only and only $35 and enjoy all those privileges for two good years and you can't say thank you? This is ridiculous!

Glad to hear that it worked out for you.

Adding the subtitle that you suggest is a great idea - I do think that the book is an efficient way to learn bioinformatics from first principles. The second edition is now getting close to release - I'll consider adding the "Teach Yourself Bioinformatics" as a subtitle.

You are also right on the money with the observation that the book teaches skills that can form the basis of a career!

Thanks again for the kind words and welcome to Biostars!

5.4 years ago
jiwpark00 ▴ 210

Congratulations! I've been using Biostars for the past 6+ months so I just bought the book. I'm intermediate-savvy with bioinformatics (I can usually figure out what to do if I search on Biostars or Google but I don't know all the code off the top of my head) so this book will be great in getting my skills up-to-date.

I already like that you included a section on Unix commands - that's something a lot of bioinformatics workshops don't cover in enough depth.

What would be the best way to contribute? Does GitBook work like GitHub? I think it would be great if there were a section on WGCNA (or at least a mention of it). Maybe this is not applicable to everyone but the majority of people I've talked to (and many are wet-lab biologists) really want to be able to take RNA-seq data and put it into WGCNA, but the tutorials available are pretty abstract (the UCLA tutorial requires some experience, for instance).

Overall an amazing tool! It's like an early Christmas gift :-)

I am planning to do a tutorial and vignette of WGCNA later on, after refactoring the code a little. I am adding tests and updating the package in this GitHub repository: https://github.com/llrs/WGCNA.

5.3 years ago

There is a GitHub repository that tracks issues, problems and suggestions for the book.

If you experience any type of technical problem please visit the site below to create a new issue.

https://github.com/biostars/biostar-handbook-issues/issues

5.4 years ago
Sentinel156 ▴ 180

I've been following this for the last couple of months and am really glad to see it released! It was an instant buy for me as a wet-lab PhD student interested in bioinformatics. I was wondering if it is possible to supply an epub version alongside mobi for those without Kindle e-readers? Also, is there a plan to perhaps have an online repository on GitHub or similar for code examples and workflows from the book? This could be similar to Vince Buffalo's Bioinformatics Data Skills book. Overall I can't wait to get stuck into reading it. I'm sure it will save me and many others hours of time searching for help online, so congratulations and thanks again to all involved for all of the hard work putting this together!

Great idea on the GitHub repo. We'll do it - though it might take until next year to deploy.

Oh indeed, my mistake, I thought mobi would work on both. I can and will generate epub formats as well.

I am traveling ATM so I might not be able to do it until later tomorrow.

5.3 years ago

I am the first person in my lab to attempt bioinformatics analysis using the command line (with zero training). This book has been extremely helpful: it covers some important basic concepts and then moves on to complex and practical analysis.

The most brilliant thing about this book is that it is unbelievably fun to read! People in the office caught me laughing out loud while 'studying' bioinformatics several times. The author has a very interesting character! He is passionate about what he does, while occasionally poking fun at the irrational reality. It really feels like having a tutor telling me things beyond the textbook. Thank you for making my clueless attempt surprisingly enjoyable.

Thanks for the feedback. Much appreciated! I was hoping to make learning more enjoyable and fun.

And, as you point out, it was indeed one of my goals to get people to see some of the idiosyncratic aspects of bioinformatics. Without a doubt the field is home to geniuses when it comes to software development - yet at the same time a suspension of disbelief is necessary to deal with the emerging complexity caused by going forward too fast.

Laughing at it (or with it) is how we deal in our group.

2.2 years ago

Hi Istvan! Thank you for this book. In the light of the current situation, I have started using the book in my MSc level course in bioinformatics. The students are now working through the examples and they have so far managed to follow the book quite a bit. We found that this approach also provides a viable practical exercise in using bioinformatics tools and genome analysis. A few points that came up:

• We think that the book should have an explicit statement on authorship. We think it was written mainly by Istvan, but it should maybe list all contributors.
• It is not clear how to cite these books. Would it be possible to assign DOIs and maybe an ISBN? That way, it might be easier to cite them and also for a university library to buy a license.

I think these books have unparalleled utility in applied bioinformatics and making them more accessible might be worth it.

If you are referring to the corona virus book then that was probably entirely written by @Istvan.

Thanks for the feedback Michael. I would recommend citing the book by title and URL.

The majority of the book is original content that I wrote. Others have assisted along the way, and there is a list of contributors on this page (after the main intro).

The sections/chapters that were primarily written by others are marked as such. For example:

https://www.biostarhandbook.com/using-the-david-server.html

or:

https://www.biostarhandbook.com/ming-tangs-guide-to-chip-seq-analysis.html

Contributions by others that mostly consist of corrections and bringing up existing code to original intent were not explicitly marked other than listing the people in the contributor list.

Getting DOI and ISBN numbers is a somewhat bureaucratic process that I've postponed until now. I will look into that. In general, I recommend online access rather than using an ebook or PDF. The web version is updated very frequently, whereas keeping the PDFs up to date is very tedious.

Hello and Happy New Year! After having used the CVGA book with very good results last year, I would like to use the book for a practical session again this spring semester. I particularly liked that one could hand out the document for self-study and hands-on work, and afterwards the students presented their results in class. This helped me dramatically in the transition from physical to online classes. Now, the book contains a statement that it needs to be updated. I am also wondering how to include the new variants in the analysis, e.g. B.1.1.7.

Do you have an expected date when that would be ready? I am planning to have the hands-on session in March or April. If not, we could have an assignment to contribute a piece. In my understanding it is the data retrieval section that needs updating.

Thank you very much,

Michael

Yes, this is my priority as well, I will do my best to rework the Coronavirus chapter by mid-February.

The plan is that by the end of the month the RNA-Seq book will gain a chapter, the "Grouchy Grinch RNA-Seq", where I cover data on which RNA-Seq analysis can go radically off track and what to do about it: antisense transcription, overlapping transcription, transcript integrity, etc. This chapter is well on its way and will be finished by the end of the month.

After that the focus shifts to updating the Coronavirus book and bringing it in line with the current state of data and knowledge.

Great! I have now scheduled a hands-on week in April (9-15) but am keeping it flexible. Hopefully, we can also contribute something this time, even if it is only thorough testing.

5.4 years ago
Daniel ★ 3.9k

This looks great, impressed to see chapters dedicated to things that you can only usually get a handle on from hours or days of google-fu (was looking forward to checking out sratools, entrez and bioawk).

EDIT: I originally had a comment on empty sections, but I'm throwing it out because it's totally misleading. I was totally wrong, and as Istvan pointed out I've skipped straight over the section on analysis and inadvertently found myself in the tools section.

I am not sure I follow. RNA-Seq has detailed sections in the book. There are 12 sections on RNA-Seq with very detailed instructions on how to perform the analysis. There are five different fully worked-out examples, from evaluating control samples to two ways of doing the Zika analysis. It all starts here:

The tool chapters just cover how to install the tools and perhaps how to test them.

I'm really sorry, as you've pointed out, I've scrolled through to the tool pages in the pdf without realising that I've passed over the instructional and data analysis sections. I'm going to edit my comment now.

A couple of suggestions would be to rename "Using Tophat2", "Using Kallisto" to "Installing Tophat2", "Installing Kallisto" etc., and maybe to introduce a hard page break into the contents pages (maybe number them as appendices?), as I didn't realise that the sections had changed when scrolling through 10 contents pages.

It's my fault for skimming quickly though. There is significantly more than I saw on first pass.

This is a good point - that naming is confusing. I will change it and push the new version.

The naming has been changed to "Install" - new pdfs are available as well.

The same applies for every tool. The book is not tool centric - it is task centric.

We never want to just use seqtk; we always have problems that can be solved with seqtk.

One would visit the dedicated seqtk page only when first installing it, or to verify some limitation or problem that this tool might have. That is why each tool has a separate page, and sometimes that page is somewhat empty when the tool does not have many issues affecting it. It is still worth having it on its own page as it makes locating it very easy via search. Just type seqtk in the search box and you'll get the pages.

5.4 years ago
ahramkim1128 ▴ 10

On the page https://read.biostarhandbook.com/ontology/sequence-ontology.html, these two commands don't work:

1. URL=https://raw.githubusercontent.com/The-Sequence-Ontology/SO-Ontologies/master/so-xp-simple.obo
2. curl $URL > so.obo

There is no file at that URL. Please fix this in the next update. :)

Please try this URL instead. We will correct it in the book.

Goes to show just how links "rot" within a month or two. Starting with next year we will do nightly builds of all code in the book - I now realize that the book itself needs to be treated as if it were software - needs to be re-built every day (or perhaps every hour)...

The link has now been fixed in the web version and will make it into the PDF versions once these are updated.
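The idea of rebuilding the book like software can be sketched as a small link checker. This is only a minimal sketch under stated assumptions, not the tooling actually used for the book: the function names `extract_urls`, `check_url` and `check_links` are hypothetical, the chapters are assumed to be plain-text or Markdown files, and `curl` is assumed to be installed.

```shell
# Hypothetical helper: print every unique http(s) URL found on stdin.
extract_urls() {
    # grep exits 1 when nothing matches; treat that as "no URLs".
    { grep -Eo 'https?://[^[:space:]"<>)]+' || true; } |
        sed 's/[.,;:]*$//' |      # drop trailing sentence punctuation
        LC_ALL=C sort -u
}

# Hypothetical helper: succeed when the URL answers without an HTTP error.
check_url() {
    curl --silent --fail --location --head --max-time 20 \
         --output /dev/null "$1"
}

# Hypothetical helper: report dead links in the files given as arguments.
# Returns 1 if at least one link is broken, 0 otherwise.
check_links() {
    broken=$(cat "$@" | extract_urls | while read -r url; do
        check_url "$url" || echo "BROKEN: $url"
    done)
    if [ -z "$broken" ]; then
        return 0
    fi
    printf '%s\n' "$broken"
    return 1
}
```

A nightly job could then run something like `check_links chapters/*.md` (the path is an assumption) and alert on a non-zero exit status. Some servers reject HEAD requests, so a link reported as broken may still deserve a manual check.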

wonderdump doesn't work. I did

1. curl http://data.biostarhandbook.com/scripts/wonderdump.sh > ~/bin/wonderdump
2. chmod +x ~/bin/wonderdump
3. wonderdump SRR1972739 -X 10000 --split-files

and I saw "wonderdump: no command for wonderdump."

Can you post the output for ls -l ~/bin/wonderdump and which wonderdump?

First two commands you have above should "install" the wonderdump script.
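One possible cause, not confirmed anywhere in this thread, is that `~/bin` is not listed in `PATH`, so the shell cannot find the script even though it is installed and executable. A minimal sketch of how to check that, where the helper name `on_path` is hypothetical:

```shell
# Hypothetical helper: succeed if directory $1 is listed in PATH.
on_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if on_path "$HOME/bin"; then
    echo "$HOME/bin is on PATH"
else
    echo "$HOME/bin is missing from PATH; adding it for this session"
    PATH="$HOME/bin:$PATH"
    export PATH
fi

# Show what the shell would run for the command, if anything.
command -v wonderdump || echo "wonderdump still not found"
```

The change above lasts only for the current session; making it permanent would mean adding the same `PATH` line to the shell's startup file.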

The output of ls -l ~/bin/wonderdump is

-rwxrwxrwx 1 AR AR 1057 Jan 5 11:15 /home/AR/bin/wonderdump

Is there anything wrong?

This is not the place to troubleshoot or support the code in the book. Please open a new question or send an email to the contact email.

5.3 years ago
reza.jabal ▴ 520

Well done Istvan for this great job. It's absolutely a valuable resource for current and future students! But why not deliver a course on Coursera or edX based on this book?

The book is used to support an existing course offered in residence, and based on student evaluations it was successful at that.

An online course based on the book is indeed a more natural and logical step - we'll see what the future holds. I am not particularly fond of the Coursera platform - I find its interface counterintuitive, it is difficult to learn with, and it feels limiting with respect to course design.

5.3 years ago
fanicesiza ▴ 10

In the Gene Ontology section of this book, there is no file at the location: http://geneontology.org/gene-associations/gene_association.goa_human.gz Please check, thanks.

Thanks for the note. It turns out that on January 8th, 2017 the Gene Ontology changed not just the file naming but even the formats. Creating a link and results checker for the book is more urgent than ever. This will be the priority this month.

I always knew that data and links are brittle in bioinformatics, but the magnitude and frequency of the problems surprised me. Just two months after the book's release, a whole bunch of links had to be changed.

The URL in question has been corrected and the examples will execute but I will need to revisit that chapter and investigate the matter a little deeper as I believe that there are more profound changes there.

By the way, I also expect a whole slew of GO enrichment tools to break as a result of this change.

A new site for issue tracking for the book will be added soon. Details to follow.

5.3 years ago
germelcar ▴ 20

Is it correct to cite the handbook? If yes, how would I cite it?

BTW, what about de novo transcriptome assembly? It would be great to include a section on that.

For fastq-dump, I see that the Handbook uses version 2.5.2. I remember a notice from SRA-Toolkit's github page saying that one should upgrade because of the https update on the NCBI platform. I see also that the "--split-files" option for paired-end data is used, what about "--split-3" option? I have noticed some subtle differences (and better results for using Trinity) when I use "--split-3" option instead of "--split-files" option.

Create an "issue" for the fastq-dump question here.

5.2 years ago
Lila M ★ 1.1k

The book is great, but I would like to have some information related to ChIP-Seq as well.

The ChIP-Seq chapter is scheduled for the end of this month (early March latest).

Looking forward to it!

I am still waiting for it! :D

It will come a bit later - hopefully this month.

We hit some issues - mostly that ChIP-Seq studies appear to be even more subjective and sensitive to parameters than initially thought. Hence we are contacting authors and trying to work with them to figure out how they got to the conclusions.

The ChIP-Seq chapter is now out.

I'm reading it! Thank you very much :)

5.2 years ago

I just purchased the book, but I have Windows 7. Can anyone help me download the right programs needed to work through the book that are available on Windows 7?

This thread is not the right forum for this question. Please read the first paragraph in the first post.

5.2 years ago
kenanh ▴ 10

Great book!

In the section How to visualize genomic variation (What would realistic and good data look like?) it says that the script simulate-experimental-data.sh will generate a file called results.bam. It actually generates a file called align.bam.

In the section Variant effect prediction (How do I use snpEff?) the link http://data.biostarhandbook.com/variant/find-ebola-variants.sh results in a file not found error. Please check, thanks.

https://github.com/biostars/biostar-handbook-issues/issues

In this case I have opened an issue for it:

https://github.com/biostars/biostar-handbook-issues/issues/10

Edit: the issue has now been resolved.

2.5 years ago

### November 1st, 2019: The Art of Bioinformatics Scripting

The Biostar Handbook has grown huge :-) it is now close to 1000 pages! To manage this complexity we have started reworking the various chapters into independent books.

The reorganization will allow readers to more easily locate the information that they need. It will allow us to design and formulate specific training plans that are customized to specific needs. The first book section that has been reworked covers Unix Scripting and is titled:

As always all new content is included with your subscription.

5.4 years ago
mcc ▴ 80

Has Biostar now become a sales and advertising platform for 'for-profit' books and guides?

Is Biostar the proper forum for a book that, by the sound of it, is based on the free content that is found without charge on this educational platform? Although I applaud the efforts of the writers and editors of this book to aggregate the material, I question the idea of having a booklet that is a pet project for some advertised in such an open-source environment.

nah, it is not books and guides - just one book - the real deal, the Biostar Handbook. :-)

and it is not "for profit" either - it is about "loss reduction" mainly ... running this site over the past seven years has generated costs that go well over five digits (and in US dollars, that is), or even six if you count people's time. I am thrilled to have come up with a solution that does several things at once: supports the site, imparts valuable knowledge, helps the people that need this knowledge, and does that really cheaply.

What you seem to complain about, that unregistered users may be shown ads a few times is a very small inconvenience.

Finally if you think that it is easy to write this book - you are welcome to try it yourself, then when done release under any licensing of your choice.

Just adding my two cents here. This is an amazing book which covers a lot of material in a very smooth manner. I also love the fact that any added chapters (and lectures) may be accessed by paying very little extra. It certainly goes over the basics, but it also covers some excellent tips for folks who are not "beginners". For the past few months I have been recommending this book to several people as I truly think it will help them.

Given the amount of effort that went into writing the material and maintaining this forum, it goes without saying that Dr. Albert has the right to add this thread about the book. And thank you, Dr. Albert, for the excellent material that you shared.

Thanks for the kind words. Much appreciated - I am most pleased by the observation that the book is not just for beginners. I have made a concerted effort to have something useful to say even to those that are more experienced.

Um, you're welcome to continue using the site and its content gratis, so it's unclear what your beef is...

5.3 years ago
Diego ▴ 50

This is amazing! Coming from software engineering, I'm pretty new to this "bio" part of bioinformatics. I'm going to buy it as soon as I can.

4.7 years ago
najibveto ▴ 70

Thanks a lot for this amazing book; I am enjoying reading the online version. I would like to know whether it is possible to add some chapters, such as:

• study of microRNA-Seq in model and non-model organisms.
• study of RNA-Seq in non-model organisms.

Thanks a lot for your hard work.

Thank you for the kind words - much appreciated.

A chapter on small RNA analysis would indeed be of interest to many readers and it is a section that is in the planning stages. In any case, it would be pursued after the current task of finishing the online course on the existing materials.

4.0 years ago
BioBing ▴ 150

I love it! Thanks for an amazing book - and the courses in bioinformatics and python!!!

When following the courses, is there a forum/thread to discuss the different lectures and topics?

Thanks for the kind words. I am happy to hear that the resources are effective.

You are right that we ought to have a forum of some sort - especially since the site now hosts a book and two courses all going in parallel. We are working on addressing this shortcoming. We will likely have a forum in place this Summer - it will be announced on the mailing list.

In the Fall the bioinformatics course will likely gain a few video sections as well, similar to what the Python course has.

Perhaps the Biostars Slack group would be ideal for this? You could request to create a channel for the book itself and then for the lectures. It is worth a try (maybe for a test period of time).

+1 on this idea - It's a significant value addition to the Slack workspace

Thanks for the book, Istvan. Last year, I also bought a copy of the book with two years of free updates. I was wondering whether you have planned the next update? I haven't seen any content update since Sep 2017.

There was more new content over 2017 - just a different kind - I have explored different directions this year:

1. An online course has been added with 30 lectures finished in December 2017. I would recommend taking (and re-taking) the tests.
2. A Python course has been started and hopefully will be finished this year.

There may be new chapters this Fall when I teach the course again - though I would expect something new there as well.

3.8 years ago
Fanta • 0

What are the pre-requisites to be able to use the book (and the training) for self-teaching?

Willingness

Intelligence.

2.8 years ago
dago ★ 2.7k

Hi guys, the book is great! Is there any plan to extend the metagenomics part? It would be nice to include other tools and approaches for both WGS and 16S metagenomes. In case you can suggest other tutorials outside the book, I would also be happy to go over them! Thank you!

Hello dago - your suggestion to list additional learning resources after each chapter is excellent.

We will add that over the next semester.

On new chapters, it is a bit of a challenge to figure out what to cover next. PacBio, single-cell RNA-Seq and metagenomics are all appropriate subject areas. We'll see how the book evolves, we'll have new material, just not sure yet what topic ;-)

PacBio would be great.

2.0 years ago
ATCG ▴ 370

Hi, the Biostar Handbook is one of the best sources I have encountered in my long journey as a molecular biologist transitioning almost full-time to the field of bioinformatics. I am very interested to know your opinion about the Reactome database; I noticed that it is not covered as one of the tools for functional analysis and pathway analysis. I have used the online database https://reactome.org and recently I learned of the ReactomePA R package. I find it to be extremely user friendly and also, most importantly, comprehensive, covering several model organisms. It would be helpful for me if you could share your opinion on this particular database. Thank you!

Thanks for the feedback. Appreciated.

Yes, I agree that the book should describe Reactome (and KEGG as well) as the next step when performing functional analysis. I will look at integrating more information on each. I am glad that you discovered them on your own.

That being said, the decision making while using these resources (and functional analysis in general) can be somewhat subjective; proper interpretation of the results requires more in-depth expertise regarding the specific biological problem that is being studied. Hence it is more difficult to cover the subject with the same approach as the rest of the book. But I will seriously consider adding a chapter before the next semester.

21 months ago
ATCG ▴ 370

This book has been an amazing resource for me. I would love to have updates that include the analysis of Ribo-Seq data or the usage of Python packages and tools: Biopython, Anaconda, conda, Miniconda, etc. Some suggestions are:

Thanks for the feedback, appreciated. We'll take the recommendation under consideration. Thanks for the links.

4 weeks ago

