Question: Good Books And Resources For Figuring Out Biostatistics?
21
8.2 years ago by
United States

Despite having an education in 'bioinformatics', my current 'bioinformatics programmer' job involves somewhat more biostatistics than I am comfortable with. Sure, I could just plug numbers into all of the equations that pop up in the literature, but I feel rather uncomfortable answering the questions that people ask, and I'll also be putting together data processing workflows for microarrays, etc. My only formal biostats class involved a professor who was a brilliant stats guy but didn't really teach, so it was a poor experience. Basic stats is easy, but much of what I've seen of statistical processing for microarrays makes it seem like these statistical methods are just being pulled out of a magic hat. Working with R makes it all feel like a black box, and that's somewhat discomforting.

So the summary is that I'd like to see if anyone has good recommendations on books/material for figuring out biostats for people who aren't exactly math-oriented, with particular regard to microarrays and the analysis of data originating from high-throughput methods.

books statistics • 8.3k views
modified 1 day ago by ATpoint21k • written 8.2 years ago by Adamc600
19
8.2 years ago by
Manhattan, NY

A few resources that I found extremely useful for statistical analysis / interpretation of biological data:

• Handbook of Biological Statistics (html): this resource will help you think through the various steps of a statistical analysis.

• For microarray analysis I would happily recommend "Microarrays For An Integrative Genomics" by Kohane, Kho and Butte. This is an amazing book on genomics, written in an easily accessible, textbook-style format. It explains various aspects of microarray analysis (biology, statistics, analysis, interpretation) in great detail. It does not discuss any programming language, but it provides pseudo-code to explain the concepts, which you can easily adapt to your language of interest.

• This is an incredible lecture by Professor Warren Ewens that introduces biostatistics from the perspective of genetics and genomics.

It is difficult to point to a single book that covers the various statistical approaches in "high-throughput biology". IMHO, biological experiments use just about every statistical technique out there. The statistical method that you should apply to your dataset will depend on various aspects, including your data and the question you want to answer. So, here I would like to point you to generic resources that you can use to better understand various statistical tests / models / methods.

• Statistics materials at Wolfram, Statistica, CMH website

• Statistics from machine learning / computer science perspective: http://www-stat.stanford.edu/~tibs/ElemStatLearn/ This will be useful if you are dealing with machine learning based approaches for the analysis of your biological data

• POLS Statistics is a very useful resource. For example, here is a link with a good description of all the major distributions that you will encounter in statistics.

• I would like to recommend a recent book that covers various generic statistical concepts from a mathematical perspective. See Data Analysis with Open Source Tools, a highly readable book that provides a good understanding of statistical methods such as modeling, analysis and data mining, with good descriptions of the mathematical / statistical background of the concepts. The book also uses open source tools like NumPy and gnuplot for the analysis and visualization of data.

• Think Stats: Probability and Statistics for Programmers PDF, which is a nice resource (also available from O'Reilly as a printed book)

• I also consult R / Bioconductor package vignettes to understand the statistical background behind the tests employed in individual packages.

That O'Reilly data analysis book looks pretty interesting, not sure if I've seen that before.

Great lecture by Professor Warren Ewens introducing biostatistics from the perspective of genetics and genomics. Thanks for the link.

10
8.2 years ago by
Boboppie530
Cambridge, UK
Boboppie530 wrote:

I just bought this one - "Intuitive Biostatistics". Really enjoyed the chapter on p-values ;)

Looks interesting. I haven't heard of this one. The Amazon reviews look really good for it too.

Thanks. I have seen this book in the lab, I should read it.

Saw that one on Amazon before, but wasn't sure how good it was, so thanks for the tip.

6
7.7 years ago by
Gjain5.3k
Munich, Germany
Gjain5.3k wrote:

This is a problem I have faced a lot myself. The books and resources mentioned above are all excellent. A couple of other resources that were very helpful to me are:

1. Choosing and Using Statistics: A Biologist's Guide (link)
2. Statistics at the Bench: A Step-by-Step Handbook for Biologists (link)

Apart from these books, here are some articles that were very helpful to me recently:

1. P-values, False Discovery Rate (FDR) and q-values (link)
2. Principal component analysis:
• What is principal component analysis? (link)
• Principal component analysis of genetic data (link)
3. Collection of Biostatistics Research Archive (link)
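
To make the FDR idea from item 1 a little less of a black box, here is a minimal sketch of the Benjamini-Hochberg adjustment in plain Python. It is not taken from the linked article: the function name `bh_adjust`, the five p-values and the 0.10 cutoff are all made up for illustration. Note also that BH-adjusted p-values are closely related to, but not identical to, Storey-style q-values. In real work you would more likely call R's `p.adjust(p, method = "BH")`.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down; the running minimum keeps the
    # adjusted values monotone and capped at 1.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for five genes (illustrative only).
pvals = [0.01, 0.04, 0.03, 0.002, 0.20]
adj = bh_adjust(pvals)

# Genes called significant at an (assumed) FDR cutoff of 0.10.
significant = [i for i, q in enumerate(adj) if q <= 0.10]
print(adj)
print(significant)
```

Each raw p-value is scaled by (number of tests / its rank), so small p-values that sit among many tests are penalised more than the raw numbers suggest; that is roughly what microarray packages are doing when they report "adjusted" p-values.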

One very nice place to learn general concepts about statistics which can be applied in this field: Khan Academy Statistics videos (link)

5
8.2 years ago by
Michael Barton1.8k
Akron, Ohio, United States
Michael Barton1.8k wrote:

I think "Statistics: an introduction using R" is a good book. It is short and to the point on the theory and application of the common basic statistical methods. The R examples could also be useful if you're currently using this language.

1
2.7 years ago by
uzparacha10
uzparacha10 wrote:

I have recently published an ebook on Biostatistics.

Biostatistics – When Pain becomes Treatment- http://amzn.to/2gh2M1F

Hopefully, it will be of help to you...

1
1 day ago by
ATpoint21k
Germany
ATpoint21k wrote:

https://www.huber.embl.de/msmb/

0
1 day ago by
lavinia.andreea23890 wrote:

This page has a pretty wide list of resources that covers the whole gamut, from the actual maths side of things to graphs, software code and clinical study design: https://www.anatomisebiostats.com/resources.html