Introduction to Python for biologists

http://www.prstatistics.com/course/introduction-to-python-for-biologists-ipyb04/

This course is being delivered by Dr Martin Jones, an expert in Python and author of two textbooks:

Python for Biologists [http://www.amazon.com/Python-Biologists-complete-programming-beginners/dp/1492346136/]

Advanced Python for Biologists [http://www.amazon.com/Advanced-Python-Biologists-Martin-Jones/dp/1495244377/]

This course will run from 27th November to 1st December 2017 at Margam Discovery Centre, Wales.

Both course-only and all-inclusive packages (including accommodation and meals) are available.

This course is followed by the course "Data visualisation and manipulation using Python", which will be held at the same venue for convenience. A 10% discount will be applied to the total when the two courses are booked together.

http://www.prstatistics.com/course/data-visualisation-and-manipulation-using-python-dvmp01/

Course overview: Python is a dynamic, readable language that is a popular platform for all types of bioinformatics work, from simple one-off scripts to large, complex software projects. This workshop is aimed at complete beginners and assumes no prior programming experience. It gives an overview of the language with an emphasis on practical problem-solving, using examples and exercises drawn from various aspects of bioinformatics work. After completing the workshop, students should be in a position to (1) apply the skills they have learned to tackle problems in their own research and (2) continue their Python education in a self-directed way.

Intended audience: This workshop is aimed at all researchers and technical workers with a background in biology who want to learn programming. The syllabus has been planned with complete beginners in mind; people with previous programming experience are welcome to attend as a refresher but may find the pace a bit slow.

Teaching format: The workshop is delivered over ten half-day sessions (see the detailed curriculum below). Each session consists of a lecture of roughly one hour followed by two hours of practical exercises, with breaks at the organizer’s discretion. There will also be plenty of time for students to discuss their own problems and data.

Assumed background: Students should have enough biological background to appreciate the examples and exercise problems (i.e. they should know about DNA and protein sequences, what translation is, and what introns and exons are). No previous programming experience or computer skills (beyond the ability to use a text editor) are necessary, but you'll need to have a laptop with Python installed.

Curriculum:

Day 1

Module 1: Introduction.

We will start with a general introduction to Python and explain why it is useful and how learning to program can benefit your research. Some time will be taken to explain the format of the course. We will outline the edit-run-fix cycle of software development and talk about how to avoid common text editing errors. In this session, we will also check that the computing infrastructure for the rest of the course is in place. Core concepts introduced: source code; text editors; whitespace; syntax and syntax errors; and Python versions.

Module 2: Output and text manipulation.

This session will show students how to write very simple programs that produce output to the terminal and in doing so become comfortable with editing and running Python code. This session also introduces many of the technical terms that we’ll rely on in future sessions. We will run through some examples of tools for working with text and show how they work in the context of biological sequence manipulation. We also cover different types of errors and error messages and learn how to go about fixing them methodically. Core concepts introduced: terminals; standard output; variables and naming; strings and characters; special characters; output formatting; statements; functions; methods; arguments; comments.
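As a rough illustration of the kind of program this session has in mind, the sketch below stores a DNA sequence in a variable, calls a few string methods on it and prints formatted output. The sequence and variable names are invented for this example and are not taken from the course materials; Python 3 is assumed.

```python
# A made-up DNA sequence, used only for illustration.
dna = "ATGCGATACGCTTGA"

# len() is a function; count() and replace() are methods called on the string.
length = len(dna)
a_count = dna.count("A")
t_count = dna.count("T")

# Replace every T with U to get the RNA version of the same sequence.
rna = dna.replace("T", "U")

# Output formatting: combine several values into one line of text.
at_content = (a_count + t_count) / length
summary = "length={} A={} AT-content={:.2f}".format(length, a_count, at_content)

print(summary)
print(rna)
```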

Day 2

Module 3: File IO and user interfaces.

We will discuss the importance of files in bioinformatics pipelines and workflows during this session, and then explore the Python interfaces for reading from and writing to files. This involves introducing the idea of types and objects and a bit of discussion about how Python interacts with the operating system. The practical session is spent combining the techniques from session 2 with the file IO tools to create basic file-processing scripts. Core concepts introduced: objects and classes; paths and folders; relationships between variables and values; text and binary files; newlines.
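A minimal sketch of reading and writing text files might look like the following. The file name sequences.txt is hypothetical, and a temporary folder is used only so that the example is self-contained; Python 3 is assumed.

```python
import os
import tempfile

# Hypothetical file name inside a temporary folder.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "sequences.txt")

# Open the file for writing; each write() needs its own newline character.
with open(path, "w") as out_file:
    out_file.write("ATGCGT\n")
    out_file.write("TTGACA\n")

# Open the same file for reading and load its whole contents as one string.
with open(path) as in_file:
    contents = in_file.read()

# Strip the final newline, then split the text into one string per line.
lines = contents.rstrip("\n").split("\n")
print(lines)
```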

Module 4: Flow control 1: loops.

A discussion of the limitations of the techniques learned in session 3 quickly reveals that flow control is required to write more sophisticated file-processing programs. At this point we will progress to the concept of loops. We look at the way in which Python loops work, and how they can be used in a variety of contexts. We explore the use of loops and lists together to tackle some more difficult problems. Core concepts introduced: lists and arrays; blocks and indentation; variable scoping; iteration and the iteration interface; ranges.
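To give a flavour of loops and lists working together, here is a small sketch with invented sequences. The first loop iterates over a list directly; the second uses a range to step through a sequence three bases at a time.

```python
# Made-up sequences, used only for illustration.
sequences = ["ATGC", "GGCCTA", "TTA"]

# Loop over a list: the indented block runs once per element.
lengths = []
for seq in sequences:
    lengths.append(len(seq))

# Loop over a range: start at 0 and move forward in steps of 3.
dna = "ATGCGATAC"
codons = []
for start in range(0, len(dna), 3):
    codons.append(dna[start:start + 3])

print(lengths)
print(codons)
```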

Day 3

Module 5: Flow control 2: conditionals.

We will use the idea of decision-making in session 5 as a way to introduce conditional tests and outline the different building-blocks of conditions before showing how conditions can be combined in an expressive way. We look at the different ways that we can use conditions to control program flow, and how we can structure conditions to keep programs readable. Core concepts introduced: truth and falsehood; Boolean logic; identity and equality; evaluation of statements; branching.
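The sketch below shows one way that conditions might be combined to control program flow. The sequences, the 0.5 GC threshold and the label text are all assumptions made for this example; Python 3 is assumed.

```python
# Made-up sequences, used only for illustration.
sequences = ["ATGCGC", "ATATAT", "GGGCCC", "ATGAAA"]

labels = []
for seq in sequences:
    # Fraction of bases that are G or C.
    gc = (seq.count("G") + seq.count("C")) / len(seq)

    # Branching: the first condition that is true decides the label.
    if seq.startswith("ATG") and gc > 0.5:
        labels.append("start codon, GC-rich")
    elif seq.startswith("ATG"):
        labels.append("start codon")
    elif gc > 0.5:
        labels.append("GC-rich")
    else:
        labels.append("other")

print(labels)
```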

Module 6: Organizing and structuring code.

In session 6 we will discuss functions that we would like to see in Python before considering how we can add to our computational toolbox by creating our own. We examine the nuts and bolts of writing functions before looking at best-practice ways of making them usable. We also look at a couple of advanced features of Python – named arguments and defaults. Core concepts introduced: argument passing; encapsulation; data flow through a program.
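As an illustration of writing our own function with a default and a named argument, consider the sketch below. The function name at_content and its sig_figs parameter are hypothetical choices for this example; Python 3 is assumed.

```python
def at_content(dna, sig_figs=2):
    """Return the fraction of A and T bases, rounded to sig_figs decimal places."""
    dna = dna.upper()  # accept lower-case input as well
    a_and_t = dna.count("A") + dna.count("T")
    return round(a_and_t / len(dna), sig_figs)


# Call with the default rounding, then with a named argument.
default_result = at_content("ATGC")
named_result = at_content("aattgc", sig_figs=3)

print(default_result)
print(named_result)
```

Because the calculation is encapsulated in the function, the calling code only needs to know what goes in and what comes out.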

Day 4

Module 7: Regular expressions.

A range of common problems in bioinformatics can be described in terms of pattern matching; we will discuss these and give an overview of Python’s regex tools. We look at the building blocks of regular expressions themselves, and learn how they are a general solution to the problem of describing patterns in strings, before practising writing some specific examples of regular expressions. Core concepts introduced: domain-specific languages; sessions and namespaces.
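A small sketch of pattern matching with Python's re module is given below. The DNA string is invented; the two patterns are the EcoRI (GAATTC) and BamHI (GGATCC) recognition sites, chosen here only as familiar examples.

```python
import re

# Made-up sequence containing one EcoRI site and one BamHI site.
dna = "ATCGCGAATTCACGGATCCTT"

# Search for a single fixed pattern; the match object records where it was found.
ecori_match = re.search(r"GAATTC", dna)
ecori_start = ecori_match.start()

# Alternation (|) matches either pattern; finditer() yields every match in turn.
site_starts = []
for match in re.finditer(r"GAATTC|GGATCC", dna):
    site_starts.append(match.start())

# A character class matches any one of the listed bases at that position.
has_ambiguous_site = re.search(r"GG[AT]TCC", dna) is not None

print(ecori_start, site_starts, has_ambiguous_site)
```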

Module 8: Dictionaries.

We discuss a few examples of key-value data and see how the problem of storing them is a common one across bioinformatics and programming in general. We learn about the syntax for dictionary creation and manipulation before talking about the situations in which dictionaries are a better fit than the data structures we have learned about thus far. Core concepts introduced: paired data types; hashing; key uniqueness; argument unpacking and tuples.
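The following sketch illustrates two typical uses of dictionaries: looking up a value by key and counting occurrences. The codon table holds only four entries from the standard genetic code, and the "X" placeholder for unknown codons is an assumption made for this example.

```python
# A small subset of the standard genetic code ("*" marks a stop codon).
codon_table = {"ATG": "M", "TTT": "F", "GGA": "G", "TAA": "*"}

dna = "ATGTTTGGATAA"

# Look up each codon; get() returns "X" if the codon is not in the table.
protein = ""
for start in range(0, len(dna), 3):
    codon = dna[start:start + 3]
    protein += codon_table.get(codon, "X")

# Count bases: each key is unique, so repeated bases update the same entry.
base_counts = {}
for base in dna:
    base_counts[base] = base_counts.get(base, 0) + 1

# items() gives (key, value) tuples, unpacked here into two variables.
for base, count in sorted(base_counts.items()):
    print(base, count)

print(protein)
```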

Day 5

Module 9: Interaction with the file system.

In the final session we discuss the role of Python in the context of a bioinformatics workflow, and how it is often used as a language to “glue” various other components together. We then look at the Python tools for carrying out file and directory manipulation, and for running external programs – two tasks that are often necessary in order to integrate our own programs with existing ones. Core concepts introduced: processes and sub-processes; the shell and shell utilities; program return values.
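A minimal sketch of these two tasks, using the standard os and subprocess modules, is shown below. The folder and file names are hypothetical, and the external program is simply a second Python interpreter so that the example runs on any machine with Python installed.

```python
import os
import subprocess
import sys
import tempfile

# Directory manipulation: create a results folder inside a temporary folder.
work_dir = tempfile.mkdtemp()
results_dir = os.path.join(work_dir, "results")
os.mkdir(results_dir)

# Create two empty files, then list the folder contents.
for name in ["b.fasta", "a.fasta"]:
    open(os.path.join(results_dir, name), "w").close()
file_names = sorted(os.listdir(results_dir))

# Run an external program as a sub-process and collect its return value.
# By convention, 0 means success and anything else signals a problem.
ok_code = subprocess.call([sys.executable, "-c", "print('hello from a sub-process')"])
error_code = subprocess.call([sys.executable, "-c", "import sys; sys.exit(3)"])

print(file_names, ok_code, error_code)
```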

Please email any inquiries to oliverhooker@prstatistics.com or visit our website www.prstatistics.com

Please feel free to distribute this material anywhere you feel is suitable.

Upcoming courses - email for details oliverhooker@prstatistics.com

ADVANCES IN MULTIVARIATE ANALYSIS OF SPATIAL ECOLOGICAL DATA USING R #MVSP 3rd – 7th April 2017, Scotland, Prof. Pierre Legendre, Dr. Olivier Gauthier http://www.prstatistics.com/course/advances-in-spatial-analysis-of-multivariate-ecological-data-theory-and-practice-mvsp02/

ADVANCING IN STATISTICAL MODELLING FOR EVOLUTIONARY BIOLOGISTS AND ECOLOGISTS USING R #ADVR 17th – 21st April 2017, Scotland, Dr. Luc Bussiere, Dr. Ane Timenes Laugen http://www.prstatistics.com/course/advancing-statistical-modelling-using-r-advr06/

CODING, DATA MANAGEMENT AND SHINY APPLICATIONS USING RSTUDIO FOR EVOLUTIONARY BIOLOGISTS AND ECOLOGISTS #CDSR 15th – 19th May, Scotland, Dr. Aline Quadros http://www.prstatistics.com/course/coding-data-management-and-shiny-applications-using-rstudio-for-evolutionary-biologists-and-ecologists-cdsr01/

GEOMETRIC MORPHOMETRICS USING R #GMMR 5th – 9th June 2017, Scotland, Prof. Dean Adams, Prof. Michael Collyer, Dr. Antigoni Kaliontzopoulou http://www.prstatistics.com/course/geometric-morphometrics-using-r-gmmr01/

MULTIVARIATE ANALYSIS OF SPATIAL ECOLOGICAL DATA #MASE 19th – 23rd June, Canada, Prof. Subhash Lele, Dr. Peter Solymos http://www.prstatistics.com/course/multivariate-analysis-of-spatial-ecological-data-using-r-mase01/

TIME SERIES MODELS FOR ECOLOGISTS USING R (JUNE 2017) #TSME 26th – 30th June, Canada, Dr. Andrew Parnell http://www.prstatistics.com/course/time-series-models-foe-ecologists-tsme01/

BIOINFORMATICS FOR GENETICISTS AND BIOLOGISTS #BIGB 3rd – 7th July 2017, Scotland, Dr. Nic Blouin, Dr. Ian Misner http://www.prstatistics.com/course/bioinformatics-for-geneticists-and-biologists-bigb02/

META-ANALYSIS IN ECOLOGY, EVOLUTION AND ENVIRONMENTAL SCIENCES #METR01 24th – 28th July, Scotland, Prof. Julia Koricheva, Prof. Elena Kulinskaya http://www.prstatistics.com/course/meta-analysis-in-ecology-evolution-and-environmental-sciences-metr01/

SPATIAL ANALYSIS OF ECOLOGICAL DATA USING R #SPAE 7th – 12th August 2017, Scotland, Prof. Jason Matthiopoulos, Dr. James Grecian http://www.prstatistics.com/course/spatial-analysis-ecological-data-using-r-spae05/

ECOLOGICAL NICHE MODELLING USING R #ENMR 16th – 20th October 2017, Scotland, Dr. Neftali Sillero http://www.prstatistics.com/course/ecological-niche-modelling-using-r-enmr01/

INTRODUCTION TO BIOINFORMATICS USING LINUX #IBUL 16th – 20th October, Scotland, Dr. Martin Jones http://www.prstatistics.com/course/introduction-to-bioinformatics-using-linux-ibul02/

GENETIC DATA ANALYSIS AND EXPLORATION USING R #GDAR 23rd – 27th October, Wales, Dr. Thibaut Jombart, Zhian Kavar http://www.prstatistics.com/course/genetic-data-analysis-exploration-using-r-gdar03/

STRUCTURAL EQUATION MODELLING FOR ECOLOGISTS AND EVOLUTIONARY BIOLOGISTS USING R #SEMR 23rd – 27th October, Wales, Prof Jarrett Byrnes, Dr. Jon Lefcheck http://www.prstatistics.com/course/structural-equation-modelling-for-ecologists-and-evolutionary-biologists-semr01/

LANDSCAPE (POPULATION) GENETIC DATA ANALYSIS USING R #LNDG 6th – 10th November, Wales, Prof. Rodney Dyer http://www.prstatistics.com/course/landscape-genetic-data-analysis-using-r-lndg02/

APPLIED BAYESIAN MODELLING FOR ECOLOGISTS AND EPIDEMIOLOGISTS #ABME 20th - 25th November 2017, Scotland, Prof. Jason Matthiopoulos, Dr. Matt Denwood http://www.prstatistics.com/course/applied-bayesian-modelling-ecologists-epidemiologists-abme03/

INTRODUCTION TO REMOTE SENSING AND GIS APPLICATIONS FOR ECOLOGISTS #IRMS 27th Nov – 1st Dec, Wales, Dr Duccio Rocchini, Dr. Luca Delucchi http://www.prstatistics.com/course/introduction-to-remote-sensing-and-gis-for-ecological-applications-irms01/

INTRODUCTION TO PYTHON FOR BIOLOGISTS #IPYB 27th Nov – 1st Dec, Wales, Dr. Martin Jones http://www.prstatistics.com/course/introduction-to-python-for-biologists-ipyb04/

DATA VISUALISATION AND MANIPULATION USING PYTHON #DVMP 11th – 15th December 2017, Wales, Dr. Martin Jones http://www.prstatistics.com/course/data-visualisation-and-manipulation-using-python-dvmp01/

ADVANCING IN STATISTICAL MODELLING USING R #ADVR 11th – 15th December 2017, Wales, Dr. Luc Bussiere, Dr. Tom Houslay, Dr. Ane Timenes Laugen http://www.prstatistics.com/course/advancing-statistical-modelling-using-r-advr07/

INTRODUCTION TO BAYESIAN HIERARCHICAL MODELLING #IBHM 29th Jan – 2nd Feb 2018, Scotland, Dr. Andrew Parnell http://www.prstatistics.com/course/introduction-to-bayesian-hierarchical-modelling-using-r-ibhm02/

ANIMAL MOVEMENT ECOLOGY (February 2018) #ANME ??th - ??th February 2018, Wales, Dr Luca Borger, Dr. John Fieberg

AQUATIC TELEMETRY DATA ANALYSIS USING R (TBC) #ATDAR ??th - ??th February 2018, Wales

FUNCTIONAL ECOLOGY FROM ORGANISM TO ECOSYSTEM: THEORY AND COMPUTATION #FEER 5th – 9th March 2018, Scotland, Dr. Francesco de Bello, Dr. Lars Götzenberger, Dr. Carlos Carmona http://www.prstatistics.com/course/functional-ecology-from-organism-to-ecosystem-theory-and-computation-feer01/

STABLE ISOTOPE MIXING MODELS USING SIAR, SIBER AND MIXSIAR #SIMM Dr. Andrew Parnell, Dr. Andrew Jackson – Date and location to be confirmed

NETWORK ANALYSIS FOR ECOLOGISTS USING R #NTWA Dr. Marco Scotti - Date and location to be confirmed

MODEL BASED MULTIVARIATE ANALYSIS OF ABUNDANCE DATA USING R #MBMV0 Prof David Warton - Date and location to be confirmed

ADVANCED PYTHON FOR BIOLOGISTS #APYB Dr. Martin Jones - Date and location to be confirmed

PHYLOGENETIC DATA ANALYSIS USING R (TBC) #PHYL Dr. Emmanuel Paradis – Date and location to be confirmed

Oliver Hooker PhD. PR statistics

most recent publication - The physiological costs of prey switching reinforce foraging specialization - Journal of animal ecology - http://onlinelibrary.wiley.com/doi/10.1111/1365-2656.12632/full

prstatistics.com

facebook.com/prstatistics/

twitter.com/PRstatistics

groups.google.com/d/forum/pr-statistics-post-course-forum

prstatistics.com/organiser/oliver-hooker/

3/1, 128 Brunswick Street Glasgow G1 1TF

+44 (0) 7966500340