Question: Best Language For Introductory Programming Course From Within An Introduction Course On Bioinformatics.
8
8.2 years ago by
Maastricht, the Netherlands
Andra Waagmeester3.2k wrote:

What language would you recommend to introduce programming to an audience of biology/life science students at a bachelor level?

In our introductory Bioinformatics course we currently use Perl as the teaching language to introduce life science students to the concepts behind programming. Due to a change in the curriculum we need to reassess the structure. Also, the number of students will be increasing to a point where continuing with Perl practical sessions could become too labour intensive.

On Stack Overflow I saw a nearly identical question. There are some really nice answers given there. My question here boils down to: are the answers to the Stack Overflow question also applicable to a course for an audience of bioinformaticians/biologists?

ADD COMMENTlink modified 3.3 years ago by Biostar ♦♦ 20 • written 8.2 years ago by Andra Waagmeester3.2k
3

I'm surprised no one asked you so far: how many hours is the course? That might have a huge impact on the suggestions (teaching proper programming vs. some quick hacks).

ADD REPLYlink written 8.2 years ago by Michael Schubert6.9k
2

Ohh, and the main purpose of this course is not to teach them programming, but to make them understand enough about what programming can do that they will ask somebody for help, or decide to learn, when appropriate.

ADD REPLYlink written 8.2 years ago by Chris Evelo10.0k

Here's a relevant thread from Ask Metafilter, in which I recommend Python: http://ask.metafilter.com/125801/Best-language-for-highschool-bioinformatics-course

ADD REPLYlink written 8.2 years ago by Chris Miller21k

Absolutely, that is in fact why we do Perl now. It is just a very quick introduction. Total contact hours 6, total workload about 20.

ADD REPLYlink written 8.2 years ago by Chris Evelo10.0k

Then you might also want to consider http://www.taverna.org.uk/, which gets you used to the train of thought without any actual programming.

ADD REPLYlink written 8.2 years ago by Michael Schubert6.9k
1

Actually, with Taverna you're doing visual programming. Most people only learn 2 or 3 textual programming languages and they innocently recommend them everywhere.

ADD REPLYlink written 7.0 years ago by veronicaschroeder78110

You should choose the language with the smallest semantic gap for your audience. If you want your students to copy and paste code, then you may teach them any language; if you want to make them think, then choose a language which doesn't bother you. Don't buy problems for free. Choose the language with fewer keywords, less syntactic sugar, fewer unnecessary concepts to learn. Research to find the most human-oriented language.

ADD REPLYlink written 7.0 years ago by veronicaschroeder78110
23
8.2 years ago by
Konrad690
Germany
Konrad690 wrote:

As mentioned by many here and at Stack Overflow, I would recommend Python (often called "executable pseudocode"), especially as a first language, for several reasons:

ADD COMMENTlink written 8.2 years ago by Konrad690
2

Here's a relevant thread from Ask Metafilter, in which I also recommend Python: http://ask.metafilter.com/125801/Best-language-for-highschool-bioinformatics-course

ADD REPLYlink written 8.2 years ago by Chris Miller21k
1

Another big advantage to Python is interactive plotting with matplotlib (http://matplotlib.sourceforge.net/). On top of being just plain useful, incorporating it into a course could be beneficial for students who are visual learners.

ADD REPLYlink written 8.2 years ago by Ryan Dale4.8k

MIT also uses Python for its introductory programming courses. You can find some materials here: http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-00-introduction-to-computer-science-and-programming-fall-2008/ http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-189-a-gentle-introduction-to-programming-using-python-january-iap-2011/

ADD REPLYlink written 8.1 years ago by Nikolay Vyahhi1.2k

Where is the evidence that supporting many programming paradigms is better than supporting just one and supporting it well enough? Even programming in Assembler or Lisp can be fun, and believe me, Lisp is way more used in the scientific community (or don't believe me and search for Lisp in ieeexplore). Besides, most mainstream languages are free and open-source and have thousands of teaching materials, so that is no longer a remarkable point. It's incredible how people buy technology without serious or formal training in computer science.

ADD REPLYlink modified 7.0 years ago • written 7.0 years ago by veronicaschroeder78110

The advantage of having several paradigms in one language is that it makes it easy to present them without switching between different languages. I made the point regarding free/open source because there are still many educational institutions which use MatLab, and students later realize that the license fees are significant (I have real cases in my research environment where people asked me to re-implement programs because they brought old MatLab programs from their previous labs and did not want to buy MatLab licenses from their own budget). Yes, there is Octave, but the compatibility is not 100% when you do more complex stuff. PS: Don't take the "It's fun" statement too seriously. ;) The question was about introductory languages. Easily obtained results make people happy and motivate them to continue - Python definitely offers this. Once they have the basics they can start with LISP, Assembler, LOLCODE, brainfuck or whatever they want and can have a lot of fun with that, too.
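
To make the "several paradigms in one language" point concrete, here is a minimal sketch, assuming a toy task (counting G and C bases in a DNA string). All names (`SEQ`, `gc_count_procedural`, `gc_count_functional`, `Sequence`) are made up for illustration and not taken from any course material.

```python
from functools import reduce

SEQ = "ATGCGCTA"  # hypothetical example sequence


# Procedural style: an explicit loop and a counter that is updated in place.
def gc_count_procedural(seq):
    count = 0
    for base in seq:
        if base in "GC":
            count += 1
    return count


# Functional style: no mutation, the sequence is folded into a single number.
def gc_count_functional(seq):
    return reduce(lambda acc, base: acc + (base in "GC"), seq, 0)


# Object-oriented style: the sequence object carries its own behaviour.
class Sequence:
    def __init__(self, bases):
        self.bases = bases

    def gc_count(self):
        return sum(1 for base in self.bases if base in "GC")


if __name__ == "__main__":
    print(gc_count_procedural(SEQ), gc_count_functional(SEQ), Sequence(SEQ).gc_count())
```

All three versions compute the same number, so a class can compare the styles side by side without installing a second language.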

ADD REPLYlink written 7.0 years ago by Konrad690

It seems you've assumed that just one language can be enough for many paradigms. A paradigm shift involves a language shift (any book on the history of science supports that claim). Expecting one programming language to fit many paradigms is an illusion, sorry for the bad news ;). An introductory language is one made for that purpose, like Pascal, Scheme or Smalltalk, because they include few and clear concepts. Python was a language created for interfacing with the Amoeba OS because its developer wasn't very proficient in Bourne shell scripting. Good point about MatLab by the way.

ADD REPLYlink written 6.9 years ago by veronicaschroeder78110

Well, he solved a personal problem and invented an elegant language (or, better said, heavily modified another one) - fine for me :). And yes, you are right - if you want to dive deeply into a paradigm you really have to switch to a language that really embraces it. But as said - the question was about an introductory language for beginners who want to get stuff done, not about how to fill the curriculum of a computer science student. "Start with Python and keep your mind open" was implied by the bold font of "first language" in my recommendation. (PS: I can highly recommend "Seven Languages in Seven Weeks" by Bruce A. Tate for anyone who likes to get a taste of different paradigms in a very entertaining manner.) But actually Python is a great introductory language that can be properly used in practice (not the case for Pascal IMO; Scheme I have really only used for training purposes so far). Anyway, if you have a different feeling about this topic you can simply propose the languages you mentioned as a solution here. I don't have hard numbers on the educational efficiency of different languages, so maybe others share your opinion.

ADD REPLYlink written 6.9 years ago by Konrad690
17
8.2 years ago by
Agapow270
London, UK
Agapow270 wrote:

There was a lengthy discussion on this at LinkedIn, and I'm going to largely repeat my comments from there. You'll get a lot of different opinions on this because:

  1. it's a religious issue (i.e. comes down a lot to subjective judgements and personal experience),
  2. there are a lot of possible considerations for language choice in bioinformatics courses: teachable to people who aren't just going to be programmers and may not have programmed before, has a lot of useful libraries, has a community behind it, good for quick and dirty / one off scripting solutions, useful for web development, etc., and
  3. what "bioinformatics" means to one person and another can be quite different. I'm a bioinformaticist, you're a computational biologist, you're a genomicist and you just do a few stats ...

So a few thoughts about different languages:

Old school compiled languages, e.g. C/C++: No. Learning curve too high, no good for quick-and-dirty problems, weak in web development. Relatively little bioinformatic work happening here. Not a good place to start.

Java: Lots of libraries and BioJava is pretty damn good. But it's not a great first language, and always feels a bit "heavy" when I'm trying to solve a small problem. Still, I expect to see a lot of development in this area with the JVM-enabled languages like Jython, JRuby, Groovy, where you can script and still use the Java libraries. Not for novices.

Perl: was the undisputed choice for bioinformatics 10 years ago but that lead has evaporated. Quirky, opaque and write once. The whole Perl 6 morass doesn't help. I think you can do better. Still, there's a lot of code here and a lot of the older significant tools are written in this (e.g. GBrowse etc.)

Ruby: I've got a love-hate relationship with Ruby. There's a lot of Good Stuff there, and the web development is excellent. People seem to like learning Ruby too. But there are a few quirks in the language and BioRuby is still a work in progress. Still, a lot of enthusiasm here.

Python: this is where the weight of attention is. BioPython has really come along in the last few years and many of the newer, excellent tools (e.g. Galaxy) are written in it. Easy to learn, kind to beginners, big community, good scientific computing support (IPython, NumPy, etc.). There's an odd aspect or two I wish was developed more (I'd really like anonymous closures and better functional programming) but you couldn't go wrong here.

Javascript: many people rave about what a great language JS is, and there are occasional feints at doing bioinformatics in it. But while you _can_ do work in it, _should_ you? Nope.

R: A lot of ecologists & mathematical biologists use R, and it's got graphics & visualization to die for. The IDE is great for beginners as well, allowing packages to easily be installed locally. I confess to a bit of a blindspot with R (some of the syntax is a bit weird), but this could be the right choice for the right group of students.

ADD COMMENTlink modified 8.2 years ago • written 8.2 years ago by Agapow270
12
8.2 years ago by
Lyco2.3k
Germany
Lyco2.3k wrote:

I am sure that mine will be a minority opinion, but alas, I am a biologist myself and therefore see this question from a different angle. In my experience, biologists and related life scientists will need programming languages mainly for scripting (e.g. writing command pipelines), and for processing large amounts of textual or numerical data.

I would really recommend sticking with PERL, as I consider it most accessible for non-CS people. In my opinion, the major advantage of PERL is that you can avoid object-oriented programming. (No mistake, OOP is a very powerful concept for professionals, but I don't think that a biologist should be bothered with it.) PERL is also very powerful for text processing and has very complete support for regular expressions. This might not be the tool of choice for the people hanging out at BioStar, but for the average biologist things look quite different.

I have seen R being recommended here. I definitely recommend teaching R to biology students, but I never quite got around the concept of R as a 'general programming language'. I would rather recommend teaching students how to run R procedures from another programming language (e.g. PERL).

ADD COMMENTlink written 8.2 years ago by Lyco2.3k
1

+1 for including perl (it is a valid option, even though it might introduce bad habits), -1 for excluding R as a general programming language ;) of course it is, just the fact that you didn't get 'around the concept' doesn't make it less usable, and yes, it is a full blown programming language (if you still don't believe that you might have to read up on the theoretical background of programming languages), it's just that it is more appropriate or easier to get to a solution for certain types of problems, but that is true for any language anyway.

ADD REPLYlink written 8.2 years ago by Michael Dondrup46k
1

Michael: of course R is Turing complete, but using it for general-purpose programming is just awkward.

ADD REPLYlink written 8.2 years ago by Michael Schubert6.9k

Agreed. Never thought R was a 'general programming language', for me, it's just a tool for statistical analyses.

ADD REPLYlink written 8.2 years ago by Vitis2.2k

+1. R as a general-purpose language is horrible. Don't agree that Perl is a good choice though for beginners (too many implicit commands).

ADD REPLYlink written 8.2 years ago by Michael Schubert6.9k

I agree with Michael S. R is very powerful for statistics and plotting. I encourage biologists to learn it. As a programming language, R is bad but still acceptable. However, when we come to the implementation, the official one is easily the least efficient. Use R where it has strength and never take R as a serious general-purpose programming language.

ADD REPLYlink written 8.1 years ago by lh331k
7
8.2 years ago by
lh331k
United States
lh331k wrote:

From all I have seen so far, biologists mostly need programming for large-scale text processing and for doing simple statistics on large data sets. I think a combination of Python and R is the best for them. If you have to choose one language, then choose Python, and you can teach your students how to use the existing modules (numpy?) to do statistics. The problem with R is that it is frequently awkward for text processing and for handling huge data sets (over 10GB, for example), while Python does not have this problem and you can still use Python to do most of the basic things R can do.

PS: Personally I know little about Python and think R, as a programming language, is implemented very badly (actually the worst). I like Javascript and Lua more these days, but for biologists, Python+R should suit them much better. MatLab is better than R as a programming language IMHO, but it is not free and perhaps lacks the rich packages in R. Perl is still a decent choice even today. Some advanced modules only exist in Perl (though the same may be true for Python; I do not know).
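
As a rough illustration of the "text processing plus simple statistics" workflow, here is a minimal sketch. It assumes a made-up tab-separated input (gene name, then a read count) and uses the standard-library `statistics` module rather than numpy, purely so the example runs without extra installs; the names `EXAMPLE`, `column_values` and `summarize` are hypothetical.

```python
import io
import statistics

# Hypothetical tab-separated input: gene name, then a read count.
EXAMPLE = io.StringIO("geneA\t10\ngeneB\t20\ngeneC\t30\n")


def column_values(handle, column=1):
    """Yield one numeric column, reading the input line by line."""
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        yield float(fields[column])


def summarize(handle):
    """Return (mean, sample standard deviation) of the chosen column."""
    values = list(column_values(handle))
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    mean, sd = summarize(EXAMPLE)
    print(f"mean={mean:.1f} sd={sd:.1f}")
```

The same `summarize` function would accept an open file instead of the in-memory example, since it only iterates over lines.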

ADD COMMENTlink written 8.2 years ago by lh331k
6
8.2 years ago by
United States
Aleksandr Levchuk3.2k wrote:

Ruby for the first time and for all time. Pick up other languages as you need them.

ADD COMMENTlink modified 8.2 years ago • written 8.2 years ago by Aleksandr Levchuk3.2k

I'm on the Ruby train as well. Not a lot of love so far from the other answers, but 1) it's super quick to pick up the basics (created to make programmers happy - and it does!) 2) powerful text manipulation - it does what Perl does, but with fewer gotchas, less memorization, and less ugliness (IMHO) 3) documentation is plentiful 4) you can move on to using Rails if you want to build database-driven websites 5) tons of gems and options for integrating with 3rd party tools (an important skill to learn).

ADD REPLYlink written 8.2 years ago by Jim Vallandingham340

I started on ruby a few years ago, and loved it. I'd switched to perl, java, and R for a job. Recently I came back to ruby and realized how many wonderful things there were, that are totally missing from other scripting languages like python and perl.

ADD REPLYlink written 8.2 years ago by Burlappsack660

Thanks for the upvotes for Ruby. I also use R, Bash, GNU programs, Python, and SQL extensively and consider myself an expert at those. In addition I have multiple years of experience in each of: JS, PHP, C, C++, VB, C#. And some experience in: Java, Perl, and VHDL. Among all of those, Ruby, R, the GNU tools, and Bash really stand out. Bash - pipelining/streaming as the OS. GNU tools do wonders in numerous text processing niches. R - reproducible research and condensed scientific intelligence in every line. But Ruby is the most concise and beautiful in general - I use it to connect the components.

ADD REPLYlink written 8.2 years ago by Aleksandr Levchuk3.2k

2
8.2 years ago by
Benm710
Benm710 wrote:

I also think PERL, Python and R are the most useful languages for bioinformatics: they are easy to study, and they have powerful modules and packages such as CPAN, CRAN, BioPerl, BioPython (numpy, scipy), so that you can flexibly use them to deal with your tons of biological data.

For learning bioinformatics and data analysis or programming, I recommend these few books for beginners:

  • Python Scripting for Computational Science
  • Bioinformatics Programming Using Python
  • Bioperl Course
  • Python course in Bioinformatics
  • Python for Bioinformatics
  • Bioinformatics Biocomputing and Perl
  • GENOMIC PERL-From Bioinformatics Basics to Working Code
  • Mastering Perl For Bioinformatics
  • Applied Statistics for Bioinformatics using R
  • Statistics Using R with Biological Examples
ADD COMMENTlink modified 8.2 years ago • written 8.2 years ago by Benm710
1

if possible, a delimiter for your book titles would be helpful.

ADD REPLYlink written 8.2 years ago by Aaronquinlan11k

made it a list :)

ADD REPLYlink written 8.2 years ago by Michael Schubert6.9k

I can share them via Dropbox, do you have a Dropbox account?

ADD REPLYlink written 8.2 years ago by Benm710

thank you Michael

ADD REPLYlink written 8.2 years ago by Benm710
1
8.2 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum122k wrote:
  • A simple sql language ? storing +querying data using sqlite or extracting data from the ucsc mysql server ?
  • Don't require any tool but a browser: javascript , xslt
  • using a document oriented database (couchdb , neo4j ...)
  • simple unix command lines
  • ...
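
A minimal sketch of the first suggestion (storing and querying data with sqlite), assuming a made-up `genes` table. The SQL is what students would actually be learning; Python's standard `sqlite3` module is used here only as a convenient way to run it, and the table, column and function names are hypothetical.

```python
import sqlite3


def build_example_db():
    """Create an in-memory database with a small, invented gene table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE genes (name TEXT, chrom TEXT, length INTEGER)")
    conn.executemany(
        "INSERT INTO genes VALUES (?, ?, ?)",
        [("geneA", "chr1", 1200), ("geneB", "chr1", 800), ("geneC", "chr2", 1500)],
    )
    return conn


def genes_per_chromosome(conn):
    """Count genes and average their length for each chromosome."""
    rows = conn.execute(
        "SELECT chrom, COUNT(*), AVG(length) FROM genes "
        "GROUP BY chrom ORDER BY chrom"
    )
    return rows.fetchall()


if __name__ == "__main__":
    for chrom, n_genes, mean_length in genes_per_chromosome(build_example_db()):
        print(chrom, n_genes, mean_length)
```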
ADD COMMENTlink written 8.2 years ago by Pierre Lindenbaum122k
1

You seem to be suggesting that they teach sql AND javascript. Plus, depending on the context, teaching bioinformatics requires much more than just databases.

ADD REPLYlink written 8.2 years ago by Byron Smith30

No, what I wanted to say is that those solutions are cheap and simple to teach.

ADD REPLYlink written 8.2 years ago by Pierre Lindenbaum122k
1
8.2 years ago by
Boboppie530
Cambridge, UK
Boboppie530 wrote:

Python is undoubtedly an ideal language for teaching basic programming. It's very elegantly designed and easy to read. I personally think Perl is still The language for biological science due to its large collection of libs/modules (Python is growing rapidly in this respect). But Perl is also evolving; Perl 6 might bring us some surprises. For now, though, I'd recommend Python if only one language needs to be chosen.

ADD COMMENTlink written 8.2 years ago by Boboppie530
1
8.2 years ago by
Dave Lunt2.0k
Hull, UK
Dave Lunt2.0k wrote:

It is very easy to see this from the point of view of bioinformaticians but as Lyco says here "[Perl] This might not be the tool of choice for the people hanging out at BioStar, but for the average biologist things look quite different" -this was an excellent point.

I see the benefits of both Perl and Python but (a) Perl books are MUCH better for the biologist audience, (b) most of the advantages of Python just don't exist for a biologist learning to write a simple script, and (c) most biologists Google their problem to find code snippets, and there are more solutions in Perl (at least for the problems I Google).

Now if you were training graduate students who needed to build programming skills you might argue this another way but Perl is a great choice here.

ADD COMMENTlink written 8.2 years ago by Dave Lunt2.0k
1
8.2 years ago by
brentp23k
Salt Lake City, UT
brentp23k wrote:

There is no "best" language. But, since it's not been mentioned, I'd add that awk is a very good choice.

One can become very efficient in awk without writing much code as there are implicit loops over each line of input. This makes it very simple, even for a beginning programmer, to do useful stuff. In addition, it's tied closely to the shell--which is another language they'll eventually want to learn--so things like reading from stdin and writing to stdout will become more familiar.
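
To show what the implicit loop saves, here is a minimal sketch of the same idea written out explicitly in Python. It assumes whitespace-separated input with start and end coordinates in the second and third fields (an invented layout), where the awk version would be roughly `awk '$3 - $2 > 1000'`. The function name `print_long_features` is hypothetical.

```python
import io
import sys


def print_long_features(lines, out, min_length=1000):
    """Explicit version of the per-line loop that awk runs implicitly."""
    for line in lines:                 # awk loops over input lines for you
        fields = line.split()          # awk splits each line into $1, $2, ...
        if len(fields) < 3:
            continue                   # skip blank or short lines
        if int(fields[2]) - int(fields[1]) > min_length:  # the awk pattern
            out.write(line)            # the default awk action: print the line


if __name__ == "__main__":
    # Reads from stdin and writes to stdout, like a filter in a shell pipeline.
    print_long_features(sys.stdin, sys.stdout)
```

Everything except the comparison is scaffolding that awk provides by default, which is why short awk programs can do useful work.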

Plus, many of the skills/conventions one learns in awk will translate (so to speak) well to other languages.

ADD COMMENTlink written 8.2 years ago by brentp23k
0
8.2 years ago by
Biogeek170
Biogeek170 wrote:

Yes, functional languages - in particular those that they can start doing right away, seem to be the most successful. JavaScript is the most obvious option as most life scientists/biologists have come across it even if just for playing with CSS/HTML.

ADD COMMENTlink written 8.2 years ago by Biogeek170
0
8.2 years ago by
Will4.5k
United States
Will4.5k wrote:

I teach the exact course you're describing here at Drexel. I would suggest either Matlab (if your school already has the licenses) or Python. Both languages abstract away the nitty gritty of data structures, which really helps to speed up the teaching of computational biology rather than computer science.

ADD COMMENTlink written 8.2 years ago by Will4.5k
0
8.2 years ago by
Pasta1.3k
Switzerland
Pasta1.3k wrote:

No one mentioned PHP here. Of course, this language is perfect for web development and can easily interact with databases. But what people forget is that PHP is also a scripting language that you can launch from a console. When I was a biochemistry student, it was the first programming language I learnt and I found it pretty easy to learn, especially compared to Perl...

It is a really easy language to write programs with, you can also do regex and use the BioPHP libraries.

If you need a good alternative and you are not afraid to swim against mainstream, go PHP !

ADD COMMENTlink written 8.2 years ago by Pasta1.3k
0
8.2 years ago by
Aurobhima100
University of Birmingham
Aurobhima100 wrote:

I can only speak from personal experience. I entered the world of bioinformatics with limited programming experience. I have worked exclusively with Python for the last two and a half years and found it to be one of the most enjoyable and easy to use languages I have ever encountered (the others being C, C++, Java and Pascal). It also comes with a great community to get help from when you are in trouble.

Python was designed to teach people to program and encourages good programming habits. BioPython also offers a lot of useful Bioinformatics tools, apparently not as extensive as BioPerl, but still very useful.

ADD COMMENTlink written 8.2 years ago by Aurobhima100
0
8.2 years ago by
Tiffani150
St. Louis
Tiffani150 wrote:

I would say perl and ruby are also your best bet.

ADD COMMENTlink written 8.2 years ago by Tiffani150