Because bioinformatics is carving out an increasingly important place in research and because we have to help students to understand their future role in research, a simple but complex question came to mind: Who qualifies to be a bioinformatician?
Interesting paper that raises some worthwhile issues. The trouble defining bioinformatics itself is an indication of how complex and interconnected this field of science is.
The aspect that I disagree with has to do with the authors trying a little too hard to define what bioinformatics is not. These attempts are almost never appropriate. The article seems to imply the following:
Buddy, you just maintain the BLAST server, that's not real bioinformatics. How about curating some of the data? But what if this person optimized BLAST to perform ideally for the jobs in question, or uses interesting parameters to make BLAST operate in ways no one thought of before?
Hey Mister, you just know how to run an RNA-Seq pipeline, that's not bioinformatics. How about writing some code? But what if the person knows how to pick the right tools for the data, knows exactly how to tune the search, and deeply understands the effect of genomic features and structure on the results?
I am actually biased here because of a personal experience: a few years ago a faculty member told me that what I do is not "real bioinformatics" ...
I would call a bioinformatician any person who works on biological problems and whose primary focus is computational analysis rather than the acquisition of raw data. I also believe there are three kinds of bioinformatics tasks, all equally important for research:
1. Perform an in-depth analysis for a specific problem, e.g. check RNA-Seq data for splicing patterns of genes X and Y in a gene Z knockout mouse model. This can be quite laborious, but it is usually done in direct contact with biologists, so your findings can be easily validated.
2. Develop algorithms and software tools that the bioinformatics community can use to perform a certain type of analysis. The main difficulty here is to develop a tool that is stable enough to be applied to a wide range of datasets and that others can easily understand and adapt.
3. Set up and maintain pipelines, servers, etc. For me this is one of the most difficult tasks, as it involves helping people solve type #1 tasks and dealing with the problems of adapting software tools developed by others (#2), problems that will always remain no matter how highly skilled you are. Moreover, even a simple RNA-Seq analysis can turn into a disaster if one has to analyze 1000 HiSeq runs or gets a very important dataset that has lots of chimeras, strange adapter incorporations and other nasty stuff.