Daniel Zerbino is a project leader at Ensembl Regulatory Resources, a comprehensive online resource that aims to compile and categorize the information underlying the mechanisms of gene regulation in human and mouse cells.
Before joining Ensembl, Dr. Zerbino developed Velvet, one of the first de Bruijn graph-based assemblers, which made genome assembly possible with next-generation sequencing data. While genome assemblers did exist before it, Velvet (released in 2008) was the first tool that made assembly really easy: "it just worked". It had no unexpected requirements, used simple data formats, and required very little background knowledge to run. The mental model a user needed to understand how to tune the assembly was very simple: enumerate subsequences of a certain length with one program, velveth, then run the graph assembler with another tool, velvetg. Even by today's standards, its simplicity and ease of installation make Velvet one of the most accessible tools in a bioinformatician's arsenal.
We asked Dr. Zerbino how he got started in bioinformatics:
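To make that mental model concrete, here is a minimal Python sketch of the two ideas: listing every subsequence of length k (a k-mer) in a read, then linking overlapping k-mers into a de Bruijn graph. It is a textbook toy, not Velvet's actual implementation; the function names `enumerate_kmers` and `build_de_bruijn_graph` and the example reads are made up for illustration, and real assemblers additionally handle reverse complements, sequencing errors and coverage.

```python
from collections import defaultdict


def enumerate_kmers(read, k):
    """Return every length-k subsequence (k-mer) of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_de_bruijn_graph(reads, k):
    """Link each k-mer's (k-1)-base prefix to its (k-1)-base suffix.

    Hypothetical helper for illustration only: nodes are (k-1)-mers and
    every k-mer seen in the reads contributes one directed edge.
    """
    graph = defaultdict(set)
    for read in reads:
        for kmer in enumerate_kmers(read, k):
            graph[kmer[:-1]].add(kmer[1:])
    return graph


if __name__ == "__main__":
    # Two made-up overlapping reads.
    reads = ["ACGTAC", "CGTACG"]
    graph = build_de_bruijn_graph(reads, k=4)
    for node in sorted(graph):
        print(node, "->", sorted(graph[node]))
```

Choosing k is the main tuning decision in this picture: a larger k resolves more repeats but needs longer overlaps between reads, which is why a user only had to reason about one number to adjust an assembly.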
"I studied maths and general engineering in France at the Ecole Polytechnique and the Ecole des Mines in Paris, where curiosity drew me to bioinformatics, so I did a couple of short internships: protein structure at the Universidad Autónoma in Madrid, pharmacokinetics at a spin off at the Institute Pasteur in Paris.
As I was coming to the end of my degree, and having to decide what to do with my life, my flatmate was applying to the EMBL PhD program, so I looked into it, stumbled upon EMBL-EBI's program and sent in an application. I was hired by Ewan Birney, who was obsessed by de Bruijn graphs at the time."
Daniel Zerbino of Velvet
What hardware do you use?
I've grown to be quite fond of my MacBook Air, which I use to SSH into Linux machines at the Sanger Institute or EMBL-EBI.
What is your text editor?
Vim, is there any other? ;-)
What software do you use for your work?
Lots of quick prototypes with Samtools, Bedtools and awk. Now that I work with the Ensembl team, I use MySQL, BioMart and the Ensembl API on a regular basis.
What do you use to create plots and charts?
R by habit, although I'm moving steadily towards Matplotlib. For day-to-day data exploration, I have a collection of command-line ASCII plotting tools (ascii_plots on GitHub, if you're curious).
What do you consider the best language to do bioinformatics with?
Depends on the task and preexisting code, really. Given the choice, I'm a C and Python programmer.
What bioinformatics tools/software do not get enough recognition?
I'm obviously biased here, but the reference databases and archives maintained in institutes such as EMBL-EBI or NCBI.
It's easy to forget all the work and effort that went into producing, curating and updating data that you can download or query in a matter of minutes.
See all posts in this series: https://www.biostars.org/t/uses-this/
To be notified of new posts in the series, follow the first post: Jim Robinson of the Integrative Genomics Viewer (IGV) uses this