Question

Forum:Advice For Newcomers To The Bioinformatics Field

40

10.8 years ago

Medhat 9.7k

What advice would you give to newcomers to the bioinformatics field? What piece of information would have made your life much easier if someone had told you at the beginning of your career in bioinformatics, other than having a biostars account :)

advice career • 18k views

ADD COMMENT • link updated 17 months ago by Ram 43k • written 10.8 years ago by Medhat 9.7k

1

related: "I want to learn bioinformatics! A guide for complete beginners." by Nick Loman http://pathogenomics.bham.ac.uk/blog/2013/07/i-want-to-learn-bioinformatics-a-guide-for-complete-beginners/

ADD REPLY • link 10.8 years ago by Pierre Lindenbaum 161k

score 31 · Answer 1 · 2013-07-04

There were a few difficulties that I had when I first joined the bioinformatics field that I would classify as cultural or psychological:

Sometimes, there is no tool that does what you want. Although you might think it's basic and obvious and of course somebody else must have done this before, sometimes there just isn't a tool that does what you want. Either it's so simple that everybody always reinvents the wheel since it's just 5 minutes of programming, or each lab has its own version which seemed too unimportant to publish, or the approach that seems obvious from the point of view of your research really is special to just your research (i.e., relatively simple to do, but not needed by most). The solution is to google for a few days to get a sense of what's out there. Ask a question on biostars, including what you have found so far. Accept that if the community says there is no such tool, there probably isn't. When I was new to bioinformatics, it was hard to say to my wet-lab colleagues that there was no simple tool that did the kind of analysis they wanted, and that I would have to write something myself. Quite often they would bring me a paper where the analysis had been done, implying the solution must be out there somewhere if I googled hard enough.
Sometimes, there are many, many tools that do what you want and there is no hierarchy or rating. This can be frustrating. Sometimes, you are lucky and there is a ranking of tools in a review paper. Quite often, this review will be a bit out-of-date. The solution is to pick a few that sound good and try them out. Look at how they work with your data. If you are lucky, they give very similar answers. In that case you might go with the one that is most widely used in the journals you want to publish in and is easiest to use. If they give widely different answers, you might have to keep looking at the results of each tool until you have a sense of which one produces believable results in terms of the biology. You might have to do a few experiments as well. Try to come up with bioinformatics tests as well. Eventually, if you work at this long enough, you will get a sense of which tool seems to work best for your data.
Many questions depend highly on your data, and so there is no clear answer to what might seem like an obvious question. I work a lot with sequencing and there are many known biases to sequencing, for instance PCR bias. So, if you tell me that you see anomalies in the GC content of your data, I can immediately say that sometimes there is PCR bias. However, if you ask me what went wrong with your particular sample, it's hard. The amount of detective work involved ends up being almost as much work as a full research project.
Many tools will require huge amounts of work to install and run. If you are used to the type of programs that one encounters as a casual OS X or Windows user, then you are used to downloading software, clicking install and having things work out. However, many of those tools are vastly more polished than the software you will encounter as someone new to bioinformatics. Now installing software might involve multiple steps, including installing other software. So you will need to accept the possibility that installing the software you need to do an analysis might take as long as, or longer than, the analysis itself.
Things will often take a long time to run. Find things to do while your software is running. Read journal articles, answer emails, discuss things with colleagues. Accept that your life probably involves a lot of waiting now.
You will have to be proactive about software support. If the software you are using doesn't work then you might have to email authors, join mailing lists, ask questions on biostars and dig through forums to get the answers. I once had a bit of software that took me more than a year to figure out how to use. I emailed the authors several times and asked questions on biostars and still it took a while.
Sometimes you have to get answers in weird places. I never feel very happy about telling my non-computational colleagues that I found some information on how to use a tool in an online forum like 'biostars.org' or 'seqanswers.com'. It doesn't feel like a very solid position to take. It's even worse if the person who wrote the answer is somebody with a name like 'burt5000'. You have to take these moments in stride and test what you have been told. You can only do your best.
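
On the point above about trying several tools and seeing whether they give similar answers, here is a minimal Python sketch of one way to put a number on "similar". It assumes each tool's output can be reduced to a set of identifiers (here gene symbols); the gene lists and the function name `jaccard` are made up for illustration and are not from any particular tool.

```python
def jaccard(a, b):
    """Return the Jaccard index |a & b| / |a | b| of two collections."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # two empty result sets agree trivially
    return len(set_a & set_b) / len(union)


# Hypothetical gene lists reported by two tools run on the same data.
tool_a_hits = ["BRCA1", "TP53", "MYC", "EGFR"]
tool_b_hits = ["TP53", "MYC", "EGFR", "KRAS"]

agreement = jaccard(tool_a_hits, tool_b_hits)
only_a = sorted(set(tool_a_hits) - set(tool_b_hits))
only_b = sorted(set(tool_b_hits) - set(tool_a_hits))
print(f"agreement={agreement:.2f} only_a={only_a} only_b={only_b}")
```

A high overlap does not prove either tool is right, but the items that only one tool reports are usually a good place to start looking at the biology.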

score 28 · Answer 2 · 2013-07-04

10.8 years ago

Pierre Lindenbaum 161k

Install linux

Learn

the fundamentals of Biology
command line
bash
make
the NCBI E-Utils/Biomart

forget about

windows
GUIs
microsoft excel
microsoft excel
did I mention microsoft excel ?

later, learn:

a scripted language
a compiled language
an RCS (revision control system)
put all your new knowledge in a blog

EDIT 2020:

learn nextflow and/or snakemake
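
For the NCBI E-Utils item in the list above, a small Python sketch of what a query looks like may help. It only builds an ESearch URL and does not send it; the helper name `esearch_url` and the example search term are assumptions for illustration, and the NCBI documentation should be checked for usage limits and the full parameter list.

```python
from urllib.parse import urlencode

# Base address of the NCBI Entrez Programming Utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an ESearch URL; fetching it is left to the caller."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Hypothetical query: PubMed records with "bioinformatics" in the title.
url = esearch_url("pubmed", "bioinformatics[Title]", retmax=5)
print(url)
```

The returned URL can be opened in a browser or fetched from a script, which makes it easy to see what the service returns before writing anything more elaborate.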

ADD COMMENT • link 3.6 years ago by Pierre Lindenbaum 161k

2

Yes, you did mention Microsoft :)

ADD REPLY • link 5.8 years ago by Medhat 9.7k

1

I couldn't agree more with regard to Microsoft Excel. I saved output from wANNOVAR in Excel format and later discovered that a gene name had been changed to a date. Thanks, Excel.

ADD REPLY • link 10.8 years ago by Matthieu Miossec ▴ 370
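
The gene-name-to-date problem described in the comment above can be spot-checked with a few lines of Python. This is a rough sketch that assumes the damaged values look like "1-Sep" or "2-Mar" (the form a spreadsheet may show after converting symbols such as SEPT1 or MARCH2); the function name and the sample column are made up, and other date formats would need their own patterns.

```python
import re

# Matches values such as "1-Sep" or "2-Mar", i.e. a day number followed
# by a three-letter month abbreviation.
DATE_LIKE = re.compile(
    r"^\d{1,2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$",
    re.IGNORECASE,
)


def find_date_like(values):
    """Return the values that look like day-month dates, not gene symbols."""
    return [v for v in values if DATE_LIKE.match(v.strip())]


# Hypothetical gene column read back from a spreadsheet export.
gene_column = ["TP53", "1-Sep", "BRCA1", "2-Mar", "DEC1"]
suspicious = find_date_like(gene_column)
print(suspicious)
```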

0

For the sake of discussion, it would be good to define some of the terms you use. For example, what is a 'Microsoft' anyways? :P

ADD REPLY • link 10.8 years ago by Eric Normandeau 11k

0

As far as I can understand, Pierre is talking about Microsoft Excel and the Windows operating system.

ADD REPLY • link 10.8 years ago by Medhat 9.7k

0

I was just joking ;)

ADD REPLY • link 10.8 years ago by Eric Normandeau 11k

score 18 · Answer 3 · 2013-07-04

10.8 years ago

Damian Kao 16k

Use a naming convention for your files.
Never assume anything works 100% correctly out of the box. Always spot check after you run a script/software package. I am still learning this...
Don't get too caught up with the methods and forget the question.
Know your file formats.
Start a blog. You'll find describing what you are doing to an internet audience will allow you to see holes in your work. Also a great way to store code.
There is no perfect data. Sometimes you just have to accept that no amount of massaging the data will make your analysis that much better.
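
As an illustration of the "spot check" and "know your file formats" points above, here is a minimal Python sketch that sanity-checks FASTQ text. It assumes the common four-lines-per-record layout (no wrapped sequences), and the function name `check_fastq` and the sample reads are made up for the example.

```python
def check_fastq(lines):
    """Return a list of problems found in FASTQ text given as lines."""
    lines = [line.rstrip("\n") for line in lines]
    problems = []
    if len(lines) % 4 != 0:
        problems.append(f"line count {len(lines)} is not a multiple of 4")
    # Only walk over complete four-line records.
    for i in range(0, len(lines) - len(lines) % 4, 4):
        header, seq, plus, qual = lines[i:i + 4]
        record = i // 4 + 1
        if not header.startswith("@"):
            problems.append(f"record {record}: header does not start with '@'")
        if not plus.startswith("+"):
            problems.append(f"record {record}: separator does not start with '+'")
        if len(seq) != len(qual):
            problems.append(f"record {record}: sequence and quality lengths differ")
    return problems


# Hypothetical reads: the second one has a truncated quality string.
good = ["@read1", "ACGT", "+", "IIII"]
bad = ["@read2", "ACGT", "+", "III"]
print(check_fastq(good))
print(check_fastq(good + bad))
```

A check like this takes seconds to run after each step and catches truncated or corrupted files before they reach the analysis.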

ADD COMMENT • link 10.8 years ago by Damian Kao 16k

2

The perfect point:

Don't get too caught up with the methods and forget the question.

ADD REPLY • link updated 17 months ago by Ram 43k • written 8.8 years ago by Medhat 9.7k

score 13 · Answer 4 · 2013-07-04

10.8 years ago

zam.iqbal.genome ★ 1.8k

Learn how to do things in Unix (if you can't already). Become familiar with the filesystem, shell scripts, downloading, unzipping, installing, and compiling.

All these things seem obvious once you've been in the field a while, but they are a barrier to people - once overcome, many things are so much easier!

ADD COMMENT • link updated 17 months ago by Ram 43k • written 10.8 years ago by zam.iqbal.genome ★ 1.8k

score 9 · Answer 5 · 2014-04-28

10.0 years ago

Christian ★ 3.0k

I recently gave a talk at my former university titled "How to be a bioinformatician". You might find some useful slides in here: http://www.slideshare.net/ChristianFrech/how-to-be-a-bioinformatician

ADD COMMENT • link updated 17 months ago by Ram 43k • written 10.0 years ago by Christian ★ 3.0k

0

Thanks a lot :) I wish I could give you more than +1

ADD REPLY • link 10.0 years ago by Medhat 9.7k

score 8 · Answer 6 · 2013-07-04

10.8 years ago

Maxime ▴ 80

Another point that I think was forgotten (even if it's in fact related to the blog idea):

Be social: talk to people, talk to biologists, talk to computer scientists. The more points of view you get, the better your own understanding will be.

ADD COMMENT • link updated 17 months ago by Ram 43k • written 10.8 years ago by Maxime ▴ 80

3

funnily, as a professional asocial, I've upvoted your answer :-)

ADD REPLY • link 10.8 years ago by Pierre Lindenbaum 161k

score 5 · Answer 7 · 2013-07-04

10.8 years ago

Ben ★ 2.0k

In addition to the great answers already given:

Set up an RSS reader with some relevant PubMed search terms/authors, some journal/preprint feeds, and relevant blogs (Google Reader, we hardly knew ye)
Version control everything, not just production scripts but also LaTeX documents, SVGs ..., try to use meaningful commit messages even in private repos
Just because you learnt (e.g.) Python first, doesn't mean you should automatically use matplotlib rather than taking the time to learn R and ggplot2
Master a text editor (preferably Emacs)

ADD COMMENT • link updated 17 months ago by Ram 43k • written 10.8 years ago by Ben ★ 2.0k

2

I'm having a hard time resisting the temptation to edit 'Emacs' for 'vi' :)

ADD REPLY • link 10.8 years ago by Eric Normandeau 11k

3

My advice: don't get caught up in "software X versus software Y" arguments :)

ADD REPLY • link 10.8 years ago by Neilfws 49k

score 1 · Answer 8 · 2017-01-18

7.3 years ago

Eslam Samir ▴ 110

I would like to thank you all for this valuable information. After working in this field, I would like to add: please try to watch some MOOCs, but you should also do the work yourself.

After about 3 months I was able to write a program myself, which I shared for the public benefit.

Good Luck all

Sequence Database curator.

image: illustration of both approaches

ADD COMMENT • link updated 17 months ago by Ram 43k • written 7.3 years ago by Eslam Samir ▴ 110

score 0 · Answer 9 · 2018-07-15

5.8 years ago

hd1fernando • 0

Maybe you can find your answer here: The Biostar Handbook. A bioinformatics e-book for beginners.

ADD COMMENT • link updated 17 months ago by Ram 43k • written 5.8 years ago by hd1fernando • 0

0

This question is 5 years old. It is not about reading a book; it is about good practice and the things you should learn or do that can't be found in books :)

ADD REPLY • link 5.8 years ago by Medhat 9.7k