Project For A Beginner Bioinformatics Student
9
31
11.0 years ago
And ▴ 230

I was wondering if someone could suggest an interesting coding project for a beginner bioinformatics student. I am actually a senior computer science major, so my programming skills are pretty good. I have been reading articles here and there but would like to do some hands-on projects. I'm just not sure where to start! Any advice is appreciated.

project java • 68k views
9

This is a question I hear a lot in workshops, actually. There are also folks out there teaching who could use a repository of small tasks that need doing and that they could assign to students. If anyone knows of a resource like this, one that matches small projects with trainees, I'd be interested too.

0

I'd be very interested in a repository like Mary suggests as well...

0

Thank you everybody for the great answers!

16
11.0 years ago

We have some projects (not all are coding) here http://www.bigcat.unimaas.nl/wiki/index.php/Student_projects.

For coding projects, you might also want to check out larger open source projects like Cytoscape and PathVisio. For PathVisio, I know that there is a list of possible smaller and larger coding projects collected on the bug tracker.

1

+1 for the student project wiki

9
10.3 years ago
Nikolay Vyahhi ★ 1.3k

We're developing such a resource, called Rosalind (http://rosalind.info): a platform for learning bioinformatics through problem solving.

It is currently in beta and we're improving it every day, so we will be happy to hear any feedback.

1

Great idea! Looking forward to seeing how things turn out...

6
11.0 years ago
Andrew Su 4.9k

First, a general answer: To find a good bioinformatics project, it really helps to be working directly with a card-carrying bioinformatician. That person can be an invaluable adviser for picking an interesting and tractable project that may have real-world applications, and also for identifying the general approach for attacking that problem. If you seek out a molecular biologist who may have bioinformatics needs, it's not clear that you get either of those benefits. So, seek out a bioinformatics mentor!

Second, a selfish answer: Do a project with me! My group builds all sorts of biomedical tools (like BioGPS and the Gene Wiki). We've also recently started experimenting with building games (e.g., Dizeez) to structure biological knowledge. We also have projects focused on data mining in large data sets. There are a million internship-style projects I can think of, so this is an open invitation for any interested student to contact us. There will often be the possibility of being a joint author on a publication! </shameless_plug>

0
I want to do a project with you. Can you please give me your contact info?
0

@Andrew

I am a computer science major looking for a project for my algorithms class. I am really interested in using crowdsourcing to solve a biological problem via games and similar approaches. Let me know if you are still looking for someone to help with a project. It has to have some kind of algorithmic application so that I can present it in my class too.

0

I am interested in doing an internship-based project. Please give me your contact information.

2
11.0 years ago
Jake Mick ▴ 50

There was a discussion about creating a Project Mendel in another post. The goal would be to produce a set of problems like Project Euler, except that the problems are biologically relevant. What I would like to see in a project: no use of Bio* libraries, I/O considerations, mining of actual datasets, and multiple valid answers but one best practice.

Problem 1) Here is a file format specification, implement a parser using only built-ins.
...
Problem 30) Develop a scoring metric for the comparison of phylogenetic trees to their manually constructed trees.
Problem 31) Develop a set of unsupervised tools to construct phylogenetic trees and evaluate your work.
...
Problem 100) Predict x with a Gini coefficient of at least 0.25.
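
To give a feel for what a Problem 1 answer might look like, here is a minimal sketch of a parser written with built-ins only. The post does not say which file format is meant, so FASTA is used purely as an assumed example, and the function name `parse_fasta` is hypothetical.

```python
def parse_fasta(lines):
    """Yield (header, sequence) tuples from an iterable of FASTA lines.

    Assumption: FASTA stands in for the unspecified format in Problem 1.
    Sequence lines that appear before the first '>' header are ignored.
    """
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                # a new record starts, so emit the previous one
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)  # emit the last record


# Toy input, made up for illustration only.
example = """>seq1 toy record
ACGT
ACGT
>seq2
GGCC
""".splitlines()

records = list(parse_fasta(example))
print(records)
```

Writing it as a generator keeps memory use flat on large files, which is one way to address the I/O considerations mentioned above.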

1
11.0 years ago

If you want to get your hands dirty with biological data, you could try to annotate a VCF file, i.e., work out what each mutation does to a gene. You could also write some scripts to calculate dN/dS or Ti/Tv. These would be tools that you would use over and over again.
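
As a rough starting point for the Ti/Tv idea, the sketch below counts transitions (A<->G, C<->T) and transversions among single-nucleotide variants in VCF-style lines. It assumes tab-separated records with REF in the fourth column and ALT in the fifth, skips indels and symbolic alleles, and uses made-up toy records; the function names are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def ti_tv_ratio(vcf_lines):
    """Return (transitions, transversions, ratio) for the SNVs in vcf_lines."""
    ti = tv = 0
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta lines, and the column header
        fields = line.rstrip("\n").split("\t")
        ref = fields[3].upper()
        for alt in fields[4].upper().split(","):
            # Only simple single-base substitutions are counted.
            if len(ref) != 1 or len(alt) != 1:
                continue
            if ref not in "ACGT" or alt not in "ACGT" or ref == alt:
                continue
            if is_transition(ref, alt):
                ti += 1
            else:
                tv += 1
    ratio = ti / tv if tv else float("nan")
    return ti, tv, ratio


# Toy records, invented for illustration (not real variant calls).
toy_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\t.\t.",    # transition
    "chr1\t200\t.\tC\tT\t.\t.\t.",    # transition
    "chr1\t300\t.\tA\tC\t.\t.\t.",    # transversion
    "chr1\t400\t.\tAT\tA\t.\t.\t.",   # deletion, skipped
    "chr1\t500\t.\tG\tA,T\t.\t.\t.",  # one transition, one transversion
]

print(ti_tv_ratio(toy_vcf))
```

A fuller tool would also need to handle filtering, genotypes, and multi-sample files, which this sketch deliberately leaves out.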

0

I agree with you about dN/dS: the R seqinR package works well, but I had a terrible time finding any standalone software for that purpose that was easy to use, compile, and install...

1
11.0 years ago

You could look at projects like Biopieces (Biopieces.org), the FASTX-Toolkit (http://hannonlab.cshl.edu/fastx_toolkit/), or QIIME to see whether you could optimize some code or add some functionality.

I have another idea: you could write a kind of wrapper for similarity search tools to make them work on computer clusters (by splitting queries and reassembling results). An example is Paracel for BLAST, but yours would be open source, of course!

A lot of people (including me) write some "home-made" code. I'm sure a skilled programmer like you could do something more generic and efficient. It could be for BLAST, HMMER, USEARCH, and so on...
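
The split-and-reassemble idea can be sketched in a few lines. In the example below, `fake_search` is a hypothetical stand-in for a real tool run as a cluster job, queries are assumed to be (id, sequence) pairs with unique ids, and the chunks run one after another rather than in parallel.

```python
def split_records(records, n_chunks):
    """Distribute records round-robin into at most n_chunks non-empty chunks."""
    chunks = [[] for _ in range(n_chunks)]
    for i, record in enumerate(records):
        chunks[i % n_chunks].append(record)
    return [chunk for chunk in chunks if chunk]


def fake_search(chunk):
    """Hypothetical stand-in for a search tool: reports each query's length.

    A real wrapper would write the chunk to a file, submit a job that runs
    the search tool on it, and parse that job's output instead.
    """
    return [(query_id, len(seq)) for query_id, seq in chunk]


def scatter_gather(records, n_chunks, search=fake_search):
    """Split the queries, search each chunk, and restore the input order."""
    order = {query_id: i for i, (query_id, _) in enumerate(records)}
    hits = []
    for chunk in split_records(records, n_chunks):
        hits.extend(search(chunk))
    # Results come back grouped by chunk, so sort them back into query order.
    hits.sort(key=lambda hit: order[hit[0]])
    return hits


# Toy queries, invented for illustration.
queries = [("q1", "ACGT"), ("q2", "AC"), ("q3", "ACGTAC")]
print(split_records(queries, 2))
print(scatter_gather(queries, 2))
```

Passing the search step in as a function keeps the splitting and merging logic generic, so the same wrapper could in principle front different tools.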

1
11.0 years ago
Olbzn ▴ 180

You could talk with your favorite molecular biologist and ask what they would need and use. There are so many coding projects that would be useful and applicable to the many molecular biologists drowning in data.

1
11.0 years ago
Mark Fortner ▴ 10

You might try asking on one of the Bio* projects' mailing lists (BioJava, BioPerl, Biopython, etc.). These projects are always looking for additional help.

1
11.0 years ago

The boundlessly innovative Jotun Hein has posted a set of future, current and past computational biology project ideas that he uses for rotation students. You might want to have a look through these: