Forum: What is the best way to start coding genomics tools?
8.4 years ago
morovatunc ▴ 550

Hello,

I have to do a Python semester project, and since I want to become a well-rounded bioinformatician, I decided to write one of the existing tools from scratch. My aim is not to reinvent the wheel; I just want to rewrite part of a tool. However, when I look at the source code of, say, Platypus, CAVA, etc., I get overwhelmed and think: they have already written this code, so why should I try? (My background is in bioengineering, so coding is not my primary skill.)

I would like to hear your advice. Where should I start? I am thinking of writing indel identification code (working from the output of an alignment). Is this a good place to start?

Thank you for your help

Tunc

python indel genomic-tools • 1.6k views
8.4 years ago
SES 8.6k

In general, I would say don't be afraid to work on a problem just because someone else may have solved it. See this discussion: What Is The Proper Way To Think About Reinventing The Wheel As A Bioinformatician?

There is no downside to trying; you will learn even if you don't fully accomplish what you wanted. Though, for a short-term rotation project, I think you should use as many of the available tools as you can to get the most out of your project. The main reason is that biology is complicated, and using a toolkit will help you get to the right answer faster, not just get an answer. Where to look for help depends on what you want to accomplish, but Biopython is the first and obvious place to look (it has great documentation). I found it quite easy and fun to learn. You may want to add more details, unless you want general programming advice.

By details, I mean: elaborate on what you mean by 'indel identification', for example. From an MSA or a genome alignment? That will help people point you in the right direction.

We have WGS data from TCGA (https://tcga-data.nci.nih.gov/tcga/), provided as aligned files. To understand what the algorithm does, and as my project, I decided to write indel identification code.
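
For anyone wondering what a first step might look like, here is a minimal sketch that is not taken from Platypus, CAVA, or any other tool mentioned in this thread. It assumes the alignments are available as SAM text (for example, a BAM file converted to SAM) and simply walks each read's CIGAR string to list the insertions and deletions that read reports. The names `Indel`, `find_indels`, and `indels_from_sam_line` are made up for illustration, and so is the position convention: an insertion is reported at the reference base just before it, a deletion at the first deleted base.

```python
import re
from typing import List, NamedTuple

# One CIGAR element: a run length followed by an operation code (SAM spec).
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


class Indel(NamedTuple):
    kind: str     # "INS" or "DEL"
    ref_pos: int  # 1-based reference position (convention described below)
    length: int
    seq: str      # inserted read bases for INS, empty string for DEL


def find_indels(pos: int, cigar: str, read_seq: str) -> List[Indel]:
    """List the indels reported by a single read's CIGAR string.

    pos is the 1-based leftmost mapping position (SAM column 4).
    Assumed convention: INS is placed at the reference base preceding
    the inserted bases, DEL at the first deleted reference base.
    """
    indels = []
    ref_pos = pos   # next reference position to be consumed
    read_idx = 0    # next index into read_seq to be consumed
    for length_text, op in CIGAR_ELEMENT.findall(cigar):
        length = int(length_text)
        if op == "I":    # consumes read bases only
            inserted = read_seq[read_idx:read_idx + length]
            indels.append(Indel("INS", ref_pos - 1, length, inserted))
            read_idx += length
        elif op == "D":  # consumes reference bases only
            indels.append(Indel("DEL", ref_pos, length, ""))
            ref_pos += length
        elif op in "M=X":  # aligned bases consume both
            ref_pos += length
            read_idx += length
        elif op == "S":  # soft clip: bases are present in the read
            read_idx += length
        elif op == "N":  # skipped reference region
            ref_pos += length
        # H (hard clip) and P (padding) consume neither sequence
    return indels


def indels_from_sam_line(line: str) -> List[Indel]:
    """Parse one SAM line; header lines and reads without a CIGAR give []."""
    if line.startswith("@"):
        return []
    fields = line.rstrip("\n").split("\t")
    pos, cigar, seq = int(fields[3]), fields[5], fields[9]
    if cigar == "*":
        return []
    return find_indels(pos, cigar, seq)


if __name__ == "__main__":
    # A made-up alignment record, not real TCGA data.
    sam_line = "read1\t0\tchr1\t100\t60\t5M2I3M1D4M\t*\t0\t0\tACGTACCGGTACGT\t*"
    for indel in indels_from_sam_line(sam_line):
        print(indel)
```

This only lists what individual reads claim. An actual indel caller also has to combine the evidence from many overlapping reads, filter on mapping and base quality, and normalise the position of indels in repetitive sequence, so treat this as a learning exercise rather than a replacement for an existing tool.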

8.4 years ago
thot • 0

Hi Tunc!

You can find many interesting, bioinformatics related problems on this site: http://rosalind.info/problems/locations/

I think that is a good starting point for you.

Cheers,

Peter
