Which language to use as a beginner in bioinformatics and computational genomics
3
1
11 months ago
2021ag7698 ▴ 10

Hi everyone. As a beginner in bioinformatics and computational genomics, which language should I choose? I know the basics of Python and its syntax. When I looked for resources to learn RNA-seq analysis and machine learning in bioinformatics or computational genomics, I did not find many YouTube videos that use Python, but there are a lot of resources in R.

Please help me. I am thankful to this community.

Computational-genomics • 1.2k views
ADD COMMENT
0

It depends on specifically what you will be doing. But IMO the order of importance is Bash > Python > R.

ADD REPLY
0

I assume you mean usefulness for common tasks such as file manipulation and pipeline creation. How would one do sequence assembly, RNA-seq and machine learning in Bash?

ADD REPLY
0

awk is Turing complete /s

ADD REPLY
0

How would one do those things without Bash? I certainly wouldn't use R for assembly or alignment. I imagine machine learning could be done equally well in any of those since there are modules for popular machine learning backends... R is good for statistical analysis, but you have to somehow generate the data before you can analyze it. And you can always use gnuplot for common visualizations.

Anyway, I didn't really mean that one should master Bash and do everything in it (personally I try to avoid it as much as possible), but in my opinion the one for which a complete lack of knowledge would be most crippling in bioinformatics would be Bash, so people should at least learn enough to be familiar before learning languages that are more powerful, but totally optional.

ADD REPLY
0

"How would one do those things without Bash?"

I think you may be using system shell and command line interchangeably. I don't use Bash at all - Cshell is my poison of choice - and somehow I do Bioinformatics without a hitch. Also, gnuplot is not a part of Bash or any other Linux shell - it is an independent program.

Assuming you meant that command line proficiency is required for Bioinformatics, we are in agreement. Yet the question was about languages.

ADD REPLY
4
11 months ago
Mensur Dlakic ★ 28k

There is no single language that is best suited for all areas of bioinformatics. If one's main interest is RNA-seq, R will probably be more useful. Python would be a better choice for deep learning. Machine learning (excluding DL) can be done well in either of the two languages.

I would recommend that you stay with the language you already know (Python), and learn the basics of the other language (R) when you encounter specific problems where it works best. I have been doing bioinformatics for 20+ years, mostly with Python and with enough R knowledge that I can read its code and lightly modify it when needed. I also know successful scientists whose main language is R and who know only enough Python to get by. The latter group is a minority in my experience, but that's not necessarily a vote for Python over R. Most people interact with others who have a similar background and experience.

ADD COMMENT
0

Thanks for helping me.

ADD REPLY
4
11 months ago

Don't forget Nextflow or Snakemake for chaining together your own or other people's tools. Those two are really the heart of automated and reproducible bioinformatics these days, and there are thousands of example pipelines available.

They work very well with Python and Bash too. Sed and awk are very, very handy, as is Bard or ChatGPT for helping you out with the syntax for these.

For custom analyses and writing your own tools, Python and R are both well suited and, indeed, required.
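To give a rough idea of what "writing your own tool" can look like for someone who already knows basic Python, here is a minimal sketch that reads FASTA records and reports GC content using only the standard library. The function names `read_fasta` and `gc_content` and the two example records are made up for illustration; in practice you would probably reach for an established parser rather than rolling your own.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    # Emit the last record in the file
    if header is not None:
        yield header, "".join(chunks)


def gc_content(seq):
    """Return the fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical example records, not real data
example = StringIO(">seq1\nATGC\nGGCC\n>seq2\nATAT\n")
results = {name: gc_content(seq) for name, seq in read_fasta(example)}
```

Swapping `StringIO` for `open("your_file.fasta")` would run the same logic on a real file. A script like this can then become one step in a Snakemake or Nextflow pipeline.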

ADD COMMENT
0
11 months ago

I would say every scientist needs to know R - so if you don't know R it would serve you well to learn it.

In my opinion Python is not well suited as the primary language for bioinformatics data analysis. Sooner or later you will need to use R to make use of existing functionality.

ADD COMMENT
