Hi everyone, I have a question as a beginner in bioinformatics and computational genomics: which language should I choose? I know the basics of Python and its syntax. When I looked for resources to learn RNA-seq analysis and machine learning in bioinformatics or computational genomics, I did not find many videos on YouTube that use Python, but there are a lot of resources in R.
There is no single language that is best suited for all areas of bioinformatics. If one's main interest is RNA-seq, R will probably be more useful. Python would be a better choice for deep learning. Machine learning (excluding DL) can be done well in either of the two languages.
I would recommend that you stay with the language you already know (Python), and learn the basics of the other language (R) when you encounter specific problems where it works best. I have been doing bioinformatics for 20+ years, mostly with Python and with enough R knowledge that I can read its code and lightly modify it when needed. I also know successful scientists whose main language is R and who know only enough Python to get by. The latter group is a minority in my experience, but that's not necessarily a vote for Python over R. Most people interact with others who have a similar background and experience.
Don't forget Nextflow or Snakemake for chaining together your own or other people's tools. Those two are really the heart of automated and reproducible bioinformatics these days, and there are thousands of available example pipelines.
They work very well with Python and Bash too. Sed and awk are very, very handy, as is Bard or ChatGPT for helping you out with the syntax for these.
For custom analyses and writing your own tools, Python and R are both well suited and, indeed, required.
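To give a rough idea of what a small custom tool in Python might look like, here is a minimal sketch that computes GC content per record from FASTA-formatted text. The function names `read_fasta` and `gc_content` and the toy sequences are made up for this example, and a real project would probably start from an existing library such as Biopython instead.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gc_content(seq):
    """Return the fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# Toy records invented for illustration only (hypothetical data).
example = StringIO(">seq1\nATGC\nGGCC\n>seq2\nATAT\n")
results = {name: gc_content(seq) for name, seq in read_fasta(example)}

for name, gc in results.items():
    print(f"{name}\t{gc:.2f}")
```

The same functions would work on a real file by passing `open("some_file.fasta")` (a placeholder path) instead of the `StringIO` object.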
I would say every scientist needs to know R - so if you don't know R it would serve you well to learn it.
In my opinion Python is not well suited as the primary language for bioinformatics data analysis. Sooner or later you will need to use R to make use of existing functionality.
It depends on specifically what you will be doing. But IMO the order of importance is Bash > Python > R.
I assume here you mean the usefulness for common tasks such as file manipulation and pipeline creations. How would one do sequence assembly, RNA-seq and machine learning in Bash?
awk is Turing complete /s
How would one do those things without Bash? I certainly wouldn't use R for assembly or alignment. I imagine machine learning could be done equally well in any of those since there are modules for popular machine learning backends... R is good for statistical analysis, but you have to somehow generate the data before you can analyze it. And you can always use gnuplot for common visualizations.
Anyway, I didn't really mean that one should master Bash and do everything in it (personally I try to avoid it as much as possible). But in my opinion, Bash is the language for which a complete lack of knowledge would be most crippling in bioinformatics, so people should at least learn enough to be familiar with it before learning languages that are more powerful, but totally optional.
I think you may be using "system shell" and "command line" interchangeably. I don't use Bash at all - C shell is my poison of choice - and somehow I do bioinformatics without a hitch. Also, gnuplot is not a part of Bash or any other Linux shell - it is an independent program.
Assuming you meant that command line proficiency is required for Bioinformatics, we are in agreement. Yet the question was about languages.