Seeking Beginner-Friendly Resources for Deepening Understanding of Bioinformatics Methods Post Biology Undergrad
8 weeks ago
Serij´s • 0

Hello,

I'm interested in delving deeper into the mathematical and theoretical foundations behind methods commonly employed in bioinformatics. For example, while I understand that the limma package utilizes linear regression for differential expression analysis, I seek a comprehensive understanding of the underlying theory and rationale for employing linear regression in this context.

Many bioinformatics resources tend to focus on practical aspects such as handling data files like FASTA and FASTQ, or discussing next-generation sequencing technologies. However, I've found that few delve into the mathematical and theoretical underpinnings of the analysis methods. While some resources provide code and instructions on how to use these methods, they often lack explanations regarding the underlying principles and when one should apply them.

Can you suggest reliable resources providing thorough yet accessible explanations for beginners seeking to deepen their understanding of bioinformatics methods following completion of a biology undergraduate degree?

literature limma • 720 views

Can you suggest reliable resources providing thorough yet accessible explanations for beginners seeking to deepen their understanding of bioinformatics methods following completion of a biology undergraduate degree?

There is an inherent contradiction in asking for thorough yet accessible explanations for beginners. In most cases you will find a thorough explanation in the original papers describing the methodology, but I am not sure those will be accessible enough.

8 weeks ago
Gordon Smyth ★ 7.0k

Given your interest in linear models and limma, I suggest that you start with:
Law CW, Zeglinski K, Dong X, Alhamdoosh M, Smyth GK, Ritchie ME (2020). A guide to creating design matrices for gene expression experiments. F1000Research 9, 1444. https://f1000research.com/articles/9-1444

There is also considerable discussion of linear models for different experimental designs in the limma User's Guide.

If you have RNA-seq data, the limma-voom workflow talks you through an RNA-seq analysis including discussion of the principles along the way: https://bioconductor.org/packages/release/workflows/vignettes/RNAseq123/inst/doc/limmaWorkflow.html

The limma review paper includes a high-level discussion of the principles behind limma:
Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research 43, e47. https://doi.org/10.1093/nar/gkv007

I haven't given you links to theoretical mathematics papers because trying to learn all the mathematics behind all commonly used bioinformatics methods is quite unrealistic, especially for a sophisticated statistical package like limma. It would be unrealistic even if you had a PhD in mathematics. It is also unnecessary, because you don't need that level of detail to understand how and when to apply the packages. On the other hand, starting with the above references on linear models will give you a deeper understanding of how linear models apply to differential abundance analyses and will hopefully serve your purposes.
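
To make the linear-model idea concrete, here is a minimal sketch in plain Python. It is not limma code and is not taken from the references above; the expression values, the variable names and the helper `fit_two_column_ols` are all made up for illustration. It builds a design matrix with an intercept column and a treatment indicator for a single gene, then solves the ordinary least-squares normal equations. With this design the intercept estimates the mean of the control group and the second coefficient estimates the treated-minus-control difference, which is the log fold change when the values are log2 expression.

```python
# Hypothetical toy data: log2 expression of ONE gene in 3 control
# and 3 treated samples (values invented for illustration only).
log_expr = [5.1, 4.9, 5.0, 6.2, 6.0, 6.1]
treated = [0, 0, 0, 1, 1, 1]

# Design matrix: column 1 is the intercept, column 2 is the
# treatment indicator (0 = control, 1 = treated).
design = [[1.0, float(t)] for t in treated]


def fit_two_column_ols(design, y):
    """Solve the normal equations (X'X) b = X'y for a two-column design."""
    # Entries of the symmetric 2x2 matrix X'X.
    a = sum(row[0] * row[0] for row in design)
    b = sum(row[0] * row[1] for row in design)
    d = sum(row[1] * row[1] for row in design)
    # Entries of the vector X'y.
    p = sum(row[0] * yi for row, yi in zip(design, y))
    q = sum(row[1] * yi for row, yi in zip(design, y))
    det = a * d - b * b
    if det == 0:
        raise ValueError("design matrix is not of full column rank")
    # Apply the closed-form inverse of a 2x2 matrix.
    first_coef = (d * p - b * q) / det
    second_coef = (a * q - b * p) / det
    return first_coef, second_coef


intercept, log_fold_change = fit_two_column_ols(design, log_expr)
print("control mean (intercept):", round(intercept, 3))
print("treated - control (log fold change):", round(log_fold_change, 3))
```

A real analysis fits the same kind of model to every gene at once and, in limma's case, additionally moderates the gene-wise variance estimates with an empirical Bayes step, which this sketch leaves out entirely.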
