Herald:The Biostar Herald for Wednesday, July 27, 2022

The Biostar Herald publishes user-submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.

This edition of the Herald was brought to you by contributions from Istvan Albert and Rob, and was edited by Istvan Albert.

Benchmarking database systems for Genomic Selection implementation (www.ncbi.nlm.nih.gov)

We selected and benchmarked six popular open-source data storage systems, including relational database management and columnar storage systems.

We found that data extract times are greatly influenced by the orientation in which genotype data is stored in a system. HDF5 consistently performed best, in part because it can more efficiently work with both orientations of the allele matrix.

submitted by: Istvan Albert
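
To make the point about orientation concrete, here is a minimal sketch in plain Python. It is not the paper's benchmark and does not use HDF5; the toy matrix, the layout names, and the helper functions are all made up for illustration. It only shows that the same allele matrix can be laid out sample-major or marker-major, and that a request matching the layout is one contiguous read while the other is a strided one.

```python
# Toy allele matrix (hypothetical data): 3 samples (rows) x 4 markers (columns).
genotypes = [
    [0, 1, 2, 0],  # sample 0
    [1, 1, 0, 2],  # sample 1
    [2, 0, 1, 1],  # sample 2
]
n_samples, n_markers = 3, 4

# Sample-major layout: all markers of sample 0, then sample 1, and so on.
sample_major = [g for row in genotypes for g in row]

# Marker-major layout: all samples of marker 0, then marker 1, and so on.
marker_major = [genotypes[s][m] for m in range(n_markers) for s in range(n_samples)]


def sample_from_sample_major(s):
    """One sample from the sample-major layout: a single contiguous slice."""
    start = s * n_markers
    return sample_major[start:start + n_markers]


def marker_from_sample_major(m):
    """One marker from the sample-major layout: a strided read across the data."""
    return sample_major[m::n_markers]


def marker_from_marker_major(m):
    """One marker from the marker-major layout: contiguous again."""
    start = m * n_samples
    return marker_major[start:start + n_samples]


print(sample_from_sample_major(1))
print(marker_from_sample_major(2))
print(marker_from_marker_major(2))
```

In memory the strided read is cheap, but on disk it generally touches many separate blocks, which is one plausible reason extraction times depend so much on how the matrix is stored.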

Introduction - Polars - User Guide (pola-rs.github.io)

Polars is a blazingly fast DataFrames library implemented in Rust using Apache Arrow Columnar Format as the memory model.

submitted by: Istvan Albert

Comparison of Transformations for Single-Cell RNA-Seq Data | bioRxiv (www.biorxiv.org)

This work provides a comparison of many methods for single-cell count normalization, including approaches based on the delta method, approaches based on model residuals, approaches based on latent expression, and approaches using factor analysis. Perhaps surprisingly, the authors find:

"in benchmarks using simulated and real-world data, it turns out that a rather simple approach, namely, the logarithm with a pseudo-count followed by principal component analysis, performs as well or better than the more sophisticated alternatives."

submitted by: Rob
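
For readers who want to see what "logarithm with a pseudo-count followed by PCA" looks like in practice, here is a minimal sketch using NumPy. It is not the authors' code: the function name `log_pca`, the pseudo-count of 1, and the toy count matrix are assumptions for illustration, and the sketch leaves out the per-cell size-factor scaling that a real workflow would normally apply before the logarithm.

```python
import numpy as np


def log_pca(counts, n_components=2, pseudo_count=1.0):
    """Log-transform counts with a pseudo-count, then run PCA via SVD.

    counts: cells x genes matrix of non-negative counts.
    Returns the cells x n_components matrix of principal component scores.
    """
    counts = np.asarray(counts, dtype=float)
    logged = np.log(counts + pseudo_count)       # log(y + pseudo_count)
    centered = logged - logged.mean(axis=0)      # center each gene
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Hypothetical toy data: 4 cells x 4 genes, two obvious groups of cells.
counts = np.array([
    [10, 0, 3, 0],
    [12, 1, 2, 0],
    [0, 8, 0, 5],
    [1, 9, 0, 7],
])

embedding = log_pca(counts, n_components=2)
print(embedding.shape)
```

On this toy matrix the first component separates the first two cells from the last two, which is the kind of structure downstream clustering would pick up.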

orsum: a Python package for filtering and comparing enrichment analyses using a simple principle | BMC Bioinformatics | Full Text (bmcbioinformatics.biomedcentral.com)

We propose orsum, a Python package to filter enrichment results. orsum can filter multiple enrichment results collectively and highlight common and specific annotation terms. Filtering in orsum is based on a simple principle: a term is discarded if there is a more significant term that annotates at least the same genes; the remaining more significant term becomes the representative term for the discarded term.

submitted by: Istvan Albert
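
The filtering principle quoted above is simple enough to sketch in a few lines of Python. This is not orsum's API or implementation, and the real package may apply further rules; the function name `filter_terms`, the input format, and the term and gene names are hypothetical. The sketch assumes terms arrive already sorted from most to least significant.

```python
def filter_terms(ranked_terms):
    """Apply the 'more significant superset wins' principle.

    ranked_terms: list of (term_id, genes) pairs, most significant first.
    Returns a dict mapping each representative term to the list of
    discarded terms it represents.
    """
    kept = []             # (term_id, gene set) of terms kept so far
    representatives = {}  # representative term -> discarded terms
    for term_id, genes in ranked_terms:
        genes = frozenset(genes)
        for kept_id, kept_genes in kept:
            # A more significant term annotates at least the same genes.
            if genes <= kept_genes:
                representatives[kept_id].append(term_id)
                break
        else:
            kept.append((term_id, genes))
            representatives[term_id] = []
    return representatives


# Hypothetical enrichment result, most significant term first.
ranked = [
    ("term_A", {"g1", "g2", "g3"}),
    ("term_B", {"g1", "g2"}),   # subset of term_A, so it is discarded
    ("term_C", {"g4", "g5"}),
    ("term_D", {"g3", "g4"}),   # not contained in any kept term
]

result = filter_terms(ranked)
print(result)
```

Checking only the kept terms is enough: if a term is contained in a discarded term, it is also contained in that term's representative.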

No evidence that synonymous mutations in yeast genes are mostly deleterious | bioRxiv (www.biorxiv.org)

Same data, two very different conclusions:

"A re-examination of the data in Shen et al. strongly suggests that it is entirely consistent with the expectation that most nonsynonymous and nearly all synonymous mutations have no detectable effects on fitness."

submitted by: Istvan Albert

lemmi_front (lemmi.ezlab.org)

LEMMI: A Live Evaluation of Computational Methods for Metagenome Investigation, is an online resource and a pipeline dedicated to continuous benchmarking of newly published metagenomics taxonomic classifiers.

submitted by: Istvan Albert

