Herald: The Biostar Herald for Monday, September 12, 2022
Biostar

The Biostar Herald publishes user submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.

This edition of the Herald was brought to you by contributions from Istvan Albert and was edited by Istvan Albert.

GitHub - lh3/miniprot: Aligning proteins to genomes with splicing and frameshift (github.com)

Miniprot aligns a protein sequence against a genome with affine gap penalty, splicing and frameshift. It is primarily intended for annotating protein-coding genes in a new species using known genes from other species. Miniprot is similar to GeneWise and Exonerate in functionality but it can map proteins to whole genomes and is much faster at the residue alignment step.

submitted by: Istvan Albert

Without appropriate metadata, data-sharing mandates are pointless (www.nature.com)

Funders and investigators must demand appropriate metadata standards to take data from foul to FAIR.

submitted by: Istvan Albert

GitHub - oschwengers/bakta: Rapid & standardized annotation of bacterial genomes, MAGs & plasmids (github.com)

Bakta is a tool for the rapid & standardized annotation of bacterial genomes and plasmids from both isolates and MAGs. It provides dbxref-rich and sORF-including annotations in machine-readable JSON & bioinformatics standard file formats for automatic downstream analysis.

submitted by: Istvan Albert

Taxonomic classification of DNA sequences beyond sequence similarity using deep neural networks (www.pnas.org)

The correct assignment of DNA sequences to their origin is an important task. However, only a fraction of all species are available in today’s databases and thus easily assignable. Therefore, we present a method that is particularly good at classifying sequences for which there are no closely related species in databases. For this purpose, we use a deep learning approach to learn, at first, the “language” of DNA to subsequently distinguish the “language” structure of different groups of organisms, for example, bacteria and viruses. Using this approach, we achieve comparable quality to previous methods for sequences with close relatives in the database and superior quality for new species.

submitted by: Istvan Albert

Recommendations for whole genome sequencing in diagnostics for rare diseases | European Journal of Human Genetics (www.nature.com)

The aim of these recommendations is primarily to list the points to consider for clinical (laboratory) geneticists, bioinformaticians, and (non-)geneticists, to provide technical advice, aid clinical decision-making and the reporting of the results.

submitted by: Istvan Albert

Genome-wide prediction of disease variants with a deep protein language model | bioRxiv (www.biorxiv.org)

We developed a modified ESM1b workflow and functionalized, for the first time, all proteins in the human genome, resulting in predictions for all ∼450M possible missense variant effects. ESM1b was able to distinguish between pathogenic and benign variants across ∼150K variants annotated in ClinVar and HGMD, outperforming existing state-of-the-art methods.

submitted by: Istvan Albert

Want to get the Biostar Herald in your email? Who wouldn't? Sign up righ'ere: toggle subscription
