
Herald: The Biostar Herald for Monday, September 12, 2022


2.1 years ago

Biostar 3.0k

The Biostar Herald publishes user-submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.

This edition of the Herald was brought to you by contributions from Istvan Albert, and was edited by Istvan Albert.

GitHub - lh3/miniprot: Aligning proteins to genomes with splicing and frameshift (github.com)

Miniprot aligns a protein sequence against a genome with affine gap penalty, splicing and frameshift. It is primarily intended for annotating protein-coding genes in a new species using known genes from other species. Miniprot is similar to GeneWise and Exonerate in functionality but it can map proteins to whole genomes and is much faster at the residue alignment step.

submitted by: Istvan Albert
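
For readers unfamiliar with the "affine gap penalty" mentioned in the miniprot description, here is a minimal sketch of the general idea. It is not miniprot's code. The function names and the default values for `gap_open` and `gap_extend` are hypothetical and chosen only for illustration, and the convention used (open cost plus a per-residue extension cost) is one of several in common use.

```python
def affine_gap_cost(length: int, gap_open: int = 11, gap_extend: int = 1) -> int:
    """Penalty for a single gap of `length` residues under an affine model.

    Assumed convention: opening a gap costs `gap_open` once, and every
    residue in the gap (including the first) costs `gap_extend`.
    The default values are illustrative, not miniprot's parameters.
    """
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0
    return gap_open + gap_extend * length


def total_gap_cost(gap_lengths: list[int]) -> int:
    """Sum the affine penalties of several independent gaps."""
    return sum(affine_gap_cost(n) for n in gap_lengths)


# One long gap versus the same number of gapped residues split into many gaps.
one_long_gap = total_gap_cost([10])
ten_short_gaps = total_gap_cost([1] * 10)
print(one_long_gap, ten_short_gaps)
```

Under this model a single long gap is penalized less than many short gaps covering the same number of residues, which is why affine penalties are generally preferred over a flat per-residue cost when insertions and deletions tend to occur in runs.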

Without appropriate metadata, data-sharing mandates are pointless (www.nature.com)

Funders and investigators must demand appropriate metadata standards to take data from foul to FAIR.

submitted by: Istvan Albert

Bakta is a worthy successor to Prokka, which has been neglected in recent years. Bakta is being actively developed with new features that Prokka never had, so I encourage Prokka users to try it out! https://t.co/Lt6MnDDAYw
— Torsten Seemann (@torstenseemann) September 1, 2022

submitted by: Istvan Albert

GitHub - oschwengers/bakta: Rapid & standardized annotation of bacterial genomes, MAGs & plasmids (github.com)

Bakta is a tool for the rapid & standardized annotation of bacterial genomes and plasmids from both isolates and MAGs. It provides dbxref-rich and sORF-including annotations in machine-readable JSON & bioinformatics standard file formats for automatic downstream analysis.

submitted by: Istvan Albert

the aversion to make bioinformatics tools in biology accessible to those with perhaps the most context (bench scientists) is shocking

you really shouldn't be able to publish software without a graphical interface, reproducible build instructions and a versioned release https://t.co/SCr9A4OXoL
— Kenny Workman (@kenbwork) September 5, 2022

submitted by: Istvan Albert

Taxonomic classification of DNA sequences beyond sequence similarity using deep neural networks (www.pnas.org)

The correct assignment of DNA sequences to their origin is an important task. However, only a fraction of all species are available in today’s databases and thus easily assignable. Therefore, we present a method that is particularly good at classifying sequences for which there are no closely related species in databases. For this purpose, we use a deep learning approach to learn, at first, the “language” of DNA to subsequently distinguish the “language” structure of different groups of organisms, for example, bacteria and viruses. Using this approach, we achieve comparable quality to previous methods for sequences with close relatives in the database and superior quality for new species.

submitted by: Istvan Albert

Recommendations for whole genome sequencing in diagnostics for rare diseases | European Journal of Human Genetics (www.nature.com)

The aim of these recommendations is primarily to list the points to consider for clinical (laboratory) geneticists, bioinformaticians, and (non-)geneticists, to provide technical advice, aid clinical decision-making and the reporting of the results.

submitted by: Istvan Albert

Genome-wide prediction of disease variants with a deep protein language model | bioRxiv (www.biorxiv.org)

We developed a modified ESM1b workflow and functionalized, for the first time, all proteins in the human genome, resulting in predictions for all ∼450M possible missense variant effects. ESM1b was able to distinguish between pathogenic and benign variants across ∼150K variants annotated in ClinVar and HGMD, outperforming existing state-of-the-art methods.

submitted by: Istvan Albert

Want to get the Biostar Herald in your email? Who wouldn't? Sign up righ'ere: toggle subscription
