Herald: The Biostar Herald for Monday, February 26, 2024
8 weeks ago
Biostar 2.7k

The Biostar Herald publishes user-submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.

This edition of the Herald was brought to you by contributions from Mensur Dlakic and Istvan Albert, and was edited by Istvan Albert.


Commonly used software tools produce conflicting and overly-optimistic AUPRC values | bioRxiv (www.biorxiv.org)

Do we trust our tools unconditionally?

submitted by: Mensur Dlakic
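
As a small illustration of how two reasonable-looking AUPRC calculations can disagree, here is a minimal Python sketch. It is not taken from the preprint and does not reproduce its analysis; the function names, the toy data, and the choice to start the interpolated curve at (recall 0, precision 1) are all assumptions made for the example. It compares a step-wise sum (the "average precision" style) with trapezoidal interpolation on the same precision-recall points when scores contain ties.

```python
def pr_points(labels, scores):
    """Return (recall, precision) after each distinct score threshold, highest first."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_pos = sum(labels)
    points = []
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every item tied at this score so ties form a single point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((tp / total_pos, tp / (tp + fp)))
    return points


def auprc_step(points):
    """Step-wise area: each recall increase is weighted by the precision reached."""
    area, prev_recall = 0.0, 0.0
    for recall, precision in points:
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


def auprc_trapezoid(points):
    """Trapezoidal area with linear interpolation, assuming the curve starts at (0, 1)."""
    area, prev_recall, prev_precision = 0.0, 0.0, 1.0
    for recall, precision in points:
        area += (recall - prev_recall) * (precision + prev_precision) / 2
        prev_recall, prev_precision = recall, precision
    return area


# Toy data (hypothetical): coarse scores with many ties, 3 positives out of 8.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.9, 0.5, 0.5, 0.5, 0.1, 0.1, 0.1]

points = pr_points(labels, scores)
step_area = auprc_step(points)
trap_area = auprc_trapezoid(points)
print(f"step-wise: {step_area:.3f}  trapezoid: {trap_area:.3f}")
```

On this toy input the trapezoidal value comes out higher than the step-wise one, because linear interpolation fills in precision that no actual threshold achieves. Which convention a given tool uses is something to check in its documentation rather than assume.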


Significant Updates Coming to the NCBI Datasets APIs and Command-Line Tools - NCBI Insights (ncbiinsights.ncbi.nlm.nih.gov)

As part of our ongoing effort to enhance your experience, we are updating the NCBI Datasets application programming interfaces (APIs). Beginning in June 2024, the v2alpha APIs will be promoted to the stable v2 version. At this time, the v1 API, the command-line interface (CLI) version 13 and older versions, and the Python library v1 will be deprecated and thus no longer supported for bug fixes or updates. Effective December 31, 2024, these will no longer be available for use.

submitted by: Istvan Albert


https://academic.oup.com/bioinformatics/article/35/3/421/5055585

General-purpose processors can now contain many dozens of processor cores and support hundreds of simultaneous threads of execution. To make best use of these threads, genomics software must contend with new and subtle computer architecture issues. We discuss some of these and propose methods for improving thread scaling in tools that analyze each read independently, such as read aligners.

We implement these methods in new versions of Bowtie, Bowtie 2 and HISAT. We greatly improve thread scaling in many scenarios, including on the recent Intel Xeon Phi architecture. We also highlight how bottlenecks are exacerbated by variable-record-length file formats like FASTQ and suggest changes that enable superior scaling.

submitted by: Istvan Albert
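
To make the contention issue in the abstract above more concrete, here is a minimal Python sketch of one general way to reduce pressure on a shared input: each worker takes the input lock once per batch of FASTQ records instead of once per read. This is an illustration only, not the authors' implementation; `read_batch`, `worker`, `read_lengths`, and the batch size are hypothetical names and values, and computing read lengths stands in for alignment. Because of Python's global interpreter lock, the sketch shows the locking pattern rather than a real speedup.

```python
import io
import threading
from itertools import islice


def read_batch(handle, batch_size):
    """Read up to batch_size FASTQ records (4 lines each); drop a truncated tail."""
    lines = [line.rstrip("\n") for line in islice(handle, 4 * batch_size)]
    return [tuple(lines[i:i + 4]) for i in range(0, len(lines) - 3, 4)]


def worker(handle, input_lock, batch_size, results, results_lock):
    while True:
        # Critical section is entered once per batch, not once per read.
        with input_lock:
            batch = read_batch(handle, batch_size)
        if not batch:
            return
        # Stand-in for alignment: every read is processed independently.
        lengths = [len(seq) for _name, seq, _plus, _qual in batch]
        with results_lock:
            results.extend(lengths)


def read_lengths(fastq_text, n_threads=4, batch_size=2):
    """Process FASTQ text with several threads sharing one input handle."""
    handle = io.StringIO(fastq_text)
    input_lock, results_lock = threading.Lock(), threading.Lock()
    results = []
    threads = [
        threading.Thread(
            target=worker,
            args=(handle, input_lock, batch_size, results, results_lock),
        )
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


fastq = "@r1\nACGT\n+\nIIII\n@r2\nAC\n+\nII\n@r3\nACGTAC\n+\nIIIIII\n"
print(sorted(read_lengths(fastq, n_threads=3, batch_size=2)))
```

Note that the reader still has to scan line by line to find record boundaries, since FASTQ records vary in length; that parsing work happens while the lock is held, which is the kind of cost the abstract attributes to variable-record-length formats.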


A high-performance computational workflow to accelerate GATK SNP detection across a 25-genome dataset | BMC Biology | Full Text (bmcbiol.biomedcentral.com)

Here we report an open-source high-performance computing genome variant calling workflow (HPC-GVCW) for GATK that can run on multiple computing platforms from supercomputers to desktop machines. We benchmarked HPC-GVCW on multiple crop species for performance and accuracy with comparable results with previously published reports (using GATK alone). Finally, we used HPC-GVCW in production mode to call SNPs on a “subpopulation aware” 16-genome rice reference panel with ~ 3000 resequenced rice accessions. The entire process took ~ 16 weeks and resulted in the identification of an average of 27.3 M SNPs/genome and the discovery of ~ 2.3 million novel SNPs that were not present in the flagship reference genome for rice (i.e., IRGSP RefSeq).

submitted by: Istvan Albert


Want to get the Biostar Herald in your email? Who wouldn't? Sign up righ'ere: toggle subscription
