The Biostar Herald for Monday, August 22, 2022
The Biostar Herald publishes user-submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.

This edition of the Herald was brought to you by contributions from Istvan Albert and was edited by Istvan Albert.

Nine out of ten samples were mistakenly switched by The Orang-utan Genome Consortium | Scientific Data (www.nature.com)

Here, we report that the original sequencing Consortium inadvertently switched nine of the ten samples and/or resulting re-sequenced genomes, erroneously attributing eight of these to the wrong source individuals. Among them is a genome from the recently identified Tapanuli (P. tapanuliensis) species: thus, this genome was sequenced and published a full six years prior to the species’ description. Sex was wrongly assigned to five known individuals; the numbers in one sample identifier were swapped; and the identifier for another sample most closely resembles that of a sample from another individual entirely. These errors have been reproduced in countless subsequent manuscripts, with noted implications for studies reliant on data from known individuals.

submitted by: Istvan Albert


We provide FastRemap, a fast and efficient tool for remapping reads between genome assemblies. FastRemap provides up to a 7.19× speedup (5.97×, on average) and uses as low as 61.7% (80.7%, on average) of the peak memory consumption compared to the state-of-the-art remapping tool, CrossMap.

submitted by: Istvan Albert

Genome-wide somatic variant calling using localized colored de Bruijn graphs | Communications Biology (www.nature.com)

Here we present Lancet, an accurate and sensitive somatic variant caller, which detects SNVs and indels by jointly analyzing reads from tumor and matched normal samples using colored de Bruijn graphs. We demonstrate, through extensive experimental comparison on synthetic and real whole-genome sequencing datasets, that Lancet has better accuracy, especially for indel detection, than widely used somatic callers, such as MuTect, MuTect2, LoFreq, Strelka, and Strelka2.

submitted by: Istvan Albert
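
For readers unfamiliar with the term in the Lancet entry above, here is a toy sketch of a colored de Bruijn graph. Each k-mer edge is labeled with the samples ("colors") whose reads contain it, so paths seen only in the tumor stand out as candidate variants. This is not Lancet's implementation; the function names, the reads, and the choice of k are made up for illustration, and a real caller also handles sequencing errors, coverage thresholds, and local reassembly.

```python
from collections import defaultdict

def colored_kmer_edges(reads_by_sample, k):
    """Map each (k-1)-mer -> (k-1)-mer edge to the set of samples (colors) containing it."""
    edges = defaultdict(set)
    for color, reads in reads_by_sample.items():
        for read in reads:
            for i in range(len(read) - k + 1):
                kmer = read[i:i + k]
                # An edge joins the k-mer's prefix node to its suffix node.
                edges[(kmer[:-1], kmer[1:])].add(color)
    return edges

def tumor_only_edges(edges):
    """Return edges supported only by the sample labeled 'tumor' (hypothetical label)."""
    return sorted(e for e, colors in edges.items() if colors == {"tumor"})

# Hypothetical reads: the second tumor read differs from the normal read by one base.
reads = {
    "normal": ["ACGTACGGA"],
    "tumor": ["ACGTACGGA", "ACGTTCGGA"],
}
edges = colored_kmer_edges(reads, k=4)
candidates = tumor_only_edges(edges)
for prefix, suffix in candidates:
    print(prefix, "->", suffix)
```

The tumor-only edges form the branch of the graph that spells out the altered sequence, while shared edges carry both colors.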

GitHub - bedops/bedops: BEDOPS: high-performance genomic feature operations (github.com)

BEDOPS v2.4.41 is a suite of tools to address common questions raised in genomic studies — mostly with regard to overlap and proximity relationships between data sets. It aims to be scalable and flexible, facilitating the efficient and accurate analysis and management of large-scale genomic data.

submitted by: Istvan Albert

Pisces: an accurate and versatile variant caller for somatic and germline next-generation sequencing data - PubMed (pubmed.ncbi.nlm.nih.gov)

We have developed Pisces, a rapid, versatile and accurate small-variant calling suite designed for somatic and germline amplicon sequencing applications. Accuracy is achieved by four distinct modules, each incorporating a number of novel algorithmic strategies.

The Pisces variant caller has been developed by Illumina and is used extensively in commercial applications.

submitted by: Istvan Albert

