The Biostar Herald publishes user-submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.
This edition of the Herald was brought to you by contributions from Istvan Albert, and was edited by Istvan Albert.
Gene Updater: a web tool that autocorrects and updates for Excel misidentified gene names | Scientific Reports (www.nature.com)
Bioinformatics publishing at its finest (sarcasm!).
Herein, we developed a web tool with Streamlit that can convert old gene names and dates back into the new gene names recommended by HGNC. The web app is named Gene Updater, which is open source and can be either hosted locally or at https://share.streamlit.io/kuanrongchan/date-to-gene-converter/main/date_gene_tool.py.
submitted by: Istvan Albert
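To make the Gene Updater entry above concrete, here is a minimal sketch of the kind of conversion it describes: turning a date string that Excel produced from a gene symbol back into a current HGNC-style symbol. This is not the tool's actual code. The function name `restore_gene_name` and the lookup table are hypothetical, and the table covers only the SEPT and MARCH families (renamed SEPTIN and MARCHF by HGNC). It also assumes that "Mar" always means a MARCH gene, which is not safe in general, since MARC1 and MARC2 are mangled the same way.

```python
import re

# Hypothetical, incomplete lookup: month abbreviation as shown by Excel
# -> prefix of the current HGNC symbol. Assumes "Mar" means MARCH (now MARCHF),
# which ignores MARC1/MARC2, whose names Excel mangles identically.
EXCEL_MONTH_TO_HGNC_PREFIX = {
    "Sep": "SEPTIN",
    "Mar": "MARCHF",
}

# Matches strings such as "2-Sep" or "10-Mar".
DATE_PATTERN = re.compile(r"^(\d{1,2})-([A-Za-z]{3})$")


def restore_gene_name(value):
    """Return a gene symbol for an Excel-mangled date, else the input unchanged."""
    match = DATE_PATTERN.match(value.strip())
    if match is None:
        return value
    number, month = match.groups()
    prefix = EXCEL_MONTH_TO_HGNC_PREFIX.get(month.capitalize())
    if prefix is None:
        # A date-like string with no known gene family: leave it alone.
        return value
    # No check is made that the resulting symbol actually exists.
    return f"{prefix}{int(number)}"


if __name__ == "__main__":
    for cell in ["2-Sep", "1-Mar", "TP53", "5-Jan"]:
        print(cell, "->", restore_gene_name(cell))
```

A real converter would need the full HGNC table of previous and current symbols and a way to flag ambiguous cases for manual review rather than guessing.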
One in seven pathogenic variants can be challenging to detect by NGS: an analysis of 450,000 patients with implications for clinical sensitivity and genetic test implementation https://t.co/aFrXqEI1Uh
— George Carvalho (@geovcnt) September 23, 2022
submitted by: Istvan Albert
GitHub - cooplab/popgen-notes: Population genetics notes (github.com)
This book was developed from my set of notes for the Population Biology graduate group core class (PBGG) and Undergraduate Population and Quantitative Genetics class (EVE102) at UC Davis.
submitted by: Istvan Albert
GitHub - CSB5/lofreq: LoFreq Star: Sensitive variant calling from sequencing data (github.com)
LoFreq* (i.e. LoFreq version 2) is a fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data. It makes full use of base-call qualities and other sources of errors inherent in sequencing (e.g. mapping or base/indel alignment uncertainty), which are usually ignored by other methods or only used for filtering.
submitted by: Istvan Albert
SeqCode: a nomenclatural code for prokaryotes described from sequence data | Nature Microbiology (www.nature.com)
Here we summarize the development of the SeqCode, a code of nomenclature under which genome sequences serve as nomenclatural types. This code enables valid publication of names of prokaryotes based upon isolate genome, metagenome-assembled genome or single-amplified genome sequences.
submitted by: Istvan Albert
Comparison of calling pipelines for whole genome sequencing: an empirical study demonstrating the importance of mapping and alignment | bioRxiv (www.biorxiv.org)
As part of the quality control, we sequenced one genome in a bottle (GIAB) sample 70 times in different runs, and one GIAB trio in triplicate. In this study, we compared the performance of 6 pipelines, involving two mapping and alignment approaches (GATK utilizing BWA-MEM2 2.2.1, and DRAGEN 3.8.4) and three variant calling pipelines (GATK 4.2.4.1, DRAGEN 3.8.4 and DeepVariant 1.1.0).
DRAGEN and DeepVariant performed similarly and both superior to GATK, with slight advantages for DRAGEN for Indels and for DeepVariant for SNVs. The DRAGEN pipeline showed the lowest Mendelian inheritance error fraction for the GIAB trios. Mapping and alignment played a key role in variant calling of WGS, with the DRAGEN substantially outperforming GATK.
submitted by: Istvan Albert
GitHub - brentp/echtvar: echt rapid variant annotation and filtering (github.com)
Why echtvar? One of the first steps after variant-calling in many pipelines is filtering on allele-frequency. This requires annotating with large datasets (for example, gnomad genomes is over 1TB of data). Echtvar uses integer compression, variant encoding and genomic chunking to make this stupid fast.
submitted by: Istvan Albert
Want to get the Biostar Herald in your email? Who wouldn't? Sign up righ'ere: toggle subscription