The Biostar Herald publishes user-submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.
This edition of the Herald was brought to you by contributions from Mensur Dlakic, Istvan Albert, and lethalfang, and was edited by Istvan Albert.
Achieving robust somatic mutation detection with deep learning models derived from reference data sets of a cancer sample | Genome Biology | Full Text (doi.org)
Explores strategies for building machine learning classifiers from different combinations of training data sets, and how those choices affect results on test data sets. Two machine learning algorithms were tested: NeuSomatic and Octopus.
submitted by: lethalfang
ABRIDGE: An ultra-compression software for SAM alignment files | bioRxiv (www.biorxiv.org)
An ultra-compression software for SAM alignment files
submitted by: Mensur Dlakic
Best practices for variant calling in clinical sequencing | Genome Medicine | Full Text (genomemedicine.biomedcentral.com)
In this review, I discuss the current best practices for variant calling in clinical sequencing studies, with a particular emphasis on trio sequencing for inherited disorders and somatic mutation detection in cancer patients. I describe the relative strengths and weaknesses of panel, exome, and whole-genome sequencing for variant detection.
submitted by: Istvan Albert
Generating lineage-resolved, complete metagenome-assembled genomes from complex microbial communities | Nature Biotechnology (www.nature.com)
Microbial communities might include distinct lineages of closely related organisms that complicate metagenomic assembly and prevent the generation of complete metagenome-assembled genomes (MAGs). Here we show that deep sequencing using long (HiFi) reads combined with Hi-C binning can address this challenge even for complex microbial communities.
submitted by: Istvan Albert
GitHub - sdparekh/zUMIs: zUMIs: A fast and flexible pipeline to process RNA sequencing data with UMIs (github.com)
zUMIs is a fast and flexible pipeline to process RNA-seq data with (or without) UMIs.
The input to this pipeline is simply fastq files. In the most common cases, you will have a read containing the cDNA sequence and other read(s) containing UMI and Cell Barcode information. Furthermore, you will need a STAR index for your genome and GTF annotation file.
submitted by: Istvan Albert
What’s the holy grail in computational biology? Only wrong answers.
— Fabian Theis (@fabian_theis) October 27, 2021
submitted by: Istvan Albert
Graphical comparison between the standard (linear) #correlation and the Chatterjee's "rank correlation" recently introduced in https://t.co/xJIg336UGL #statistics #probability @johnleibniz @_bakshay https://t.co/P253mweEih pic.twitter.com/BWCeIZIECR
— adad8m (@adad8m) December 25, 2021
submitted by: Istvan Albert
[1909.10140] A new coefficient of correlation (arxiv.org)
Is it possible to define a coefficient of correlation which is (a) as simple as the classical coefficients like Pearson's correlation or Spearman's correlation, and yet (b) consistently estimates some simple and interpretable measure of the degree of dependence between the variables, which is 0 if and only if the variables are independent and 1 if and only if one is a measurable function of the other, and (c) has a simple asymptotic theory under the hypothesis of independence, like the classical coefficients? This article answers this question in the affirmative, by producing such a coefficient. No assumptions are needed on the distributions of the variables. There are several coefficients in the literature that converge to 0 if and only if the variables are independent, but none that satisfy any of the other properties mentioned above.
submitted by: Istvan Albert
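For readers who want to try the coefficient from the arXiv paper above, here is a minimal sketch in plain Python. It assumes there are no ties in either variable, in which case the paper's statistic reduces to xi = 1 - 3 * sum(|r[i+1] - r[i]|) / (n^2 - 1), where r[i] is the rank of the y value paired with the i-th smallest x. The function name `chatterjee_xi` and the demo data are illustrative choices, not something taken from the paper or from any library.

```python
import random


def chatterjee_xi(x, y):
    """Chatterjee's xi for paired samples, assuming no ties in x or y.

    Note that xi is not symmetric: this measures how well y is
    described as a function of x, not the other way around.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    # Reorder the y values so that their paired x values are increasing.
    order = sorted(range(n), key=lambda i: x[i])
    y_by_x = [y[i] for i in order]

    # Rank each y value from 1 to n (valid only because ties are assumed absent).
    rank_of = {value: rank for rank, value in enumerate(sorted(y_by_x), start=1)}
    ranks = [rank_of[value] for value in y_by_x]

    # Sum of absolute rank jumps between neighbours in x order.
    jumps = sum(abs(ranks[i + 1] - ranks[i]) for i in range(n - 1))
    return 1 - 3 * jumps / (n * n - 1)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    xs = [rng.uniform(-1, 1) for _ in range(2000)]

    # y is a deterministic but non-monotonic function of x.
    parabola = [v * v for v in xs]
    # y is drawn independently of x.
    noise = [rng.uniform(-1, 1) for _ in xs]

    print("xi(x, x^2)   =", chatterjee_xi(xs, parabola))
    print("xi(x, noise) =", chatterjee_xi(xs, noise))
```

On the parabola the value should come out close to 1 even though the relationship is not monotonic, while for independent noise it should hover around 0. In finite samples xi can be slightly negative, and handling ties needs the more general formula given in the paper.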
Want to get the Biostar Herald in your email? Who wouldn't? Sign up righ'ere: toggle subscription