The Biostar Herald publishes user-submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.
This edition of the Herald was brought to you by contributions from Istvan Albert and clealk, and was edited by Istvan Albert.
Riffomonas (riffomonas.org)
Scientists are generating unprecedented amounts of data and frequently struggle to analyze it in a reproducible manner. With Riffomonas, you can improve your data analysis skills using methods that foster reproducibility.
submitted by: Istvan Albert
This appears to be an amazing trove of data analysis tutorials and practical problems to tackle. Hats off to @PatSchloss for making this available https://t.co/n9AwDUaubS
— Keith Robison (@OmicsOmicsBlog) November 13, 2022
submitted by: Istvan Albert
Systematic tissue annotations of genomics samples by modeling unstructured metadata | Nature Communications (www.nature.com)
There are currently >1.3 million human –omics samples that are publicly available. This valuable resource remains acutely underused because discovering particular samples from this ever-growing data collection remains a significant challenge.
We propose a natural-language-processing-based machine learning approach (NLP-ML) to infer tissue and cell-type annotations for genomics samples based only on their free-text metadata.
submitted by: Istvan Albert
A Deep-learning based RNA-seq Germline Variant Caller | bioRxiv (www.biorxiv.org)
Here, we extend DeepVariant, a deep-learning based variant caller, to learn and account for the unique challenges presented by RNA-seq data. Our DeepVariant RNA-seq model produces highly accurate variant calls from RNA-sequencing data, and outperforms existing approaches such as Platypus and GATK. We examine factors that influence accuracy, how our model addresses RNA editing events, and how additional thresholding can be used to facilitate our models’ use in a production pipeline.
submitted by: Istvan Albert
GitHub - marbl/HG002: A complete diploid human genome (github.com)
In collaboration with the Human Pangenome Reference Consortium and the Genome in a Bottle Consortium we have sequenced and assembled the HG002 aka GM24385 aka huAA53E0 cell line. The ultimate goal of this effort is to create a reference assembly for the HG002 reference material that is perfectly accurate.
submitted by: Istvan Albert
The #T2T team is excited to release our first draft of a complete, diploid human genome for the @GenomeInABottle benchmark sample, HG002! Links to the data and assembly can be found here: https://t.co/tYb4oamaVS 🧵
— Adam Phillippy (@aphillippy) November 10, 2022
submitted by: Istvan Albert
GitHub - koesterlab/datavzrd: A tool to create visual HTML reports from collections of CSV/TSV tables (github.com)
A tool to create visual and interactive HTML reports from collections of CSV/TSV tables. Reports include automatically generated vega-lite histograms per column. Plots can be fully customized by users via a config file. These also allow the user to add linkouts to other websites or link between multiple tables. An example report can be viewed online with the corresponding config file.
submitted by: Istvan Albert
GitHub - agshumate/Liftoff: An accurate GFF3/GTF lift over pipeline (github.com)
Liftoff is a tool that accurately maps annotations in GFF or GTF between assemblies of the same, or closely-related species. Unlike current coordinate lift-over tools which require a pre-generated “chain” file as input, Liftoff is a standalone tool that takes two genome assemblies and a reference annotation as input and outputs an annotation of the target genome. Liftoff uses Minimap2
submitted by: Istvan Albert
Hi all, happy to share GW - our new high-performance browser for genomic sequencing data, written in C++ https://t.co/GRODv6So4d. GW also helps you explore and manually annotate 100s-1000s variants (vcf/bcf files) by viewing images as thumbnails pic.twitter.com/PYbN4kowRc
— kez cleal (@kezcleal) November 7, 2022
GW is a new high-performance genome browser and variant annotation tool
submitted by: clealk
Want to get the Biostar Herald in your email? Who wouldn't? Sign up righ'ere: toggle subscription
always enjoy the herald, thanks for keeping this up :) gw looks awesome