The Biostar Herald publishes user-submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.
This edition of the Herald was brought to you by contributions from audric.cologne and Istvan Albert, and was edited by Istvan Albert.
GitHub - HadrienG/InSilicoSeq: A sequencing simulator (github.com)
InSilicoSeq is a sequencing simulator producing realistic Illumina reads. Primarily intended for simulating metagenomic samples, it can also be used to produce sequencing data from a single genome.
InSilicoSeq is written in Python and uses kernel density estimators to model the read quality of real sequencing data.
InSilicoSeq supports substitution, insertion, and deletion errors. If you have no use for insertion and deletion errors, a basic error model is provided.
submitted by: Istvan Albert
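The kernel density idea mentioned in the InSilicoSeq blurb can be illustrated with a minimal, standard-library-only sketch. This is not InSilicoSeq's code or API: the function names, the bandwidth, and the Phred scores below are made up for illustration, and the real tool fits its models from aligned reads rather than from a hand-written list.

```python
import math
import random


def gaussian_kde_pdf(samples, bandwidth):
    """Return a function estimating the density of `samples` with Gaussian kernels."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def pdf(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return pdf


def sample_quality(samples, bandwidth, rng, low=0, high=41):
    """Draw one simulated Phred score from the KDE.

    Sampling from a Gaussian KDE amounts to picking an observed value at
    random and adding Gaussian noise with standard deviation `bandwidth`.
    The result is rounded and clamped to a plausible Phred range, which
    slightly distorts the density at the edges.
    """
    value = rng.choice(samples) + rng.gauss(0.0, bandwidth)
    return min(high, max(low, round(value)))


# Hypothetical Phred scores observed at one read position (made-up data).
observed = [38, 37, 38, 36, 39, 30, 38, 37, 12, 38]

rng = random.Random(0)  # fixed seed so the run is reproducible
pdf = gaussian_kde_pdf(observed, bandwidth=1.5)
simulated = [sample_quality(observed, 1.5, rng) for _ in range(1000)]
```

Here `pdf` gives the smoothed density of the observed scores, and `simulated` holds scores drawn from it, so common values such as 38 dominate while rare low-quality values still appear occasionally.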
GitHub - RasmussenLab/vamb: Variational autoencoder for metagenomic binning (github.com)
Vamb is a metagenomic binner which feeds sequence composition information from a contig catalogue and co-abundance information from BAM files into a variational autoencoder and clusters the latent representation. It performs excellently with multiple samples, and pretty well on single-sample data. Vamb is implemented purely in Python (with a little bit of Cython) and can be used both from the command line and from within a Python interpreter.
submitted by: Istvan Albert
When you're starting with #bioinformatics, you will spend 3 hours writing a script for a task that could have been completed in 30 minutes with an appropriate existing tool. Later, you will spend 3 hours searching for a tool to avoid 30 minutes writing an appropriate script.
— Monika Cechova Mich (@biomonika) March 24, 2022
submitted by: Istvan Albert
Go Get Data (ggd) documentation (gogetdata.github.io)
Go Get Data (ggd) is a data management system that provides access to data packages containing auto-curated genomic data. ggd data packages contain all necessary information for data extraction, handling, and processing.
submitted by: Istvan Albert
Official KisSplice Docker. Analysis of alternative splicing events from RNA-seq data. (hub.docker.com)
Creation of a Docker image for KisSplice, KisSplice2RefGenome and kissDE, using the most up-to-date version of each tool (22/03/2022). This should greatly simplify the use of KisSplice-related software. KisSplice is an annotation-free local assembler dedicated to the detection and quantification of SNPs or alternative splicing events from RNA-seq data.
submitted by: audric.cologne
GitHub - fritzsedlazeck/Sniffles: Structural variation caller using third generation sequencing (github.com)
A fast structural variant caller for long-read sequencing, Sniffles2 accurately detects SVs at the germline, somatic, and population level for PacBio and Oxford Nanopore read data.
submitted by: Istvan Albert
GitHub - ReeceGoding/Frustration-One-Year-With-R: An extremely long review of R. (github.com)
What follows is an account of my experiences from about one year of roughly daily R usage. It started out as a list of things that I liked and disliked about the language, but eventually grew to be something huge. [..] This isn’t an attack on R or a pitch for anything else. It is only an account of what I’ve found to be right and wrong with the language. Although the length of my list of what is wrong far exceeds that of what is right, that may be my failing rather than R’s. I suspect that my list of what R does right will grow as I learn other languages and begin to miss some of R’s benefits.
submitted by: Istvan Albert
Want to get the Biostar Herald in your email? Who wouldn't? Sign up righ'ere: toggle subscription