Blog posts collected by the Biostar aggregator. To follow, subscribe to the planet feed.
<prev • 2,430 results • page 1 of 98 • next >

You Can Be Impatient Running MinIONs, But Not Feeding Them

written 3 days ago by Omics! Omics! by Keith Robinson

Yes, it's been way too long since I wrote here. Even longer since I did so with any regularity. There was always some list of things draining my time and energy. But I resolved this week to get back on the horse -- and that was even before today's bit of dilithium news. In particular, in one twenty-four hour span three different people remarked on the prolonged hiatus -- a professional contact, a commenter on the blog and finally some very cutting remarks from Draco (aka TNG). And what better way to get going again but to kvetch about Oxford Nanopore's supply chain model?

New Ensembl motif features

written 5 days ago by Ensembl Blog

In its latest release, Ensembl has completely reviewed its reporting of potential Transcription Factor (TF) binding sites. TF proteins are key players of gene expression regulation that bind to specific DNA regions characterised by approximate sequence patterns, or transcription factor binding motifs (TFBM). These motifs are generally represented as a Position Specific Frequency Matrix, or […]
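As a rough illustration of the matrix representation mentioned in this excerpt, the sketch below counts how often each base occurs at each position across a few aligned binding sites. The sequences and the function name `position_frequency_matrix` are invented for the example; this is not Ensembl's pipeline or data.

```python
def position_frequency_matrix(sites):
    """Count how often each base occurs at each position of equal-length sites."""
    if not sites:
        raise ValueError("need at least one site")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = {base: [0] * length for base in "ACGT"}
    for site in sites:
        for pos, base in enumerate(site.upper()):
            if base in matrix:  # ignore ambiguity codes such as N
                matrix[base][pos] += 1
    return matrix


# Hypothetical aligned binding sites, made up for this example.
sites = ["TATAAA", "TATATA", "TATAAG", "CATAAA"]
pfm = position_frequency_matrix(sites)
for base in "ACGT":
    print(base, pfm[base])
```

Each column of the result sums to the number of sites, so dividing by that number would turn the counts into per-position frequencies.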

Where to Find Me At ASHG 2018

written 5 days ago by KidsGenomics

This week, I’ll be traveling to the American Society of Human Genetics Meeting in San Diego, CA. This is a massive meeting, and while I love it, I sometimes find it hard to meet friends, colleagues, and blog readers. So here’s where to find me in San Diego this week. As usual, you can catch […]

A Unix one-liner to call bacterial variants

written 6 days ago by The Genome Factory

Introduction
Variant finding is the generic term for finding differences between two genome sequences. These differences can take many forms, such as SNPs and small INDELs, large changes in DNA content caused by mobile elements, and structural changes like chromosomal inversions. The genomes we want to compare could either be assemblies (complete or draft) or just sequencing reads (FASTQ files). The bulk of microbial variant finding tools focus on small differences (< 20 bp), and work by comparing a FASTQ sample to an assembled genome, typically called the "reference". A common use case is to sequence your isolate of interest, and see how it differs from the type strain in GenBank.
Calling SNPs with a one-liner
Let us assume we have a reference genome in FASTA format in the REF variable, the paired Illumina FASTQ files in R1 and R2, and the number of CPU cores you want to use in CPUS. Then, the following "one-liner" will generate a VCF file with no intermediate files.
CPUS=4
REF=ref.fa
R1=R1.fastq.gz
R2=R2.fastq.gz
minimap2 -a -x sr -t "$CPUS" "$REF" "$R1" "$R2" \
 | samtools sort -l 0 --threads "$CPUS" \
 | bcftools mpileup -Ou -B --min-MQ 60 -f "$REF" - \
 | bcftools call -Ou -v -m - \
 | bcftools norm -Ou -f "$REF" -d all - \
 | bcftools filter -Ov -e 'QUAL<40 || DP<10 || GT!="1/1"' > variants.vcf
The reads are aligned to the reference, and sorted by coordinate. Instead of saving the BAM file, we pipe it directly to a series of BCF ...

Our new joint transcript initiative : The Matched Annotation from the NCBI and EBI (MANE) project

written 8 days ago by Ensembl Blog

This blog post is a joint contribution by Joannella Morales, Jane Loveland, Adam Frankish, Fiona Cunningham and Astrid Gall. We are pleased to introduce the Matched Annotation from the NCBI and EMBL-EBI (MANE) project. This new joint initiative between EMBL-EBI’s Ensembl project and NCBI’s RefSeq project aims to release a genome-wide transcript set that contains one well-supported […]

Ensembl Genomes 41 is out!

written 12 days ago by Ensembl Blog

We’ve just released Ensembl Genomes 41, which includes genomes for Emmer wheat and over 200 new fungi, updated gene trees and host-pathogen interactions from PHI-base. New assemblies and gene annotation Crops If you’re feeling hungry, you’ll be pleased to see two new crops in Ensembl Plants: Emmer wheat (Triticum dicoccoides) and Mung bean (Vigna radiata). […]

Ensembl & Friends at ASHG

written 12 days ago by Ensembl Blog

Excited for ASHG? So are we. You can find several representatives of Ensembl at the conference, as well as some of our close collaborators at the European Bioinformatics Institute (EBI), including GENCODE, the GWAS Catalog, HGNC, the IGSR and the LRG. Read on to find out more about where and when you can see our workshops, […]

Rate changes increase substitutions

written 15 days ago by Bits of DNA by Lior Pachter

Continuous-time Markov chain models for DNA mutations on a phylogenetic tree (e.g. the Jukes-Cantor model, the Kimura models, and more generally models of the Felsenstein hierarchy) have the simple and convenient property of multiplicativity. Specifically, if Q is a rate matrix then the associated substitution matrices are multiplicative in the following sense: $e^{Q(t+s)} = e^{Qt} e^{Qs}$. This follows […]
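For a single constant rate matrix, multiplicativity means P(t+s) = P(t)P(s), where P(t) = exp(Qt). The sketch below checks this numerically for the Jukes-Cantor model using its standard closed-form substitution matrix. The rate `alpha`, the times `t` and `s`, and the helper names are arbitrary choices for illustration and are not taken from the post.

```python
import math


def jukes_cantor(t, alpha=1.0):
    """Closed-form P(t) = exp(Qt) for the Jukes-Cantor rate matrix Q,
    whose off-diagonal entries are alpha and diagonal entries are -3*alpha."""
    decay = math.exp(-4.0 * alpha * t)
    same = 0.25 + 0.75 * decay  # probability of ending on the starting base
    diff = 0.25 - 0.25 * decay  # probability of ending on one specific other base
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def matmul(a, b):
    """Plain 4x4 matrix product, to keep the example dependency-free."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


t, s = 0.3, 0.5
lhs = jukes_cantor(t + s)
rhs = matmul(jukes_cantor(t), jukes_cantor(s))
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(4) for j in range(4))
print(f"largest entrywise difference: {max_gap:.2e}")
```

The difference is at the level of floating-point rounding, as expected when the same Q applies over both intervals.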

Ensembl 94 is out!

written 17 days ago by Ensembl Blog

The latest version of Ensembl, release 94, is out and have we got some treats for you. As well as GENCODE updates for human and mouse, we’ve also got loads of new fish. Plus, we have brand new transcription factor binding motifs, additional predictors of variant pathogenicity and updated gene tree pipelines. New assemblies and […]

On open peer review

written 19 days ago by The Grand Locus

Among the things that make science unique is the fact that scientists agree on what they say. There can be disagreement, but it is always understood as a temporary state, because either someone will be proven wrong, or new information will eventually reconcile everyone. Agreement is enforced in many ways, but pre-publication peer review is currently the dominant process, and it has been for over a century. It is surprising that so little information is available about the efficiency of the peer review process. For instance, there is barely any justification as to why it is by default anonymous. Even more surprising is that people who express their opinion in this regard do not back it up with empirical evidence, because there is essentially no data. Let me clarify something: I do not have any data to show. But I have been signing my reviews for over seven years and I am happy to share this experience with those who wonder what happens when you do this.
How did it start?
I was first contacted by editors to review manuscripts at the time Stack Overflow eclipsed nearly all the forums on the Internet. The forums were supposed...

Koala Genome assembled on AWS

written 21 days ago by Kevin's GATTACA World

Excerpted from AWS blog
Five years ago, a research team led by Dr. Rebecca Johnson (Director of the Australian Museum Research Institute) set out to learn more about koala populations, genetics, and diseases. As a biologically unique animal with a limited appetite, maintaining a healthy and genetically diverse population are both key elements of any conservation plan. In addition to characterizing the genetic diversity of koala populations, the team wanted to strengthen Australia’s ability to lead large-scale genome sequencing projects.
Inside the Koala Genome
Last month the team published their results in Nature Genetics. Their paper (Adaptation and Conservation Insights from the Koala Genome) identifies the genomic basis for the koala’s unique biology. This work was performed on AWS. The research team used cfnCluster to create multiple clusters, each with 500 to 1000 vCPUs, and running Falcon from Pacific Biosciences. All in all, the team used 3 million EC2 core hours, most of which were EC2 Spot Instances.

Cool things the VEP can do: variant prioritisation with G2P

written 22 days ago by Ensembl Blog

A common use case for the VEP is as a first step towards identifying the causal genetic variant of a rare phenotype from whole genome/exome sequencing. The VEP tells you which genes are hit, what effects they have on them, and you have to begin the long laborious process of filtering those down. Things you […]

Expanding the Phenotype of Spinal Muscular Atrophy

written 4 weeks ago by KidsGenomics

One of the major challenges of rare disease genomics is the availability of patient samples. The disorders being studied are, by definition, rare in the human population, and the variants that cause them (usually) rarer still. When it comes to identifying and establishing truly novel disease genes — genes that have not yet been associated […]

Mathematics matters

written 4 weeks ago by Bits of DNA by Lior Pachter

Six years ago I received an email from a colleague in the mathematics department at UC Berkeley asking me whether he should participate in a study that involved “collecting DNA from the brightest minds in the fields of theoretical physics and mathematics.” I later learned that the codename for the study was “Project Einstein“, an […]

New variant pathogenicity predictors

written 5 weeks ago by Ensembl Blog

Rating variants for their potential deleteriousness is vital for solving the link between genotypes and phenotypes. There are many different algorithms for predicting how likely it is that a human variant would affect the function of a protein, and in release 94 of Ensembl, we’ll be making more of these available. Currently, we have SIFT […]

4th Annual Training Course on Viral Bioinformatics and Genomics (20-24th Aug, 2018)

written 5 weeks ago by Bioinformatics I/O

From 20-24th Aug, we held our 4th Viral Bioinformatics and Genomics training course at the Garscube campus of the University of Glasgow. This year, our course had 16 participants attending from academic and health institutions from across the UK, Europe, Canada, Australia, Qatar, and South Africa. The MRC-University of Glasgow Centre for Virus […]

BioBloom tools: fast, accurate and memory-efficient host species sequence screening using bloom filters

written 5 weeks ago by Kevin's GATTACA World

https://academic.oup.com/bioinformatics/article/30/23/3402/207237
Justin Chu, Sara Sadeghi, Anthony Raymond, Shaun D. Jackman, Ka Ming Nip, Richard Mar, Hamid Mohamadi, Yaron S. Butterfield, A. Gordon Robertson, Inanç Birol
Bioinformatics, Volume 30, Issue 23, 1 December 2014, Pages 3402–3404, https://doi.org/10.1093/bioinformatics/btu558
Published: 20 August 2014
Abstract: Large datasets can be screened for sequences from a specific organism, quickly and with low memory requirements, by a data structure that supports time- and memory-efficient set membership queries. Bloom filters offer such queries but require that false positives be controlled. We present BioBloom Tools, a Bloom filter-based sequence-screening tool that is faster than BWA, Bowtie 2 (popular alignment algorithms) and FACS (a membership query algorithm). It delivers accuracies comparable with these tools, controls false positives and has low memory requirements.
Availability and implementation: www.bcgsc.ca/platform/bioinfo/software/biobloomtools
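To make the idea of screening reads with a set-membership structure concrete, here is a minimal Bloom filter sketch over k-mers. It is a toy under stated assumptions, not BioBloom Tools' implementation: the class, the salted SHA-256 hashing scheme, the filter size, the k-mer length and the reference string are all invented for illustration.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a bit array probed at num_hashes positions per item."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions by salting one hash with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May report false positives, never false negatives.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def kmers(seq, k):
    """Yield every overlapping substring of length k."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def fraction_matching(read, bloom, k):
    """Fraction of a read's k-mers that the filter reports as present."""
    hits = [kmer in bloom for kmer in kmers(read, k)]
    return sum(hits) / len(hits) if hits else 0.0


K = 8  # hypothetical k-mer length
reference = "ACGTTGCATGTCGCATGATGCATGAGAGCTAGCTAGGATC"  # made-up "host" sequence
bloom = BloomFilter(num_bits=4096, num_hashes=3)
for kmer in kmers(reference, K):
    bloom.add(kmer)

read_from_reference = reference[5:25]
print(fraction_matching(read_from_reference, bloom, K))  # 1.0
```

A read drawn from the indexed sequence always scores 1.0 because Bloom filters have no false negatives. An unrelated read would usually score near zero, but can pick up occasional false-positive k-mers, which is why a screening tool needs a score threshold and a controlled false-positive rate.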

Ensembl at AfSHG conference, Rwanda

written 5 weeks ago by Ensembl Blog

We’re excited to be trying a new conference this year: the African Society of Human Genetics (AfSHG) conference in collaboration with H3Africa, in Kigali, Rwanda, 19th-21st September. The conference is a fantastic opportunity for African scientists to showcase their work, build collaborations and learn more about their field of research. For us, it’s great to see […]

Cool stuff the VEP can do: installation

written 6 weeks ago by Ensembl Blog

If you don’t want to analyse your variants on external servers or have more than 1000 or so to annotate, you probably want to use the VEP script. Setting it up might not always be straightforward as there are dependencies you need, but the installation script takes away a lot of the trouble. Running the […]

Whole-genome analysis for early infant epilepsy

written 6 weeks ago by KidsGenomics

Early infantile epileptic encephalopathy (EIEE) is a devastating syndrome of intractable seizures that strike in the first months of life. According to Orphanet, it affects 1 in 50,000-100,000 births. Infants with EIEE may suffer hundreds of tonic spasms per day, both during sleep and wakefulness. The prognosis is not good. Most patients die within two […]

Abstract for SIAM: Supporting and Sustaining Open Source Software Development: the Commons Perspective

written 7 weeks ago by Living in an Ivory Basement by Titus Brown

How do we support and sustain open source software development?

A tutorial on t-SNE (1)

written 8 weeks ago by The Grand Locus

In this tutorial, I would like to explain the basic ideas behind t-distributed Stochastic Neighbor Embedding, better known as t-SNE. There is a great deal of excellent material out there explaining how t-SNE works. Here, I would like to focus on why it works and what makes t-SNE special among data visualization techniques. If you are not comfortable with formulas, you should still be able to understand this post, which is intended to be a gentle introduction to t-SNE. The next post will peek under the hood and delve into the mathematics and the technical detail.
Dimensionality reduction
One thing we all agree on is that we each have a unique personality. And yet it seems that five character traits are sufficient to sketch the psychological portrait of almost everyone. Surely, such portraits are incomplete, but they capture the most important features to describe someone. The so-called five factor model is a prime example of dimensionality reduction. It represents diverse and complex data with a handful of numbers. The reduced personality model can be used to compare different individuals, give a quick description of someone, find compatible personalities, predict possible behaviors etc. In many...

Ensembl insights: How are UTRs annotated?

written 9 weeks ago by Ensembl Blog

It’s probably reasonable to assume that the coding sequence (CDS) of a protein-coding transcript model is the feature that is of primary interest to most people who use Ensembl. However, both the 5’ and 3’ untranslated regions (UTRs) are important biological entities in their own right, and it is vital that we in Ensembl do […]

Can bits be the basis for a digital commons? (No.)

written 9 weeks ago by Living in an Ivory Basement by Titus Brown

Bits cannot be the basis for a digital commons, because they are not rivalrous.

Planet Feeds

Omics! Omics! by Keith Robinson
A computational biologist's personal views on new technologies & publications on genomics & proteomics and their impact on drug discovery
244 posts, last updated 3 days ago
Ensembl Blog
News about the Ensembl Project and its genome browser
193 posts, last updated 5 days ago
KidsGenomics
KidsGenomics focuses on genetic diseases that affect children, including rare inherited disorders and pediatric cancers.
9 posts, last updated 5 days ago
The Genome Factory
Bioinformatics tips, tricks, tools and commentary with a microbial genomics bent. Written by Torsten Seemann from Melbourne, Australia.
26 posts, last updated 6 days ago
The Genome Factory
Bioinformatics tips, tricks, tools and commentary - all with a microbiological NGS bent. Authored by Dr Torsten Seemann from Melbourne, Australia.
30 posts, last updated 6 days ago
Bits of DNA by Lior Pachter
Reviews and commentary on computational biology
89 posts, last updated 15 days ago
The Grand Locus
My name is Guillaume Filion. I am a scientist who loves biology and mathematics. As of late I also got into computers and the Internet. I intend my blog to be recreational, and not academic nor educational. I wish you will find some of the posts inspiring for your own reflection.
30 posts, last updated 19 days ago
Kevin's GATTACA World
Weblog on Bioinformatics, Genome Science and Next Generation Sequencing
71 posts, last updated 21 days ago
Bioinformatics I/O
Tips && tricks from a cluster of bioinformaticians
18 posts, last updated 5 weeks ago
Living in an Ivory Basement by Titus Brown
bioinformatics education, metagenomics assembly, python programming
239 posts, last updated 7 weeks ago
Inside UniProt
News and commentary from the UniProt developers
36 posts, last updated 9 weeks ago
What You're Doing Is Rather Desperate by Neil Saunders
Notes from the life of a computational biologist
104 posts, last updated 3 months ago
Opiniomics by Mick Watson
bioinformatics, genomes, biology etc. "I don't mean to sound angry and cynical, but I am, so that's how it comes across"
46 posts, last updated 3 months ago
miRBase blog
miRBase news and views
15 posts, last updated 4 months ago
Diving into Genetics and Genomics
A wet lab biologist's bioinformatics notes. Mostly about Linux, R, Python, reproducible research, open science and NGS. I am into data science! I am working on glioblastoma (a terrible brain cancer) genomics at MD Anderson Cancer Center. Disclaimer: For posts that I copied from other places, credits go to the original authors.
51 posts, last updated 5 months ago
Next Gen Seek
Making Sense of Next-Gen Sequencing Data
88 posts, last updated 5 months ago
MassGenomics by Dan Koboldt
Medical genomics in the post-genome era
71 posts, last updated 7 months ago
Bits of Bioinformatics by Páll Melsted
Assistant professor of computer science at University of Iceland.
9 posts, last updated 8 months ago
In between lines of code by Lex Nederbragt
Biology, sequencing, bioinformatics and more
32 posts, last updated 10 months ago
The OpenHelix Blog
A news portal with postings about genomics resources, genomics news and research, science and more.
391 posts, last updated 13 months ago
BioinfoBlog.it
This blog is written by Giovanni M. Dall’Olio, a research associate in the Cancer Evolutionary Genomics group of Francesca Ciccarelli at King’s College London. My primary interests are in the systems biology of cancer and in identifying new potential drug targets for this disease.
13 posts, last updated 15 months ago
thoughts about ...
My worklog on bioinformatics, science and research. Small tasks and cute tricks included :)
34 posts, last updated 17 months ago
Getting Genetics Done by Stephen Turner
Getting Things Done in Genetics & Bioinformatics Research
60 posts, last updated 20 months ago
YOKOFAKUN by Pierre Lindenbaum
virology, bioinformatics, genetics, science, java
65 posts, last updated 21 months ago
Bioinformatician at large by Ewan Birney
Thoughts and opinions from the associate director of the EMBL-European Bioinformatics Institute
57 posts, last updated 22 months ago
Homolog.us - Bioinformatics by Manoj Samanta
Frontier in Bioinformatics
185 posts, last updated 2.5 years ago
Blue Collar Bioinformatics by Brad Chapman
bioinformatics, biopython, genomic analysis
38 posts, last updated 2.5 years ago
opiniomics by Mick Watson
bioinformatics, genomes, biology etc. "I don't mean to sound angry and cynical, but I am, so that's how it comes across"
78 posts, last updated 2.7 years ago
Bergman Lab
21 posts, last updated 3.9 years ago
Genomes Unzipped
A group blog providing expert, independent commentary on the personal genomics industry.
34 posts, last updated 4.0 years ago
Bio and Geo Informatics by Brent Pedersen
Genomics Programming
25 posts, last updated 4.9 years ago
Jermdemo Raised to the Law by Jeremy Leipzig
Mostly bioinformatics, NGS, and cat litter box reviews
25 posts, last updated 5.1 years ago