Changes to FTP directory layout in Ensembl Genomes 43 / Ensembl 96

written 21 hours ago by Ensembl Blog

We will make changes to the directory layouts of both the Ensembl Genomes FTP server and the Ensembl GRCh37 FTP server that may affect your pipelines. These changes will come into effect in Ensembl Genomes release 43/Ensembl release 96, which are scheduled for April 2019. Here are the details, so that you can plan […]

Genomics gets global: Ensembl and the VGP

written 3 days ago by Ensembl Blog

As the community’s capacity for genome sequencing expands, so do its ambitions. Recently, many exciting global genomics projects have been launched, including the Vertebrate Genomes Project (VGP), Darwin Tree of Life (DToL), Earth Biogenome Project (EBP), i5K (insects) and 10KP (plants). Between them, they aim to sequence the genomes of every eukaryote on Earth, and Ensembl are […]

Ensembl 2020: pre-alpha release

written 5 days ago by Ensembl Blog

If you’ve been at VizBi 2019, you’ll have seen Andy Yates previewing our new website design. If not, here’s your chance to take a look. Our pre-alpha release is out and we want your feedback. This is a very, very early-stage version of the site, with limited functionality, just a browser of […]

Coming soon! MANE Select v0.5

written 6 days ago by Ensembl Blog

Joannella Morales, Jane Loveland and Adam Frankish contributed to this post. Back in October, we introduced you to our new joint initiative with the NCBI — the Matched Annotation from the NCBI and EMBL-EBI (MANE) transcript set. We are now pleased to update you on our progress so far. The goal of this project is […]

This is not normal(ised)

written 7 days ago by What You're Doing Is Rather Desperate by Neil Saunders

“Sydney stations where commuters fall through gaps, get stuck in lifts” blares the headline. The story tells us that: “Central Station, the city’s busiest, topped the list last year with about 54 people falling through gaps”. Wow! Wait a minute… “Central Station, the city’s busiest”. Some poking around in the NSW Transport Open Data portal … Continue reading This is not normal(ised)

Joint REST server for Ensembl and Ensembl Genomes in Ensembl 96

written 10 days ago by Ensembl Blog

As of Ensembl release 96/Ensembl Genomes release 43, we will retire one REST server address and invite you to use the joint server instead. Any queries to the retired address will return a 404 error, inviting you to replace the URL. While we appreciate that this change will require you to update existing scripts and pipelines, we believe this is […]

Removal of database patches script from Ensembl repository in Ensembl 96

written 11 days ago by Ensembl Blog

In the next release of Ensembl (Ensembl 96) we will remove our database patches script from the main Ensembl repository. There is now a dedicated module that uses the EBI OLS service to load the ontologies Ensembl requires. Because this module is now in charge of loading the required data, the previous database patches have been moved to the […]

Using parameters in Rmarkdown

written 14 days ago by What You're Doing Is Rather Desperate by Neil Saunders

Nothing new or original here, just something that I learned about quite recently that may be useful for others. One of my more “popular” code repositories, judging by Twitter, is – well, Twitter. It mostly contains Rmarkdown reports which summarise meetings and conferences by analysing usage of their associated Twitter hashtags. The reports follow a … Continue reading Using parameters in Rmarkdown

Getting to know us: Guy from Ensembl Plants

written 17 days ago by Ensembl Blog

Today we are meeting Guy, who works in the Plants team of Ensembl Genomes. He talks about how he came to Ensembl, his interests and experiences so far. What is your job in Ensembl? I work in the Ensembl Plants team, which is like regular Ensembl only for (you guessed it) plants. A lot of […]

Sustaining open source: thinking about communities of effort

written 17 days ago by Living in an Ivory Basement by Titus Brown

Thinking about how to sustain open source.

Modern Genomics Strategies for Rare Diseases

written 18 days ago by KidsGenomics

February 28th is International Rare Disease Day, and it reminded me that I’ve been rather sporadic in updating this blog. That’s mostly because I’ve been working on rare disease analyses, publications, and grants. However, it’s motivated me to share a bit about my strategy for studying rare diseases with genomics and how my thinking has […] The post Modern Genomics Strategies for Rare Diseases appeared first on KidsGenomics.

My recent reading re sustaining open communities

written 18 days ago by Living in an Ivory Basement by Titus Brown

What has Titus been reading lately?

Custom data upload: creating URLs for large files

written 19 days ago by Ensembl Blog

Did you know you can upload your own data for display alongside the reference genomes in Ensembl? For some file types, and for files larger than 20 MB, you will need to create a URL to attach the data, rather than uploading from your local directory. It’s not difficult to create these URLs, but there […]

Job: Ensembl Infrastructure Project Leader

written 19 days ago by Ensembl Blog

We’re looking for a software development manager to lead our infrastructure team, maintaining our database, API infrastructure and internal genome analysis tools. We’re looking for MScs, PhDs or equivalent in Computational, Physical or Biological Sciences with experience developing APIs, communicating technical information, software development and working with large datasets. Closes 10th April. Location: EMBL-EBI, Hinxton near […]

Cool stuff the Ensembl VEP can do: take a REST

written 24 days ago by Ensembl Blog

The VEP can work as an offline or a web tool, and it’s also available as a REST service. Perfect for integrating into pipelines or displaying data on the web, the REST API VEP endpoints can take input as HGVS, genomic loci or variant identifiers and can interpret common forms of non-standard HGVS. They are all […]
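As a rough illustration of the three input styles mentioned above (HGVS, genomic loci, variant identifiers), here is a minimal Python sketch that only builds request URLs and does not contact any server. The base URL and the `/vep/:species/hgvs`, `/vep/:species/id` and `/vep/:species/region` path layout are assumptions taken from the public Ensembl REST documentation rather than from this excerpt, and the helper name `vep_url` is hypothetical.

```python
from urllib.parse import quote

# Assumption: base URL of the public Ensembl REST service.
REST_SERVER = "https://rest.ensembl.org"


def vep_url(kind, value, species="human", allele=None):
    """Build a VEP GET URL for an HGVS string, a variant ID or a genomic region.

    Hypothetical helper for illustration; it does not send the request.
    """
    if kind == "hgvs":
        # e.g. ENST00000366667:c.803C>T (the '>' is percent-encoded)
        path = f"/vep/{species}/hgvs/{quote(value, safe=':')}"
    elif kind == "id":
        # e.g. a dbSNP identifier such as rs56116432
        path = f"/vep/{species}/id/{quote(value, safe='')}"
    elif kind == "region":
        # e.g. 9:22125503-22125502:1 plus the variant allele
        if allele is None:
            raise ValueError("a region query also needs the variant allele")
        path = f"/vep/{species}/region/{quote(value, safe=':')}/{quote(allele, safe='')}"
    else:
        raise ValueError(f"unknown input kind: {kind!r}")
    # Ask for JSON via the content-type query parameter.
    return f"{REST_SERVER}{path}?content-type=application/json"


if __name__ == "__main__":
    print(vep_url("hgvs", "ENST00000366667:c.803C>T"))
    print(vep_url("id", "rs56116432"))
    print(vep_url("region", "9:22125503-22125502:1", allele="C"))
```

The resulting strings could be passed to any HTTP client; keeping URL construction separate from the network call makes the pipeline step easy to test offline.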

Beyond Generations: My Vocabulary for Sequencing Tech

written 25 days ago by Omics! Omics! by Keith Robinson

Many writers have attempted to divide Next Generation Sequencing into Second Generation Sequencing and Third Generation Sequencing. Personally, I think it isn't helpful and just confuses matters. I'm not the biggest fan of Next Generation Sequencing (NGS) to start with, as like "post-modern architecture" (or heck, "modern architecture") it isn't future-proofed. Not that I wouldn't take a job with NGS in the title, but still not a favorite. High Throughput Sequencing feels a little better, but again doesn't leave room for distinguishing growth -- and HTS as an abbreviation is already going to confuse anyone in Biopharma who thinks about High Throughput Screening. Massively Parallel Sequencing sort of works, but my late father had a real pedantic objection to using "massive" for anything that lacked mass, and while I don't subscribe to that view such uses just don't sit well with me. Worse, as I'll explain, trying to divide sequencer technologies into Second and Third generations creates more heat and smoke than light. On a number of Twitter threads I've tried to launch my own terminology, but probably haven't been terribly consistent. So here is an attempt at that. Read more »

Job: Applications software developer

written 25 days ago by Ensembl Blog

We’re looking for a software developer to join our Applications team, working on web applications and our REST APIs. We’re looking for MScs in Computer Science with experience working in development using Git, Python, RDBMSs, Agile development and REST APIs. Closes 2nd April. Location: EMBL-EBI, Hinxton near Cambridge, UK Staff Category: Staff Member Contract Duration: 3 […]

Threat models for open online scientific engagement?

written 25 days ago by Living in an Ivory Basement by Titus Brown

What threats are there for scientists in engaging in open online discussions?

Introduction to single-cell RNA-seq technologies

written 27 days ago by Bits of DNA by Lior Pachter

Bi/BE/CS183 is a computational biology class at Caltech with a mix of undergraduate and graduate students. Matt Thomson and I are co-teaching the class this quarter with help from teaching assistants Eduardo Beltrame, Dongyi (Lambda) Lu and Jialong Jiang. The class has a focus on the computational biology of single-cell RNA-seq analysis, and as such […]

Some thoughts on my recent Twitter break

written 27 days ago by What You're Doing Is Rather Desperate by Neil Saunders

Various people have suggested that taking a break from social networks – Twitter in particular – can be A Good Thing™. So I tried it, for a couple of weeks. Here’s what I learned. 1. Why a break? The reasons that everyone else cites, I guess. A sense that my stream has swung away from … Continue reading Some thoughts on my recent Twitter break

Data science infrastructure for agricultural sustainability and food security

written 5 weeks ago by Blue Collar Bioinformatics by Brad Chapman

I’m excited to be joining Ginkgo Bioworks. Ginkgo and the synthetic biology community have an incredible amount of useful data in intricate experimental designs, measured screening outcomes and pre-existing biological knowledge. I’ll help organize, present, compute on and do science with this data. I hope to enable downstream applications that improve agricultural sustainability and food security. I’m motivated to help with building an increasingly fair and climate-friendly agricultural system due to the dire warnings about the state of our planet. There are many different ways to contribute to mitigating climate change, from personal action to politics to your daily work; see Drawdown, Sliced and Mattermost for some comprehensive lists. Ginkgo and their downstream applications provided a unique opportunity to make use of my programming, genomics and synthetic biology background to work towards healthy and sustainable food production. The incredibly difficult part of this change is moving away from the amazing community of researchers I’ve had the privilege to work with. Our incredible team at Harvard Chan School continues to teach me new things every day and makes supporting science a joy. Rewarding collaborations with AstraZeneca, the University of Melbourne Centre for Cancer Research and Veritas Genetics provide a continuous source of scientific challenges alongside the inspiration of seeing smart teams answer difficult questions. Data science infrastructure in synthetic biology: At Ginkgo my initial focus will be on providing infrastructure and scientific support to enable application-specific data science teams. The data challenges at Ginkgo are similar to those faced ...

Ensembl insights: Annotating readthrough transcription in Ensembl

written 5 weeks ago by Ensembl Blog

Ever come across a transcript that seems to span multiple genes? These are called ‘readthrough transcripts’, or sometimes ‘conjoined genes’, and they’re more common than you might think. Read on to find out about what they are and what they do, and how we annotate these at Ensembl. This article was written by Jonathan Mudge […]

Nonsense methods tend to produce nonsense results

written 5 weeks ago by Bits of DNA by Lior Pachter

Five years ago on this day, Nicolas Bray and I wrote a blog post on The network nonsense of Manolis Kellis in which we described the paper Feizi et al. 2013 from the Kellis lab as dishonest and fraudulent. Specifically, we explained that: “Feizi et al. have written a paper that appears to be about inference of edges in […]

Failing to Fetch An Interesting Result on Dog Oncogene Homologs

written 5 weeks ago by Omics! Omics! by Keith Robinson

An idea for a little exploration occurred to me back at Infinity -- that is 7.5 years ago -- that I've never tried out. But I never got around to it. I had some downtime recently to play around so I finally executed the experiment -- alas, it turns out not to be very interesting. Still, a negative result is a negative result. Read more »

Sub-Poisson loading for single-cell RNA-seq

written 5 weeks ago by Bits of DNA by Lior Pachter

The encapsulation of beads together with cells in droplets is the basis of microfluidic based single-cell RNA-seq technologies. Ideally droplets contain exactly one bead and one cell, however in practice the number of beads and cells in droplets is stochastic and encapsulation of cells in droplets produces an approximately Poisson distribution of number of cells […]
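To make the Poisson statement above concrete, here is a small Python sketch of the droplet occupancy it implies. It assumes cell counts per droplet follow a Poisson distribution with mean `lam`, so that P(k cells) = exp(-lam) * lam^k / k!. The loading rate of 0.1 cells per droplet used in the demo and the function names are illustrative assumptions, not values or code from the post.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k cells in a droplet under Poisson loading with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def droplet_summary(lam):
    """Split droplets into empty, single-cell and multi-cell fractions."""
    p_empty = poisson_pmf(0, lam)
    p_single = poisson_pmf(1, lam)
    return {
        "empty": p_empty,
        "single": p_single,
        # Everything that is not empty or single holds two or more cells.
        "multiple": 1.0 - p_empty - p_single,
    }


if __name__ == "__main__":
    # Assumed illustrative loading rate: on average 0.1 cells per droplet.
    for label, prob in droplet_summary(0.1).items():
        print(f"{label:>8}: {prob:.4f}")
```

Under this model, lowering `lam` reduces the multi-cell fraction but raises the share of empty droplets, which is the trade-off that makes the ideal of exactly one cell per droplet hard to reach with Poisson loading.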
