Job: Software Developer

written 1 day ago by Ensembl Blog

We’re looking for a Software Developer to contribute to our variation resources. We’re looking for extensive programming experience and years’ experience working in a production development environment. Closes 19th August. Location: EMBL-EBI, Hinxton near Cambridge, UK Staff Category: Staff Member Contract Duration: 3 years (renewable) Grading: Grade 5 or 6 (starting at £2,738 – £3,063 […]

Improved workflows-as-applications: tips and tricks for building applications on top of snakemake

written 2 days ago by Living in an Ivory Basement by Titus Brown

Writing applications around workflow systems, take 2.

UniProt COVID-19 portal: Supporting research during the pandemic

written 3 days ago by Inside UniProt

Responding to the urgency of the pandemic, UniProt created and is continuing to develop a dedicated portal to provide access to the latest pre-release annotations and sequences for proteins related to COVID-19. It is released independently of UniProt’s 8-weekly release schedule. The portal is available online, and all sequences can also be downloaded directly via our FTP site.

Integrated source of sequence, function and links to specialist resources

The portal provides SARS-CoV-2 annotated protein sequences, closest SARS-CoV 2003 sequences and human sequences relevant to the biology of viral infection. The SARS-CoV-2 proteome is annotated based on expert curation of literature and the knowledge extracted from the well-studied SARS-CoV virus. Rule-based automatic annotation also allows us to add information from a broader taxonomic range of viruses. Links to structures, drugs, interactions, molecular pathways as well as many other resources provide integrated information to help understand the biology and investigate routes to treatment.

The annotated UniProtKB entries include functional and positional annotations. The microbial infection information and essential positions and structures for the virus infection are also documented in these records. Each protein entry provides annotations such as the catalytic activity and function, Gene Ontology terms, 3D structures, interactions, external links to resources like IntAct, ChEMBL, DrugBank, PDBe-KB, etc., and the ProtVista visualisation of positional annotations on the sequence space. Within entries, the mature products that result from proteolytic cleavage of precursor proteins can be identified with UniProt product identifiers.

Contribute and explore literature about COVID-19

The portal provides access to ...

$ How to profit from COVID-19 testing $

written 7 days ago by Bits of DNA by Lior Pachter

Rapid testing has been a powerful tool to control COVID-19 outbreaks around the world (see Iceland, Germany, …). While many countries support testing through government sponsored healthcare infrastructure, in the United States COVID-19 testing has largely been organized and provided by for-profit businesses. While financial incentives coupled with social commitment have motivated many scientists and […]

Job: Bioinformatics Developer

written 9 days ago by Ensembl Blog

We’re looking for a Bioinformatics Developer to help expand our comparative genomics resources. We’re looking for MSc/PhD or equivalent in computational biology or bioinformatics and practical experience working with genomic data. Closes 28th August. Location: EMBL-EBI, Hinxton near Cambridge, UK Staff Category: Staff Member Contract Duration: 3 years (renewable) Grading: Grade 5 or 6 (starting […]

Cool stuff the Ensembl VEP can do: summarise your analysis

written 14 days ago by Ensembl Blog

Ensembl VEP analyses your variant alleles in detail using a flexible choice of options, but it can also create simple summary tables and graphics describing your full variant set. On the VEP results web page, we display three sets of summary information. The table of counts tells you how many variants were annotated and if […]

Two Pandemic-Related Programming Problems

written 14 days ago by Omics! Omics! by Keith Robinson

I will offer here two bioinformatics programming problems which I think are interesting, useful and should be approachable by an advanced undergraduate. For a variety of reasons I've been thinking a lot about skill levels and how to assess them. One key reason is we have two open slots in our group, so I'm plowing through CVs and engaging in the usual hiring funnel struggle -- how do you winnow CVs to phone screens and then down to interviews? We also thought we might, but now won't, bring on a one-year intern. But I'm also trying to take a look at my own skill set with a critical eye. Plus I maintain a Quora addiction, and you see there people looking for ways to prove their computational biology chops.

Job: Genome Annotation Project Leader

written 24 days ago by Ensembl Blog

We’re looking for a Project Leader to manage our activities in annotating thousands of genomes across the eukaryotic tree of life. We’re looking for PhD or equivalent in Molecular Biology, Bioinformatics or related field with experience in gene annotation. Closes 12th August. Location: EMBL-EBI, Hinxton near Cambridge, UK Staff Category: Staff Member Contract Duration: 3 […]

Virtual London Calling, Veritably Late, Part II: Platform Development

written 5 weeks ago by Omics! Omics! by Keith Robinson

Last time, I covered Oxford Nanopore's LamPORE COVID-19 detection scheme. London Calling was over a week ago, so the chance to scribble before it's all old news is rapidly shrinking. As noted yesterday, Clive Brown didn't speak here but instead will broadcast at some future date; it was left to his top technical lieutenants to cover the developments in the platform which have happened since the Community Meeting in New York back in early December. I've tried to hit the highlights here, but don't claim to be comprehensive.

Virtual London Calling, Veritably Late Copy: Part I, LamPORE

written 6 weeks ago by Omics! Omics! by Keith Robinson

London Calling was last week, held online due to the pandemic. My plans to attend in person were one of a myriad of travel arrangements upended by the calamity, though that is utterly trivial in comparison to the tragedy of so many lost lives, damaged survivors and economic ruin. Attending remotely also made it harder to ignore my work duties, which are at a crescendo (well, not really: it's been this intense for months). But all the talks are available online, so I have stolen some time to review the Oxford Nanopore technology announcements. There wasn't a Clive Brown talk; apparently he will deliver a broadcast later this summer to tease us with more crazy ideas emerging from the ONT Skunk Works.

Ensembl Rapid Release

written 6 weeks ago by Ensembl Blog

We are excited to announce the launch of the Ensembl Rapid Release website. Ensembl Rapid Release is a new, lightweight genome browser designed to allow quick release of the latest genome annotation for a large number of vertebrate and non-vertebrate species. Advancements in new sequencing technologies mean that genome sequencing and assembly is faster and cheaper […]

What’s coming up in Ensembl 101 / Ensembl Genomes 48?

written 8 weeks ago by Ensembl Blog

Ensembl 101 (and Ensembl Genomes 48) are due out at the end of June 2020. As with all releases, we cannot guarantee that anything listed here will make it into the final release. Major Data Updates Update of human gene set to GENCODE 35 New population frequency data from the Gambian Genome Variation Project New […]

Pre-2017 archive websites unavailable next week

written 8 weeks ago by Ensembl Blog

Some of our archive websites will be unavailable on Mon 15th June 2020 for necessary maintenance. The work will commence at 0900 UTC and is expected to last for a few hours. The sites that will be affected are the December 2016 site (Ensembl 87) and earlier. All later archives and other Ensembl resources will […]

Black lives matter

written 8 weeks ago by Bits of DNA by Lior Pachter

Today, June 10th 2020, black academic scientists are holding a strike in solidarity with Black Lives Matter protests. I strike with them and for them. This is why: I began to understand the enormity of racism against blacks thirty five years ago when I was 12 years old. A single event, in which I witnessed […]

Ensembl under lockdown – Part 4

written 8 weeks ago by Ensembl Blog

Whilst the UK has started to lift some of the COVID-19 lockdown restrictions, Ensembl staff are still working from home. We are fortunate that we can continue to do this and that there are flexible working arrangements in place to support us. In this blog, Nishadi and Guy reflect on the last three months. They […]

Roche Expands Sequencing Nanopore Presence by Acquiring Stratos Genomics

written 9 weeks ago by Omics! Omics! by Keith Robinson

Ugh. I let the month of April slip away without writing and now have almost let May do the same. But some leftover euphoria from a huge experimental breakthrough on our current diagnostics project at the Gene Factory plus the feeling I shouldn't let news tied into an earlier post slip off, and here I am. When I wrote about sequencer startups back in February based on their websites, I put Stratos Genomics near the front of the pack. Roche Molecular apparently agrees, announcing a week ago that they are acquiring Stratos.

Ensembl under lockdown – Part 3

written 10 weeks ago by Ensembl Blog

The start of the lockdown due to the COVID-19 pandemic and the changes it brought has affected all of us in different ways. In this blog, we hear from Michal and Beth. Michal was on holiday in Brazil when travel restrictions, the UK lockdown and compulsory work from home for EMBL-EBI staff started. His work […]

Normalising variants to standardise Ensembl VEP output

written 10 weeks ago by Ensembl Blog

Variants can be represented in myriad different ways; indeed, Ensembl VEP currently supports input in many different formats, including VCF, HGVS and SPDI. However, even within these specifications, variants can be described ambiguously. Insertions and deletions within repeated regions can be described at multiple different locations. For example, VCF describes variants using their most 5’ […]
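The ambiguity described above can be illustrated with a small, self-contained sketch. It is not Ensembl VEP's code; it is a minimal left-alignment routine of the kind commonly used to normalise VCF-style indels, and the function name `left_align` and the toy reference sequence are assumptions made for illustration only.

```python
def left_align(ref_seq: str, pos: int, ref: str, alt: str) -> tuple[int, str, str]:
    """Shift a VCF-style variant to its most 5' (leftmost) representation.

    ref_seq is the reference sequence, pos is the 1-based position of the
    first base of the ref allele, and ref/alt are the two alleles.
    Illustrative sketch only: assumes a single alt allele that differs
    from ref, and does not handle variants that reach the sequence start.
    """
    if ref == alt:
        raise ValueError("ref and alt are identical; nothing to normalise")

    changed = True
    while changed:
        changed = False
        # Drop a shared trailing base from both alleles.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele is now empty, pad both with the preceding reference base.
        if not ref or not alt:
            if pos == 1:
                raise ValueError("cannot shift past the start of the sequence")
            pos -= 1
            base = ref_seq[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True

    # Trim shared leading bases, keeping at least one base in each allele.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# A two-base "CA" deletion inside a CACACA repeat, described at its 3' end...
reference = "GGCACACAT"
shifted = left_align(reference, 6, "ACA", "A")
# ...and the same deletion once shifted to the 5' end of the repeat.
print(shifted)
```

Both descriptions produce the same edited sequence (GGCACAT in this toy example), which is why two tools can report the same deletion at different coordinates unless the variant is normalised first.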

Ensembl under lockdown – Part 2

written 11 weeks ago by Ensembl Blog

It’s now ten weeks since we started working remotely because of the COVID-19 pandemic. The novelty of the situation is gone and we have become accustomed to the ‘new normal’. We have worked on ongoing projects as well as new ones: We released Ensembl 100 in April and a new COVID-19 resource earlier this […]

Ensembl launches COVID-19 resource

written 11 weeks ago by Ensembl Blog

Today, Ensembl has joined the international scientific effort to tackle the COVID-19 pandemic. COVID-19 is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which has spread rapidly since emerging in late 2019. Our SARS-CoV-2 genome browser and related resources are intended to support both basic research and ongoing work to develop […]

Ensembl under lockdown – Part 1

written 12 weeks ago by Ensembl Blog

Ensembl staff have been working from home due to the COVID-19 pandemic for eight weeks now. We transitioned to remote work from 16 March – two days before the closure of EMBL-EBI and a week before the Prime Minister announced the UK lockdown on 23 March. The transition to working from home has been relatively […]

sourmash databases as zip files, in sourmash v3.3.0

written 3 months ago by Living in an Ivory Basement by Titus Brown

The feature that I'm most excited about in sourmash 3.3.0 is the ability to directly use compressed SBT search databases. Previously, if you wanted to search (say) 100,000 genomes from GenBank, you'd have to download a several GB .tar.gz file, and then uncompress it out to ~20 GB before searching it. The time and disk space requirements for this were major barriers for teaching and use.

In v3.3.0, Luiz Irber fixed this by, first, releasing the niffler Rust library with Pierre Marijon, to read and write compressed files; second, replacing our old khmer Bloom filter nodegraph with a Rust implementation (sourmash PR #799); and, third, adding direct zip file storage (sourmash #648).

So, as of the latest release, you can do the following:

    # install sourmash v3.3.0
    conda create -y -n sourmash-demo \
        -c conda-forge -c bioconda sourmash=3.3.0

    # activate environment
    conda activate sourmash-demo

    # download the 25k GTDB release89 guide database (~1.4 GB)
    curl -L >

    # grab a genome signature - here, download a demo one from OSF
    curl -L > genome.sig

    # search!
    sourmash search genome.sig

This takes less than 2 GB of disk space total (including conda env), and the search runs in about 3 seconds and 120 MB of RAM. Using the zip file stuff alone is a slight speed drag (~10-20%?), but the shift to Rust leads to an overall speed increase of about 4x. And you can always unpack the zip file and use the unpacked files directly. Yay! ...
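The general idea of using a zip archive directly, rather than unpacking it to disk first, can be sketched with Python's standard `zipfile` module. This is not sourmash's implementation or its database layout; the member names (`index.json`, `sig1.json`, `sig2.json`) and the JSON contents are hypothetical and exist only to make the example runnable.

```python
import io
import json
import zipfile


def build_demo_archive() -> bytes:
    """Create a small in-memory zip with hypothetical 'signature' members."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("index.json", json.dumps({"signatures": ["sig1.json", "sig2.json"]}))
        zf.writestr("sig1.json", json.dumps({"name": "genome A", "hashes": [1, 5, 9]}))
        zf.writestr("sig2.json", json.dumps({"name": "genome B", "hashes": [2, 5, 7]}))
    return buf.getvalue()


def load_member(archive: bytes, member: str) -> dict:
    """Read and parse one member without extracting the archive to disk."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        with zf.open(member) as fh:
            return json.load(fh)


archive = build_demo_archive()
index = load_member(archive, "index.json")
# Only the members that are actually needed get decompressed.
for member in index["signatures"]:
    print(load_member(archive, member)["name"])
```

Because each member is compressed independently, a reader can decompress just the entries a search touches, which is one reason a zip archive suits this kind of use better than a single compressed tarball.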

sourmash databases as zip files, in sourmash v3.3.0

written 3 months ago by Living in an Ivory Basement by Titus Brown

Use compressed databases directly!

Ensembl 100 has been released!

written 3 months ago by Ensembl Blog

We are very excited to announce the release of Ensembl 100, along with Ensembl Genomes 47! Time has really flown for us. We moved from our beginnings as a browser with just one genome 20 years ago to an integrated resource for many species and data types in 2020. In this release we continue to […]

Cool stuff the Ensembl VEP can do: variant citations

written 3 months ago by Ensembl Blog

If you are filtering a set of variants to look for those potentially involved in disease, your first stop will probably be databases of phenotype associations, like ClinVar. There is also a lot of valuable information on variant-disease associations in the literature, which may not yet have been extracted into curated databases. It can be […]
