Cool things the VEP can do: variant prioritisation with G2P

written 6 weeks ago by Ensembl Blog

A common use case for the VEP is as a first step towards identifying the causal genetic variant of a rare phenotype from whole genome/exome sequencing. The VEP tells you which genes are hit and what effects the variants have on them, and then you have to begin the long, laborious process of filtering those down. Things you […]

Expanding the Phenotype of Spinal Muscular Atrophy

written 8 weeks ago by KidsGenomics

One of the major challenges of rare disease genomics is the availability of patient samples. The disorders being studied are, by definition, rare in the human population, and the variants that cause them (usually) rarer still. When it comes to identifying and establishing truly novel disease genes: genes that have not yet been associated […]

Mathematics matters

written 8 weeks ago by Bits of DNA by Lior Pachter

Six years ago I received an email from a colleague in the mathematics department at UC Berkeley asking me whether he should participate in a study that involved “collecting DNA from the brightest minds in the fields of theoretical physics and mathematics.” I later learned that the codename for the study was “Project Einstein”, an […]

New variant pathogenicity predictors

written 8 weeks ago by Ensembl Blog

Rating variants for their potential deleteriousness is vital for solving the link between genotypes and phenotypes. There are many different algorithms for predicting how likely it is that a human variant would affect the function of a protein, and in release 94 of Ensembl, we’ll be making more of these available. Currently, we have SIFT […]

4th Annual Training Course on Viral Bioinformatics and Genomics (20-24th Aug, 2018)

written 9 weeks ago by Bioinformatics I/O

From 20-24th Aug, we held our 4th Viral Bioinformatics and Genomics training course at the Garscube campus of the University of Glasgow. This year, the course had 16 participants from academic and health institutions across the UK, Europe, Canada, Australia, Qatar, and South Africa. The MRC-University of Glasgow Centre for Virus […]

BioBloom tools: fast, accurate and memory-efficient host species sequence screening using bloom filters

written 9 weeks ago by Kevin's GATTACA World

Justin Chu, Sara Sadeghi, Anthony Raymond, Shaun D. Jackman, Ka Ming Nip, Richard Mar, Hamid Mohamadi, Yaron S. Butterfield, A. Gordon Robertson, Inanç Birol. Bioinformatics, Volume 30, Issue 23, 1 December 2014, Pages 3402–3404, 20 August 2014.

Abstract: Large datasets can be screened for sequences from a specific organism, quickly and with low memory requirements, by a data structure that supports time- and memory-efficient set membership queries. Bloom filters offer such queries but require that false positives be controlled. We present BioBloom Tools, a Bloom filter-based sequence-screening tool that is faster than BWA, Bowtie 2 (popular alignment algorithms) and FACS (a membership query algorithm). It delivers accuracies comparable with these tools, controls false positives and has low memory requirements. Availability and […]
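To make the idea of Bloom filter-based set membership queries concrete, here is a minimal sketch in Python. It is not the BioBloom Tools implementation: the class `BloomFilter`, the helpers `kmers`, `build_filter` and `hit_fraction`, the SHA-256 double-hashing scheme, and the default sizes are all assumptions chosen for illustration. It stores the k-mers of a reference sequence in a bit array, then reports what fraction of a read's k-mers appear to be present.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions from one digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True if every derived bit is set; may be a false positive.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(seq, k):
    """Yield every overlapping substring of length k."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def build_filter(reference, k, num_bits=1 << 16, num_hashes=4):
    """Insert all k-mers of a reference sequence into a new filter."""
    bf = BloomFilter(num_bits, num_hashes)
    for kmer in kmers(reference, k):
        bf.add(kmer)
    return bf


def hit_fraction(read, bf, k):
    """Fraction of the read's k-mers that the filter reports as present."""
    read_kmers = list(kmers(read, k))
    if not read_kmers:
        return 0.0
    return sum(1 for kmer in read_kmers if kmer in bf) / len(read_kmers)


if __name__ == "__main__":
    reference = "ACGTACGTTAGCTAGCTAGGATCCGATCGATCGTACG"  # toy sequence
    bf = build_filter(reference, k=8)
    # A read taken from the reference always scores 1.0 (no false negatives).
    print(hit_fraction(reference[5:25], bf, k=8))
    # An unrelated read usually scores near 0, but false positives can occur.
    print(hit_fraction("TTTTTTTTTTTTTTTTTTTT", bf, k=8))
```

In a sketch like this, a read would be assigned to the reference organism when its hit fraction exceeds some threshold. The false positive rate is governed by the bit array size and number of hash functions relative to the number of stored k-mers.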

Ensembl at AfSHG conference, Rwanda

written 9 weeks ago by Ensembl Blog

We’re excited to be trying a new conference this year: the African Society of Human Genetics (AfSHG) conference in collaboration with H3Africa, in Kigali, Rwanda, 19th-21st September. The conference is a fantastic opportunity for African scientists to showcase their work, build collaborations and learn more about their field of research. For us, it’s great to see […]

Cool stuff the VEP can do: installation

written 10 weeks ago by Ensembl Blog

If you don’t want to analyse your variants on external servers or have more than 1000 or so to annotate, you probably want to use the VEP script. Setting it up might not always be straightforward as there are dependencies you need, but the installation script takes away a lot of the trouble. Running the […]

Whole-genome analysis for early infant epilepsy

written 10 weeks ago by KidsGenomics

Early infantile epileptic encephalopathy (EIEE) is a devastating syndrome of intractable seizures that strike in the first months of life. According to Orphanet, it affects 1 in 50,000-100,000 births. Infants with EIEE may suffer hundreds of tonic spasms per day, both during sleep and wakefulness. The prognosis is not good. Most patients die within two […]

Abstract for SIAM: Supporting and Sustaining Open Source Software Development: the Commons Perspective

written 11 weeks ago by Living in an Ivory Basement by Titus Brown

How do we support and sustain open source software development?

A tutorial on t-SNE (1)

written 12 weeks ago by The Grand Locus

In this tutorial, I would like to explain the basic ideas behind t-distributed Stochastic Neighbor Embedding, better known as t-SNE. There is plenty of excellent material out there explaining how t-SNE works. Here, I would like to focus on why it works and what makes t-SNE special among data visualization techniques. If you are not comfortable with formulas, you should still be able to understand this post, which is intended to be a gentle introduction to t-SNE. The next post will peek under the hood and delve into the mathematics and the technical detail.

Dimensionality reduction

One thing we all agree on is that we each have a unique personality. And yet it seems that five character traits are sufficient to sketch the psychological portrait of almost everyone. Surely, such portraits are incomplete, but they capture the most important features to describe someone. The so-called five factor model is a prime example of dimensionality reduction. It represents diverse and complex data with a handful of numbers. The reduced personality model can be used to compare different individuals, give a quick description of someone, find compatible personalities, predict possible behaviors, etc. In many...

Ensembl insights: How are UTRs annotated?

written 3 months ago by Ensembl Blog

It’s probably reasonable to assume that the coding sequence (CDS) of a protein-coding transcript model is the feature that is of primary interest to most people who use Ensembl. However, both the 5’ and 3’ untranslated regions (UTRs) are important biological entities in their own right, and it is vital that we in Ensembl do […]

Can bits be the basis for a digital commons? (No.)

written 3 months ago by Living in an Ivory Basement by Titus Brown

Bits cannot be the basis for a digital commons, because they are not rivalrous.

Cogs of data in UniProt

written 3 months ago by Inside UniProt

UniProt's mission is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequence and functional information. Have you ever wondered how long it takes for a new protein sequence to reach you? Or how long it would take for any feedback you send about an entry to become incorporated (at the earliest)? Let’s follow the journey of an imaginary Protein X!

1) Protein X begins its UniProt life cycle when the sequence is imported into the database. As shown in the image below, our protein begins its UniProt life in the blue phase of 'Live Data', named thus because the UniProt production team is actively working on this data. Information can only be merged into protein entries in this blue phase. This phase runs for 4 weeks. For a newly imported sequence like Protein X, this phase consists of:

- Importing new/updated proteins from INSDC, Ensembl, RefSeq, PDBe, direct submissions, etc.
- Creating a new UniProt entry for Protein X (or merging with an existing entry if identical)
- Adding cross-links to taxonomy information and the source of sequence

2) Protein X now enters the yellow phase of 'Frozen data' for 4 weeks. The UniProt production team freezes the new data and makes it available to some internal/collaborating groups to access it and work on it as follows:

- InterPro: to assign Protein X into protein families, identify domains and functional sites
- Gene Ontology group: to classify functions into the gene ontology
- UniProt curators: to potentially review the protein and annotate data
- UniProt automatic annotation: to ...

Getting to know us: Irina from Variation

written 3 months ago by Ensembl Blog

Today we are meeting Irina, who joined the Variation team earlier this year. She talks about how she came to Ensembl, her interests, experience so far and more. What is your job in Ensembl? I am a Bioinformatician in the Ensembl Variation team and I work on the annotation and interpretation of DNA variants in […]

What’s coming in Ensembl 94 / Ensembl Genomes 41

written 3 months ago by Ensembl Blog

We’re planning to release the next Ensembl and Ensembl Genomes in September. We’ve got some exciting new genomes, including Emmer wheat, lots of fish and fungi. We’ve also got GENCODE updates for human and mouse, and new transcription factor binding motifs. New assemblies and gene annotation Updated genes: human, incorporating Ensembl automatic and Havana manual […]

"Labor" and "Engaged effort"

written 3 months ago by Living in an Ivory Basement by Titus Brown

Are "effort" and "labor" the same?

Ensembl Front-End Web Developer

written 3 months ago by Ensembl Blog

We’re looking for a web-developer to work on our new genome browser. We’re looking for masters in computer science or bioinformatics, with experience developing web interfaces using Javascript, HTML, CSS3, React, a scripting language and Github. Closes 16th September. Location: EMBL-EBI Hinxton near Cambridge, UK Staff Category: Staff member Contract Duration: 3 years Grading: 5 (monthly […]

Two Museums Guaranteed to Fluor You

written 3 months ago by Omics! Omics! by Keith Robinson

I've been horribly neglecting this space for an extended period. Contributors to that include a TNG eclosing from high school, ferrying grandparents, a milestone (or is it millstone?) birthday and a 10 day vacation with poor Internet service. Oh yeah, another one of those starts Thursday. Then there's keeping the genome factory going -- at times I feel like a worker in Fritz Lang's Metropolis. But someone even noticed and emailed me today whether this hiatus would end, which is beyond reason enough to get going. But tonight's entry has nothing really to do with biology or genomics, but rather hearkens back to the first science I fell for.

Cool stuff the VEP can do: custom annotation

written 3 months ago by Ensembl Blog

Ensembl produce high quality gene annotation for a number of species, but getting it to the high quality we expect takes time. This means there are many species and strains where we don’t have annotation yet. If you’re working with a species without Ensembl annotation (like Trixie the Triceratops here) or even a specific strain […]

Free online course – learn to use the Ensembl browser

written 3 months ago by Ensembl Blog

This September, we’re excited to announce the third iteration of our free webinar-based browser course. While our in-person workshops are the best way to learn about Ensembl, we know that not everyone can attend or organise one. If that’s you, then our webinar course is perfect for you. What’s a webinar course? Webinars are a […]

More Ensembl training options available

written 3 months ago by Ensembl Blog

You might know that we offer training courses on using the Ensembl browser, but did you know that we also offer Ensembl REST API and Ensembl Train the Trainer courses? We can come to you to deliver any of these courses at your institute and we don’t charge any fees. If you’re in a low-middle […]

The new Ensembl regulatory build for mouse

written 3 months ago by Ensembl Blog

You may have heard us squeaking about our new mouse regulatory build in our Ensembl 93 release blog. If you’re interested in finding out what exactly a ‘regulatory build’ is, and how to view and download this data in Ensembl, then this is the blog for you! What is the Ensembl regulatory build? The Ensembl regulatory build is […]

Sequencing or Array Testing for Genetic Diseases?

written 3 months ago by KidsGenomics

When a child is born with a suspected genetic condition, an increasing number of tools are available to the clinician: newborn screening panels, metabolic testing, cytogenetic testing, gene panels for certain conditions… and more recently, comprehensive sequencing of the exome (i.e. all protein-coding genes) or the genome. Yet chromosomal microarray (CMA) is often the frontline […]

Just use a scatterplot. Also, Sydney sprawls.

written 4 months ago by What You're Doing Is Rather Desperate by Neil Saunders

Sydney’s congestion at ‘tipping point’ blares the headline and to illustrate, an interactive chart with bars for city population densities, points for commute times and of course, dual-axes. Yuck. OK, I guess it does show that Sydney is one of three cities that are low density, but have comparable average commute times to higher-density cities. …
