How long does it take to produce scientific software?

written 6 weeks ago by Living in an Ivory Basement by Titus Brown

Update to Perl and BioPerl in Ensembl

written 6 weeks ago by Ensembl Blog

From Ensembl 93 onwards, we plan to recommend newer versions of Perl (5.14-5.26) and BioPerl (1.6.924) when using the Ensembl Perl API. This may affect pipelines which employ the Ensembl Perl API, since we will no longer actively support older versions of Perl and BioPerl. At Ensembl, we’ve been using Perl since the project […]

MicroRNA Gene Ontology annotations

written 6 weeks ago by miRBase blog

You might have noticed some additional information on the mature miRNA pages in the last few weeks. See for example: The new section “QuickGO function” contains a set of high quality manual annotations of Gene Ontology terms for mature miRNAs, the vast majority of which come from the work of Rachel Huntley et [...]

Gene Variant Image retirement for human, e93

written 6 weeks ago by Ensembl Blog

As of Ensembl release 93, which is due at the end of the month, the Gene Variant Image view will be retired for human. We have elected to retire this page because we feel that the density of known genetic variation is too great for this view to be informative in its current form. The […]

How accurate is the nanopore-only assembly of GM12878?

written 6 weeks ago by Opinionomics by Mick Watson

Yes, I still blog! A quick history. As many of you will be aware, Jain et al. published a fantastic paper where they produced the first de novo genome assembly of a human genome using the MinION, a portable DNA sequencer that uses nanopores to detect the sequence of single molecules. After fixing errors with Illumina, […]

Low IQ scores predict excellence in data science

written 6 weeks ago by Bits of DNA by Lior Pachter

Here are two IQ test questions for you: Fill in the blank in the sequence 1, 4, 9, 16, 25, __ , 49, 64, 81. What number comes next in the sequence 1, 1, 2, 3, 5, 8, 13, .. ? Please stop and think about these questions before proceeding. Spoiler alert: the blog post […]

Extensive but not comprehensive compilation of de-novo assemblers

written 6 weeks ago by Bioinformatics I/O

This figure is an update of Figure 1 in “A practical comparison of de novo genome assembly software tools for next-generation sequencing technologies”, published by Zhang et al. (2011). The figure was produced in SVG, so you should be able to click on the name of the assembler, which should take you straight to the […]

Detecting microbial contamination in long-read assemblies (from known microbes)

written 7 weeks ago by Living in an Ivory Basement by Titus Brown

Using sourmash to find candidate contaminants

Training in low-middle income countries

written 7 weeks ago by Ensembl Blog

We’re excited to announce that we have received a grant from the Wellcome Trust to deliver Ensembl training in low-middle income countries at no charge. We know that there’s a real need for training in low-middle income countries. A quick glance at data available in the GWAS catalog and gnomAD shows that non-European ethnicities are under-represented […]

Bioinformatician – Ensembl Plants

written 7 weeks ago by Ensembl Blog

We’re looking for a bioinformatician to work on integrating, analysing and testing data for the Ensembl Plants database. We’re looking for experience delivering a service in bioinformatics, with knowledge of relational databases and programming. Closes 8th July 2018. Location: EMBL-EBI Hinxton near Cambridge, UK; Staff Category: Staff Member; Contract Duration: 3 years; Grading: 5 (monthly […]

What’s coming in Ensembl 93 and Ensembl Genomes 40

written 7 weeks ago by Ensembl Blog

Ensembl release 93 and Ensembl Genomes release 40 are scheduled for late June and early July 2018, respectively. Included are a number of new genomes and genebuilds for vertebrates and plants (including leopard, Amur tiger, hagfish, pigeon pea, carrot and adzuki bean) and significant updates to the mouse GENCODE annotation and regulatory build. This […]

The benefits of multiplexing

written 7 weeks ago by Bits of DNA by Lior Pachter

Earlier this month I posted a new paper on the bioRxiv: Jase Gehring, Jeff Park, Sisi Chen, Matt Thomson, and Lior Pachter, Highly Multiplexed Single-Cell RNA-seq for Defining Cell Population and Transcriptional Spaces, bioRxiv, 2018. The paper offers some insights into the benefits of multiplex single-cell RNA-Seq, a molecular implementation of information multiplexing. The paper also […]

Communicating outside of big consortia is tough! (but important!)

written 7 weeks ago by Living in an Ivory Basement by Titus Brown

It's hard enough to keep people inside informed...

Cool stuff the VEP can do: optimisation

written 7 weeks ago by Ensembl Blog

Some Variant Effect Predictor (VEP) jobs are small, just ten or fewer variants, and that’s easy. Some VEP jobs are big: if you do variant calling on one whole human genome, that’s five million variants! The more variants you have, the more computing power the VEP needs to process them, which can make it slow. […]

Open-source style community engagement for the Data Commons Pilot Phase Consortium

written 7 weeks ago by Living in an Ivory Basement by Titus Brown

Keeping the Data Commons community coordinated and engaged

PubMed retractions report has moved

written 8 weeks ago by What You're Doing Is Rather Desperate by Neil Saunders

A brief message for anyone who uses my PubMed retractions report. It’s no longer available at RPubs; instead, you will find it here at Github. Github pages hosting is great, once you figure out that docs/ corresponds to your web root :) Now I really must update the code and try to make it more …

Miscellaneous & Disorderly Thoughts on the Eve of London Calling

written 8 weeks ago by Omics! Omics! by Keith Robinson

It's the night before London Calling. I hope to post Thursday, but an after-meeting report won’t be until next week; I must dash on Friday for a slightly insane/exhilarating routing to meet my family in Florida for the holiday weekend. Exhilarating as I will have a layover in one of the ancient capitals of Europe, Lisbon, which I’ve never visited. Insane, because it’s a 12 hour overnight layover. Anyway, between the challenge of covering Oxford Nanopore's expanding reach of products and applications and being sleep-addled from taking the redeye flight, I'm going to throw out a bunch of thoughts without really trying to fuse them into a coherent narrative.

Should PentaSaturn Buy An iSeq: A Hypothetical Scenario Illustrating Platform Picking

written 8 weeks ago by Omics! Omics! by Keith Robinson

Editorial note: I wrote this in early January, then planned to slot it in after some other items. Then life knocked me upside the head, then AGBT came along and then it was forgotten. Once I remembered it, I fretted it had gone stale. But I had put a lot of effort into it and really nothing has changed with regard to iSeq, other than it should be shipping now. Besides, this week is London Calling and so having an Illumina-centric piece could be a bit of useful balance. So, for your consideration: Some of the online discussion around this January's iSeq announcement, springing from my piece or elsewhere, explores how the iSeq fits into the sequencing landscape. In particular, how does it fit in with Illumina's existing MiniSeq and MiSeq and how does it go against Oxford Nanopore's MinION? For example, in Matthew Herper's Forbes piece, genomics maven Elaine Mardis compares iSeq unfavorably to MiSeq in terms of cost-per-basepair. I'm a huge believer in fitting sequencing to one's scientific and practical realities and not the other way 'round: no one platform quite fits all situations nor do even the same metrics fit all situations. So in this piece, I'm going to illustrate what I believe is a plausible scenario in which iSeq would make sense. Now, I have designed this to play to iSeq's characteristics and very realistically have many dials which I could turn to go in another direction, which I will try to note as I go along.

James Watson in his own words

written 9 weeks ago by Bits of DNA by Lior Pachter

“Some anti-Semitism is justified” “Whenever you interview fat people, you feel bad, because you know you’re not going to hire them” “Japan should be bombed for dragging its feet on supporting the Human Genome Project” “All our social policies are based on the fact that [Africans] intelligence is the same as ours – whereas all […]

Get the peaks that are shared in multiple samples

written 9 weeks ago by Diving into Genetics and Genomics

Getting to know us: Paul Kersey, Ensembl Genomes Team Leader

written 10 weeks ago by Ensembl Blog

This month we’re meeting Paul Kersey, who is the Ensembl Genomes team leader. What is your job in Ensembl? I’m the leader of the Non-Vertebrate genomics team. That means I’m the “front man” for Ensembl Genomes – our services for bacteria, protists, fungi, plants and invertebrate metazoa – although of course these activities depend on […]

50% bananas

written 10 weeks ago by What You're Doing Is Rather Desperate by Neil Saunders

Today in “blog posts that have spent two years in the draft folder” – “Humans are 50% banana.” Perhaps you have heard this statement, or one like it. It seems to be widely-quoted. As an example it’s hard to go past this article from UK tabloid The Mirror which, in addition …

Learning from our closest living relatives

written 10 weeks ago by Inside UniProt

In the March 2018 release, UniProt added the proteins of 10 new primate species to our collection of complete proteomes; the set of proteins believed to be expressed by an organism and typically obtained from the translation of a fully sequenced, annotated genome. As of release 2018_04 the total number of primate complete proteomes in UniProtKB stands at 24, a set which of course includes that of human. So why is this important? Studying non-human primates enables scientists to understand the evolution of genomic change and how this has impacted on protein expression patterns. Comparisons across proteome sets allow us to understand which proteins are shared by all primates, and which are species specific, and how changes in protein expression patterns have affected our evolutionary development. For example, multiple copies of a domain of unknown function, the recently renamed Olduvai domain (IPR010630), are found in neuroblastoma breakpoint family (NBPF) proteins. The copy number is highest in humans, lower in African great apes and further reduced in orangutans and Old World monkeys. It has been speculated that this may be directly related to the size of the primate’s brain, more specifically to the volume of the neocortex, and links between domain copy number and both cognitive function and the severity of autism have been identified. Another study looked at why gene duplication has led to a marked expansion of HLA proteins in macaque monkeys in comparison to other primates. The HLA cell-surface proteins are responsible for the regulation of the immune system, ...

Rare Disease Research Collaboration: RLIM in X-linked Intellectual Disability

written 10 weeks ago by KidsGenomics

Access to samples is one of the main challenges of studying rare inherited conditions, particularly those whose molecular basis is unknown. Every individual harbors 3-5 million genetic variants compared to the human reference sequence. Even if we winnow these down to only rare variants likely to disrupt a protein-coding gene, hundreds still remain. The Power […]

PromethION Racing: A Call To The Post

written 11 weeks ago by Omics! Omics! by Keith Robinson

I was at a get-together yesterday for bioinformatics folks associated with Third Rock Ventures companies at a local pub. The organizer, who I've known for a number of years, was introducing me with the pleasant "Keith writes a nice blog" -- but then the barb "but he hasn't posted in a while". Ouch! But it hurts because it's true; too many excuses to not write and far too many half-baked ideas and interviews that should be out (or worse, a nearly complete post). Since it is May, which in the U.S. is bookended by iconic racing events, I'd like to trot out an idea that has been idling for a while: PromethION Racing.
