ATACseq contamination of mycoplasma DNA

written 29 days ago by Diving into Genetics and Genomics

PacBio Outlook 2018

written 5 weeks ago by Omics! Omics! by Keith Robinson

Well, I didn't exactly get my Pacific Biosciences preview out before their J.P. Morgan presentation. Luckily, the slides for that primarily projected financials and touted their successes -- and didn't drop in any major platform announcements -- so I didn't miss out. PacBio's position is important and worth reviewing, even if it doesn't change much.

Terry Speed: a “male feminist”

written 5 weeks ago by Bits of DNA by Lior Pachter

On April 11th 2016, I contacted the Office for Prevention of Harassment and Discrimination at UC Berkeley to report that Professor Terry Speed had sexually harassed a postdoctoral researcher in the UC Berkeley statistics department in the period 2000–2002. Two specific allegations were subsequently investigated: Allegation One: Respondent, a professor in the Statistics Department, sexually […]

Ensembl Developer – Genebuild

written 5 weeks ago by Ensembl Blog

We’re looking for a bioinformatics developer to join our genebuild team, creating and running pipelines to annotate genes onto genomes. We’re looking for PhDs or MScs in Computer Science, Bioinformatics, Genetics or related fields, with experience in genome annotation using pipelines and compute clusters and knowledge of object-oriented programming and Unix. Closes 25th February. Location: […]

Getting to know us: Dan from Production

written 5 weeks ago by Ensembl Blog

This month we’re meeting Dan Staines, who is part of the Ensembl Production team. What is your job in Ensembl? I lead the Production team, who look after all the processes and pipelines needed to transform the data provided by the other […]

AGBT 2018 Schedule is Announced

written 6 weeks ago by Next Gen Seek

Advances in Genome Biology and Technology (AGBT) General Meeting, one of the top genomics conferences, has announced the full agenda for this year's meeting. AGBT this year runs from 12th February to 15th February in Orlando, Florida. The full schedule looks exciting, with lots of interesting talks. Here are some of the interesting talks to […]

Ensembl Genomes 38 is now live

written 6 weeks ago by Ensembl Blog

We’re pleased to announce the latest release from Ensembl Genomes. There are new and updated genome assemblies now available for lots of plant and non-vertebrate metazoan species. Find out more: Metazoa Genome assembly and annotation for the buff-tailed bumble bee (Bombus […]

RECOMB 2018 Accepted Papers

written 6 weeks ago by Next Gen Seek

The 22nd annual RECOMB 2018, one of the oldest conferences focusing on computational, mathematical, statistical and biological sciences, has announced the list of accepted papers that will be presented as talks at the conference. This year’s RECOMB will be held in picturesque Paris, France from 21st April through 24th April at Campus Pierre et Marie Curie, Université […]

Some resources for the data science ecosystem

written 6 weeks ago by Living in an Ivory Basement by Titus Brown

Some good starting points on data science, IMO.


written 6 weeks ago by Omics! Omics! by Keith Robinson

Illumina CEO Francis deSouza's J.P. Morgan presentation did not disappoint. Humdrum financials and touting of market dominance and areas of future growth came first, then came the big Firefly announcement (with a name change to iSeq 100) -- and then, after another short spell reviewing the latest Nextera chemistry, came a smaller bombshell -- Illumina is partnering with former arch-rival Thermo Fisher (née Life Technologies, née Applied Biosystems) to move the AmpliSeq multiplex PCR technology over to the Illumina platform.

Illumina launches iSeq 100 for $20k at JPM 2018

written 6 weeks ago by Next Gen Seek

Illumina launched a new sequencer, the iSeq 100, at this year’s JPM. The iSeq 100 can sequence 4 million reads per run at 2x150 bp, with a maximum throughput of 1.5 Gb per run. It is the smallest desktop sequencer in Illumina’s portfolio and is focused primarily on providing sequencing solutions for smaller projects, like small […]

Some interview questions for a job building data analysis pipelines

written 6 weeks ago by Living in an Ivory Basement by Titus Brown

Some interview questions we recently used.

Illumina Outlook II: The Fleet

written 6 weeks ago by Omics! Omics! by Keith Robinson

In my prior installment I looked at Firefly, now clearly a working instrument. Now I'll take a peek at the rest of the Illumina fleet.

Bioinformatician – VectorBase

written 6 weeks ago by Ensembl Blog

We’re looking for a bioinformatician to work on annotation and integration of invertebrate vectors of human pathogens for VectorBase. We’re looking for degrees in computer science or bioinformatics with experience working with NGS data, Perl and MySQL. Closes 8th February. Location: EMBL-EBI Hinxton near Cambridge, UK Staff Category: Staff member Contract Duration: 1 year and […]

Reassessing the ‘Digital Commons’

written 6 weeks ago by Living in an Ivory Basement by Titus Brown

Another critical take on the so-called ‘Digital Commons’.

Illumina 2018 Preview I: Firefly

written 7 weeks ago by Omics! Omics! by Keith Robinson

Time to start gazing into my cloudy liquid crystal ball and attempt to see what will happen in the sequencing world in 2018. J.P. Morgan is next week, which puts a time box on getting predictions out. One thing I see on both my personal and blogging horizons is flying creatures bearing light. On the local front, TNG has decided to head this fall to the City of Brotherly Love to learn to fly and breathe flame. But in the sequencing world -- well, I'm going to need to pack a huge Ball jar for my trip to AGBT this year, as I plan to hunt out a Firefly.

Remembering 2017's Losses

written 7 weeks ago by Omics! Omics! by Keith Robinson

A new year beckons and with it a burst of enthusiasm for writing. Which also means combing through post ideas from last year that were never quite completed -- some as stubs or at least headlines. But before tackling the new, I feel I need to tackle some personal losses in 2017.

cancer gene census copy number

written 7 weeks ago by Diving into Genetics and Genomics

Ming Tang
January 2, 2018

Set up

knitr will force changing current working directory: use the ezknitr package

    ## here() starts at /Users/mtang1/projects/mixing_histology_lung_cancer
    root.dir <- here()
    opts_knit$set(root.dir = root.dir)

read in the data

I am going to check the COSMIC database on cancer genes. Specifically, I want to know which cancer-related genes are found amplified and which are deleted.

The data can be downloaded from […]

    library(tidyverse)
    ## Loading tidyverse: ggplot2
    ## Loading tidyverse: tibble
    ## Loading tidyverse: tidyr
    ## Loading tidyverse: readr
    ## Loading tidyverse: purrr
    ## Loading tidyverse: dplyr
    ## Conflicts with tidy packages ----------------------------------------------
    ## filter(): dplyr, stats
    ## lag():    dplyr, stats

    library(janitor)
    library(stringr)
    cancer_gene_census <- read_csv("data/COSMIC_Cancer_gene_Census/cancer_gene_census.csv", col_names = T)
    ## Parsed with column specification:
    ## cols(
    ##   .default = col_character(),
    ##   `Entrez GeneId` = col_integer(),
    ##   Tier = col_integer()
    ## )
    ## See spec(...) for full column specifications.

tidy the data with janitor::clean_names(), tidyr::unnest

The column names have spaces, and in many columns there are multiple strings separated by ",".

    ## dplyr after 0.7.0 use pull to get one column out as a vector, I was using .$
    # %>% pull(`Gene Symbol`) %>% unique()
    #cancer_gene_census %>% distinct(`Gene Symbol`)

    ## extract genes that are amplified or deleted in cancers.
    ## in the `mutation types` column search A for amplification and D for large deletion.
    ## use janitor::clean_names() to clean the column names with space.
    cancer_gene_census %>%
      clean_names() %>%
      mutate(mutation_type = strsplit(as.character(mutation_types), ",")) %>%
      unnest(mutation_type) %>%
      tabyl(mutation_type)
    ##   mutation_type   n      percent valid_percent
    ## 1             N   1 0.0008257638  0.0008264463
    ## 2             A   8 0.0066061107  0.0066115702
    ## 3             D   8 0.0066061107  0.0066115702
    ## 4             F 139 0.1147811726  0.1148760331
    ## 5           Mis  68 0.0561519405  0.0561983471
    ## 6             N 147 0.1213872832  0.1214876033
    ## 7             O  37 0.0305532618  0.0305785124
    ## 8             S  76 0.0627580512  0.0628099174
    ## 9             T  19 0.0156895128  0.0157024793
    ## ...
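
A note on the tally in this excerpt: N shows up in two separate rows (1 and 6). One possible cause, which is an assumption here since the excerpt does not say, is that splitting on "," leaves a leading space on some values, so " N" and "N" are counted as different categories. A minimal base R sketch of that effect, using a made-up vector as a stand-in for the real mutation_types column:

```r
# Hypothetical stand-in for the `mutation_types` column; these are not real COSMIC rows.
mutation_types <- c("A, N", "D, Mis, N", "N", "T", "A")

# Splitting on "," alone keeps the space that follows each comma.
split_raw <- unlist(strsplit(mutation_types, ","))

# Trimming whitespace makes " N" and "N" fall into the same group.
split_trimmed <- trimws(split_raw)

raw_counts <- table(split_raw)
trimmed_counts <- table(split_trimmed)

print(raw_counts)      # " N" and "N" are tallied separately
print(trimmed_counts)  # a single "N" group after trimming
```

In a tidyverse pipeline like the one in the excerpt, the same idea would amount to trimming the unnested values before tabulating them, for example with stringr::str_trim().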

The End of 2017

written 7 weeks ago by Diving into Genetics and Genomics

At the end of last year, I wrote a post summarizing the passing of 2016. Now it is time to write the same for 2017! How time flies!

Last year, I wrote:

For the coming 2017, I should be:
1. busy with Phoebe.
2. writing 1-2 papers.
3. writing a book chapter on ChIP-seq for the biostar handbook. It should come out in mid-2017.
3. writing a small R package for practice.
4. learning a piano song.

Looking back, it seems I only accomplished 1 and 3 :) I do have two first-author papers in the writing stage, but I have not finished them yet. I wish I could get them out in 2018.

The book chapter on ChIP-seq was published here. If you want a PDF of my chapter, I can email you; just reply in the comments.

I still have not had a chance to write an R package, which has been on my list for a long time. The coming 2018 is a good time for me to get my hands wet. Our epigenomic project was selected by the Data Science Road-Trip program!! I received the confirmation at the end of 2017. I look forward to learning more R and machine learning for 2 weeks, and the plan is to turn the work into an R package. Many thanks to @JasonWilliamsNY; I saw this opportunity in his tweet. The development of R packages is becoming easier with Hadley Wickham's work on usethis and many others.

I failed #4 totally... I did not get any time to practice the piano. My wife ...

Experiences with the first edition of “Introduction to Computational Modelling for the Biosciences”

written 9 weeks ago by In between lines of code by Lex Nederbragt

Earlier this year, I wrote a post about the new first-semester bachelor course “Introduction to Computational Modelling for the Biosciences” at our institute. A quick summary: from 2017, the Biosciences bachelor study program will incorporate Computing in Science Education (CSE) into the different subjects; a new course “Introduction to Computational Modelling for the Biosciences” will […]

Merge Enhancer promoter interaction bedpe files and recursive function in R

written 9 weeks ago by Diving into Genetics and Genomics

Ensembl service disruption

written 9 weeks ago by Ensembl Blog

We apologise for the ongoing problems with the Hinxton-based web sites and REST servers. We are working to resolve the problem and will restore normal service as soon as possible. The website mirrors […]

Learn to use the Ensembl Perl API

written 9 weeks ago by Ensembl Blog

We’re holding an Ensembl Perl API course at the Genome Campus in the UK in April. The course gives you the chance to learn how to access the database directly from the people who produce the databases and write the APIs themselves.

Ensembl 91 has been released!

written 10 weeks ago by Ensembl Blog

Ensembl 91 is now live! The cat is truly out of the bag now, and we can safely say that there’s been no monkeying around this release; we’ve been very busy! Read on to discover the highlights of this new […]

2017 Nanopore Community Meeting: An Incomplete Summary

written 10 weeks ago by Omics! Omics! by Keith Robinson

The 2017 Nanopore Community Meeting was over a week ago back in New York City, so I'm grossly overdue in cobbling together some observations and opinion based on the tweet stream (I had a critical day job meeting at the same time and wasn't in New York). I did dash off the bit about SmidgION being potentially like the early Macs (though I got the nomenclature wrong: the original was the Mac 128K -- Mac Classic was a later model that resembled it). Oxford also deviated this autumn from the pattern of public information they had seemingly established, with major news at London Calling and smaller updates at the community meeting, but also a pair of Clive Brown webcasts each falling roughly halfway between the two meetings. This fall, no webcast.

Nanopore have their own Day 1 and Day 2 writeups, and there is an independent write-up from Arwyn Edwards.

Platform

Per the usual pattern, Oxford showed off previously announced hardware but made no solid announcements. I've put together a Storify of relevant tweets which may hold further information.

Flongle/SmidgION

SmidgION pumping out data with an attached Android phone calling the bases was a heavily tweeted and retweeted photo. Alas, Oxford apparently put release of the SmidgION/Flongle components into the second half of next year, so no SmidgIONs adorning Christmas trees this year while happy recipients sing Flongle Bells ("Oh what fun, it is to sequence, in a one horse open sleigh, hey!").

Anxiously waiting for these little bad boys to develop! SmidgION for sequencing w cell phone ...
