IonCRAM: a reference-based compression tool for ion torrent sequence files

written 2 days ago by Kevin's GATTACA World

IonCRAM is the first reference-based compression tool to compress Ion Torrent BAM files for long-term archiving. For the BAM files, IonCRAM could achieve a space saving of about 43%. This space saving is superior to what is achieved with the CRAM format by about 8–9%. Future research for reducing the space consumption of the Ion Torrent BAM files would include the binning of the flow signal and quality values. The idea of binning was initially introduced by Illumina [27] to reduce the space consumption of the quality values. This initiative was immediately followed by intensive research to optimize the binning procedure and address its effect on the downstream analysis, especially on the variant calling step [28–31]. We think that the binning of flow signals and quality data of Ion Torrent would also be successful, provided that the manufacturer contributes to this research. We added an option to IonCRAM for binning the flow signals, in a similar way to the binning method implemented in [26], and measured its effect on compression (Supplementary File 1). We left the investigation of the effect of this binning on the downstream analysis to further research. It is worth mentioning that IonCRAM has not only been used for the test data in the paper; it has also been used to compress and back up thousands of files for the Saudi Human Genome Program. IonCRAM is open source and is available for free along with the related ...
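To make the idea of binning concrete, here is a minimal Python sketch. It is not IonCRAM's implementation and not the method of [26] or [27]: the bin edges, the representative values and the function names `bin_quality` and `bin_flow_signal` are all assumptions chosen for illustration. The point is only that mapping many distinct values onto a few representatives shrinks the alphabet a compressor has to encode.

```python
from bisect import bisect_right

# Illustrative bin edges and representative scores.
# ASSUMPTION: these numbers are made up for the example, not taken from the paper.
_BIN_STARTS = [0, 2, 10, 20, 25, 30, 35, 40]
_BIN_VALUES = [0, 6, 15, 22, 27, 33, 37, 40]


def bin_quality(q: int) -> int:
    """Map a Phred quality score to the representative value of its bin."""
    if q < 0:
        raise ValueError("quality scores must be non-negative")
    # Index of the last bin whose start is <= q.
    return _BIN_VALUES[bisect_right(_BIN_STARTS, q) - 1]


def bin_flow_signal(value: int, width: int = 10) -> int:
    """Snap an integer flow-signal value to the nearest multiple of `width`.

    ASSUMPTION: uniform quantization with a hypothetical width; the binning
    option described in the excerpt may work differently.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    return ((value + width // 2) // width) * width


if __name__ == "__main__":
    quals = [1, 7, 12, 24, 38, 41]
    print([bin_quality(q) for q in quals])
    flows = [3, 96, 104, 105, 212]
    print([bin_flow_signal(v) for v in flows])
```

Binning is lossy, which is why the excerpt stresses measuring its effect on downstream steps such as variant calling before relying on it.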

Static ip on Jessie Raspbian with dhcpcd.conf

written 2 days ago by Kevin's GATTACA World

Static IP address templates for dhcpcd.conf

TEMPLATE: A static IP address only when no DHCP

    #
    # The profile name is arbitrary. Use "fred"
    # if you want. Not much we can put as
    # default servers, but set them up as
    # you usually would.
    #####################################################
    interface eth0
    fallback nodhcp

    profile nodhcp
    static ip_address=
    static routers=
    static domain_name_servers=

to do this
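The template leaves its three values blank. As a minimal sketch, the Python helper below fills in a fallback stanza of the same shape and checks the addresses first. The helper name `render_fallback_profile` and the example addresses are hypothetical; only the `interface`, `fallback`, `profile` and `static` keywords come from dhcpcd.conf itself.

```python
import ipaddress


def render_fallback_profile(interface, profile, address_cidr, router, dns_servers):
    """Return dhcpcd.conf text that falls back to a static address without DHCP.

    All values are validated with the standard-library ipaddress module;
    a malformed address raises ValueError before any text is produced.
    """
    iface = ipaddress.ip_interface(address_cidr)   # e.g. "192.168.1.50/24"
    gateway = ipaddress.ip_address(router)
    servers = [str(ipaddress.ip_address(s)) for s in dns_servers]

    # A router outside the interface's subnet is almost always a typo.
    if gateway not in iface.network:
        raise ValueError(f"router {gateway} is not inside {iface.network}")

    lines = [
        f"interface {interface}",
        f"fallback {profile}",
        "",
        f"profile {profile}",
        f"static ip_address={iface.with_prefixlen}",
        f"static routers={gateway}",
        # dhcpcd expects a space-separated list of name servers.
        f"static domain_name_servers={' '.join(servers)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical example values; substitute your own network settings.
    print(render_fallback_profile(
        "eth0", "nodhcp", "192.168.1.50/24", "192.168.1.1", ["192.168.1.1", "9.9.9.9"]
    ))
```

The generated text can be appended to /etc/dhcpcd.conf by hand once it has been reviewed; the helper itself deliberately does not touch the file.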

Future of Genomics: 10 bold predictions

written 2 days ago by Kevin's GATTACA World

Curious about research priorities and opportunities for human genomics for the next decade? You should read on. The National Human Genome Research Institute (NHGRI) this week published its “Strategic vision for improving human health at The Forefront of Genomics” in the journal Nature. The strategic vision culminates with 10 bold predictions for human genomics by 2030. Crafted to be both inspirational and aspirational, the predictions are intended to provoke thoughtful discussions (and even debate) about what might be possible in the coming decade. I must say that, a few years back, I expected people to store their encrypted genome sequences on smartphones and use 5G networks to launch on-the-fly analyses. Not sure if we will have that by 2030! It's also a good time to review the 5-year predictions from 2015. Read about gold genomes and platinum genomes... The 2020 NHGRI Strategic Vision is available online at

Get training, virtually

written 10 days ago by Ensembl Blog

You’re invited to join us for a free, open, virtual training course, 30th November – 2nd December. The course will be held using the Blackboard training platform. Who is this course for? This course is for anyone who would like to learn more about using Ensembl via the browser through webinar videos and exercises. An undergraduate […]

Job: Bioinformatician – parasite genomics

written 11 days ago by Ensembl Blog

WormBase ParaSite is looking for an experienced bioinformatician to work on resources for parasitic worm genomics. We’re looking for MSc/PhD in bioinformatics or a related subject, experience working with relational database systems and proficiency in at least two programming languages. Closes 23rd November. Location: EMBL-EBI, Hinxton near Cambridge, UK Staff Category: Staff Member Contract Duration: […]

Job: Bioinformatician (model organism genomics)

written 11 days ago by Ensembl Blog

Alliance of Genome Resources (AGR) is looking for an experienced bioinformatician to work on model organism genome resources. We’re looking for MSc/PhD in bioinformatics or a related subject, two years’ experience working in a software-oriented role and proficiency in at least two programming languages. Closes 19th November. Location: EMBL-EBI, Hinxton near Cambridge, UK Staff Category: […]

Job: Scientific Programmer

written 11 days ago by Ensembl Blog

We’re looking for an experienced developer to contribute to our variation resources. We’re looking for extensive programming experience (Perl or Python) and years’ experience working in a production development environment, ideally creating efficient large-scale data analysis pipelines, user-friendly tools and APIs. Closes 10th November. Location: EMBL-EBI, Hinxton near Cambridge, UK Staff Category: Staff Member Contract […]

10X Triples Down on Spatial Analysis

written 18 days ago by Omics! Omics! by Keith Robinson

10X Genomics last week announced the purchase of ReadCoor, a company that unveiled its 3D spatial sequencing instrument back at AGBT, paying $350M to acquire the Cambridge, MA company. This follows quickly on the heels of 10X purchasing Swedish in situ sequencing company CartaNA for another $41M. 10X already had the Visium spatial transcriptomics product on the market. So now 10X has three different technologies in the spatial profiling space. Read more »

Disruption to Ensembl services

written 19 days ago by Ensembl Blog

Due to problems at our Data Centre some of our resources are currently running with reduced functionality. The services affected are: Tools (BLAST, VEP etc.), mirror sites, Biomart and user accounts. Some of […]

Introducing the UniProt Alzheimer’s disease portal

written 23 days ago by Inside UniProt

INTRODUCTION

Alzheimer's disease (AD), the most common subtype of dementia, is the most prevalent neurodegenerative disorder, with an estimated 30-35 million people living with the disease worldwide. It is characterized by progressive memory loss and cognitive decline, and eventually leads to the loss of bodily functions and ultimately death. Although much is known about this complex disease, the underlying cause remains unclear. Current research suggests that the risk of developing AD is influenced by both genetic and environmental factors as well as age, although it is not a normal part of ageing. Despite considerable global scientific efforts into developing drugs, vaccines and other medical treatments, there are currently no effective medications for the prevention and treatment of AD. Since 1998, 146 drugs have been tested and rejected, and the four drugs that have been approved for therapeutic use have only modest symptom-reducing effects and do not alter the eventual progression of AD. It is therefore critical that the plethora of data generated by this research is collected, organized, freely available and accessible to researchers, in order to increase the pace of discovery and innovation.

UniProt

To better serve the needs of the AD research community and to facilitate discoverability, UniProt has developed the Alzheimer’s disease portal to help researchers explore and access current AD genomic-based data from the UniProtKB database, but in a single centralized UniProt disease portal. It is linked from the UniProt Alzheimer Disease page in the first beta release. Developed with the help of AD researchers, the portal incorporates UniProt functional annotations, protein network ...

Update to the Ensembl COVID-19 resource

written 25 days ago by Ensembl Blog

We are pleased to announce an updated release of the Ensembl SARS-CoV-2 genome browser, including new sequence variants generated from sequence data held in ENA and updated community annotation. The Ensembl SARS-CoV-2 genome browser was launched in May 2020 to support the global work to develop treatments, diagnostics and vaccines in response to the COVID-19 […]

New Insights into Human Gene Regulation

written 26 days ago by KidsGenomics

Understanding the impact of genetic variants on observable traits is a fundamental goal of human genetics. Yet for the >98% of known sequence variants that reside outside of protein-coding sequences, this remains a significant challenge. There is considerable evidence that noncoding variation can and does impact observable phenotypes. Genome-wide association studies, for example, have pinpointed […]

Keeping an Index on a Subtle Difference in Illumina Chemistries

written 4 weeks ago by Omics! Omics! by Keith Robinson

I like to pretend in this space that I catch all the little details of the different sequencing platforms. Well, at least over time I try to do that. But ego aside, that is often a mark not made. A bit over a year ago I discovered that there's a small difference across the Illumina family that is completely separate from how clusters are generated (Bridge Amplification randomly arrayed or Exclusion Amplification in nanowells) or the wavelengths of light used in the fluorescence microscopy (now blue on the newest NextSeqs, with superresolution microscopy coming soon) or 4-color vs. 2-color vs. 1-color (well, really staged 2-color) chemistry for the reversible terminators. There's a subtle difference in how the second index is read. I'm not spilling a deep secret: it's right out in plain sight within an Illumina technical document. Read more »

Retirement of Ensembl Pre! site

written 4 weeks ago by Ensembl Blog

The Ensembl Pre! site will be retired on 31st January 2021. The Ensembl Pre! site was launched in 2003 to provide early access to preliminary data generated during the gene annotation process. Genome sequences and preliminary data were made available in the Pre! site, prior to full annotation and integration into Ensembl. Pre! data allowed […]

The lethal nonsense of Michael Levitt

written 5 weeks ago by Bits of DNA by Lior Pachter

“It is not easy when people start listening to all the nonsense you talk. Suddenly, there are many more opportunities and enticements than one can ever manage.” – Michael Levitt, Nobel Prize in Chemistry, 2013 In 1990 Glendon MacGregor, a restaurant waiter in Pretoria, South Africa, set up an elaborate hoax in which he posed […]

The most important quality of a scientist

written 6 weeks ago by The Grand Locus

When I established my lab and started to recruit people, I thought that it would be interesting to gather some information about what makes a good or a bad scientist. To this end, I designed a short questionnaire with nine questions. There was no right or wrong, nor even a preferred answer. Those were just questions to help me know the candidates better. The first question was “What is the most important quality of a scientist?” I had no particular expectation. Actually, I did not even know my own answer to this question. As it turned out, most candidates answered that it was either creativity or persistence. If you have been in science for even a short while, you know why this makes sense. We have complicated problems to solve, so creativity and persistence are important. Yet, I was not convinced that a good scientist is someone who is either very creative or very persistent. The reason is that neither of these qualities defines a scientist. Artists, politicians, business people, social workers and pretty much everyone else greatly benefits from being creative or persistent. Having spent more time with scientists, I came to find the answer to my... Read more on the blog: The most important quality of a scientist

Sexual harassment case number 1,052

written 7 weeks ago by Bits of DNA by Lior Pachter

Despite much ado about the #metoo movement in recent years, the crisis of sexual harassment in academia persists without an end in sight. The academic sexual misconduct database now lists 1,051 cases, each of them a tragedy of trauma, unspeakable violations of victims, and dreams destroyed. I’ve written previously about two cases listed in the […]

Association-Rule-Based Annotator (ARBA) in UniProt

written 7 weeks ago by Inside UniProt

UniProt has developed an automatic annotation system to enhance unreviewed TrEMBL entries in the UniProt Knowledgebase (UniProtKB) by enriching them with automatically predicted annotations. In release 2020_04 of August 2020, a powerful new automated system called ARBA replaced the previous SAAS (Statistical Automatic Annotation System). ARBA is a multiclass learning system trained on expertly annotated entries in UniProtKB/Swiss-Prot. ARBA uses rule mining techniques to generate concise annotation models with the highest representativeness and coverage for annotation, based on the properties of InterPro group membership and taxonomy. ARBA currently generates around 23 thousand models, resulting in annotations for more than 85 million proteins, including 35 million that lacked any previous annotation. Consequently, UniProtKB witnessed an increase in automatic annotation coverage from 35% to 50%. All ARBA rules can be accessed here and relevant rules are also tagged as evidence for annotations from UniProtKB entries.

ARBA-based evidence for UniProtKB annotation

When an annotation is added to an entry based on an automatic annotation from an ARBA rule, the evidence tag indicates this along with a link to the rule itself. For example, the protein entry Q4SML2 derives annotation from ARBA rule ARBA00000621.

Browsing ARBA rules

In order to browse the dataset to view rules of interest, click on the dropdown next to the search box in the UniProt website and select ‘ARBA’. Now enter a query and hit the search button.

Exploring ARBA rule pages

Conditions are listed on the left-hand side of the rule page and annotations are on the right-hand side. If a condition holds true, ...
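The condition/annotation structure of such rules can be illustrated with a short Python sketch. This is an assumption-laden toy, not ARBA's data model or code: the `Rule` and `Protein` classes, the `apply_rules` function and every identifier in the example (rule IDs, InterPro IDs, accession, lineage) are hypothetical. It only shows the general pattern the excerpt describes, in which annotations are propagated to an entry when a rule's conditions on InterPro membership and taxonomy hold, and each annotation keeps a pointer to the rule as evidence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    rule_id: str
    required_interpro: frozenset   # InterPro groups the protein must belong to
    required_taxon: str            # taxon that must appear in the protein's lineage
    annotations: tuple             # annotations to propagate when conditions hold


@dataclass(frozen=True)
class Protein:
    accession: str
    interpro: frozenset
    lineage: tuple


def apply_rules(protein, rules):
    """Return (annotation, rule_id) pairs for every rule whose conditions hold.

    Keeping the rule ID next to each annotation mirrors the idea of tagging
    an automatic annotation with the rule that produced it.
    """
    results = []
    for rule in rules:
        has_signatures = rule.required_interpro <= protein.interpro
        in_taxon = rule.required_taxon in protein.lineage
        if has_signatures and in_taxon:
            results.extend((annotation, rule.rule_id) for annotation in rule.annotations)
    return results


if __name__ == "__main__":
    # Every identifier below is made up for the example.
    rules = [
        Rule("RULE_A", frozenset({"IPR_X1"}), "Bacteria", ("Function: example kinase",)),
        Rule("RULE_B", frozenset({"IPR_X1", "IPR_X2"}), "Eukaryota", ("Location: example membrane",)),
    ]
    protein = Protein("P_EXAMPLE", frozenset({"IPR_X1", "IPR_X3"}), ("cellular organisms", "Bacteria"))
    print(apply_rules(protein, rules))
```

Real rule conditions may be richer than a subset test plus one taxon check; the sketch keeps the two properties named in the excerpt and nothing more.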

What’s coming in Ensembl 102 / Ensembl Genomes 49

written 8 weeks ago by Ensembl Blog

Ensembl 102 (and Ensembl Genomes 49) are due to be released in October 2020. As with all releases, we cannot guarantee that anything listed here will make it into the final release. Major Data Updates Allele frequency data added for human variants from the NCBI Allele Frequency Aggregator (ALFA) New Genomes Plants: Saltwater cress (Eutrema […]

Self-Assessing My Python

written 8 weeks ago by Omics! Omics! by Keith Robinson

I've been programming 90+% in Python now for over a year and a half -- when I joined the Strain Factory I vowed to finally make the break from Perl. Partly this was disgust with so often finding libraries I wanted to be missing or broken, and partly it was recognizing that the Factory is primarily a Python shop and I would have the most impact if I worked in the lingua franca. I was first exposed to Python back at Codon Devices, but there was a strong C# faction there and I fell in love with that language, so my primary dabbling in Python was learning enough to glue the key Python code into my C# with IronPython. I strongly considered changing over at the start of Warp Drive, but gave it too weak a try and quickly started churning out Perl. I still use that language for basic level text munging, but have avoided writing nearly anything that occupies more than one screen. Read more »

Job: Computational Data Analyst

written 9 weeks ago by Ensembl Blog

We’re looking for a bioinformatician or computational biologist to prototype statistical and machine learning approaches to characterise the regions of the genome responsible for controlling gene expression. We’re looking for a degree in a quantitative field with subsequent relevant scientific experience, or a PhD. Closes 6th October. Location: EMBL-EBI, Hinxton near Cambridge, UK Staff Category: […]

Job: Web Full Stack Developer

written 9 weeks ago by Ensembl Blog

We’re looking for a talented developer with experience of building modern web applications, to contribute to our open source, next generation platform. We’re looking for familiarity with web development best practice and experience with some of the relevant languages and technologies, such as Python, GraphQL, Typescript and React. Closes 1st October. Location: EMBL-EBI, Hinxton near […]

Job: User Experience Analyst

written 9 weeks ago by Ensembl Blog

We’re looking for a talented UX and usability specialist with a particular interest in user research, user testing and solutions design. We’re looking for communication skills, familiarity with web development technologies and significant experience working in a user research centric role. Closes 1st October. Location: EMBL-EBI, Hinxton near Cambridge, UK Staff Category: Staff Member Contract […]

Ensembl 101 has been released!

written 10 weeks ago by Ensembl Blog

We’re pleased to announce the release of Ensembl 101, and the corresponding release of Ensembl Genomes 48. The highlights of this release include an update of the human gene annotation and new population frequency data along with 39 new genomes including a new sheep reference and crop cultivars. Major data updates for human Ensembl release […]

Job: Agricultural Genomics and Bioinformatics Team Leader

written 11 weeks ago by Ensembl Blog

We’re looking for a faculty-level Team Leader to manage and develop genomic data resources dedicated to crop and livestock agriculture. We’re looking for MSc/PhD in bioinformatics, life sciences, computing or statistics, 6 years’ relevant work experience and 2 years’ experience in managing personnel. Closes 21st September. Location: EMBL-EBI, Hinxton near Cambridge, UK Staff Category: […]
