Blog posts collected by the Biostar aggregator. To follow, subscribe to the planet feed.
written 1 day ago by Ensembl Blog
Variants can be represented in myriad different ways; indeed, Ensembl VEP currently supports input in many different formats, including VCF, HGVS and SPDI. However, even within these specifications, variants can be described ambiguously. Insertions and deletions within repeated regions can be described at multiple different locations. For example, VCF describes variants using their most 5’ […]
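The ambiguity described in the excerpt above can be illustrated with a minimal Python sketch. It is not Ensembl VEP's implementation; the function names, the 0-based coordinates and the toy reference sequence are all assumptions made for illustration. It shifts a deletion inside a repeat to its leftmost (most 5') and rightmost (most 3') equivalent placements and checks that both give the same edited sequence.

```python
def left_align_deletion(ref: str, start: int, length: int) -> int:
    """Return the leftmost (most 5') 0-based start of an equivalent deletion."""
    # Shifting one base left is safe when the base entering the deleted
    # window equals the base leaving it on the right.
    while start > 0 and ref[start - 1] == ref[start + length - 1]:
        start -= 1
    return start


def right_align_deletion(ref: str, start: int, length: int) -> int:
    """Return the rightmost (most 3') 0-based start of an equivalent deletion."""
    # Mirror image of the left shift.
    while start + length < len(ref) and ref[start] == ref[start + length]:
        start += 1
    return start


def apply_deletion(ref: str, start: int, length: int) -> str:
    """Delete `length` bases from `ref`, beginning at 0-based `start`."""
    return ref[:start] + ref[start + length:]


# Hypothetical reference with a CA repeat; delete one 2-base unit from the middle.
ref = "GATCACACAT"
leftmost = left_align_deletion(ref, 5, 2)
rightmost = right_align_deletion(ref, 5, 2)
print(leftmost, rightmost)
print(apply_deletion(ref, leftmost, 2) == apply_deletion(ref, rightmost, 2))
```

Every start position between the two extremes describes the same alteration, which is why tools that compare variants usually shift them to one agreed placement first.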
written 6 days ago by Ensembl Blog
It’s now ten weeks since we started working remotely because of the COVID-19 pandemic. The novelty of the situation is gone and we have become accustomed to the ‘new normal’. We have worked on ongoing projects as well as new ones: we released Ensembl 100 in April and a new COVID-19 resource earlier this […]
written 8 days ago by Ensembl Blog
Today, Ensembl has joined the international scientific effort to tackle the COVID-19 pandemic. COVID-19 is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which has spread rapidly since emerging in late 2019. Our SARS-CoV-2 genome browser and related resources at covid-19.ensembl.org are intended to support both basic research and ongoing work to develop […]
written 16 days ago by Ensembl Blog
Ensembl staff have been working from home due to the COVID-19 pandemic for eight weeks now. We transitioned to remote work from 16 March – two days before the closure of EMBL-EBI and a week before the Prime Minister announced the UK lockdown on 23 March. The transition to working from home has been relatively […]
written 21 days ago by Living in an Ivory Basement by Titus Brown
The feature that I'm most excited about in sourmash 3.3.0 is the ability to directly use compressed SBT search databases. Previously, if you wanted to search (say) 100,000 genomes from GenBank, you'd have to download a several GB .tar.gz file, and then uncompress it out to ~20 GB before searching it. The time and disk space requirements for this were major barriers for teaching and use.

In v3.3.0, Luiz Irber fixed this by, first, releasing the niffler Rust library with Pierre Marijon, to read and write compressed files; second, replacing our old khmer Bloom filter nodegraph with a Rust implementation (sourmash PR #799); and, third, adding direct zip file storage (sourmash #648).

So, as of the latest release, you can do the following:

    # install sourmash v3.3.0
    conda create -y -n sourmash-demo \
        -c conda-forge -c bioconda sourmash=3.3.0

    # activate environment
    conda activate sourmash-demo

    # download the 25k GTDB release89 guide database (~1.4 GB)
    curl -L https://osf.io/5mb9k/download > gtdb-release89-k31.sbt.zip

    # grab a genome signature - here, download a demo one from OSF
    curl -L https://osf.io/vhnk4/download > genome.sig

    # search!
    sourmash search genome.sig gtdb-release89-k31.sbt.zip

This takes less than 2 GB of disk space total (including conda env), and the search runs in about 3 seconds and 120 MB of RAM. Using the zip file stuff alone is a slight speed drag (~10-20%?), but the shift to Rust leads to an overall speed increase of about 4x. And you can always unpack the zip file and use the unpacked files directly. Yay! ...
written 28 days ago by Ensembl Blog
We are very excited to announce the release of Ensembl 100, along with Ensembl Genomes 47! Time has really flown for us. We moved from our beginnings as a browser with just one genome 20 years ago to an integrated resource for many species and data types in 2020. In this release we continue to […]
written 4 weeks ago by Ensembl Blog
If you are filtering a set of variants to look for those potentially involved in disease, your first stop will probably be databases of phenotype associations, like ClinVar. There is also a lot of valuable information on variant-disease associations in the literature, which may not yet have been extracted into curated databases. It can be […]
written 5 weeks ago by Living in an Ivory Basement by Titus Brown
Over the last 10-15 years, I've blogged periodically about how my lab develops research software and builds scientific workflows. The last update talked a bit about how we've transitioned to snakemake and conda for automation, but I was spurred by an e-mail conversation into another update - because, y'all, it's going pretty well and I'm pretty happy! Below, I talk through our current practice of building workflows and software. These procedures work pretty well for our (fairly small) lab of people who mostly work part-time on workflow and software development. By far the majority of our effort is usually spent trying to understand the results of our workflows; except in rare cases, I try to guide people to spend at most 20% of their time writing new analysis code - preferably less. Nothing about these processes ensures that the scientific output is correct or useful, of course. While scientific correctness of computational workflows necessarily depends (often critically) on the correctness of the code underlying those workflows, the code could ultimately be doing the wrong thing scientifically. That having been said, I've found that the processes below let us focus much more cleanly on the scientific value of the code because we don't worry as much about whether the code is correct, and moreover our processes support rapid iteration of software and workflows as we iteratively develop our use cases. As one side note, I should say that the complexity of the scientific process is one thing that distinguishes research computing ...
written 6 weeks ago by Inside UniProt
Many of you that work in the lab have switched to working remotely. Though your daily routine and the continuity of your research might have been impacted, your contribution to knowledge can continue in new ways.
Are you at home itching to contribute to science? UniProt to the rescue!
Improve our resource for the community and receive credit for it.
We have the proteins and you have the expertise. You can now use that expertise by adding publications to protein entries.
What you need:
1. ORCID, this is your researcher personal ID (used for validation and for credit)
2. a protein of interest
3. a publication with a PubMed ID (PMID) about the protein of interest. You don’t have to be the author of the publication
What to do (Figure 1):
1. Identify the protein of interest in UniProt (note that this also includes proteins from the special UniProt COVID-19 website, which can be found at https://covid-19.uniprot.org/uniprotkb?query=*)
2. Select “Add a publication” link on the top menu in the entry page
3. Login with ORCID
4. Fill in submission form
   a. Enter PubMed ID (PMID) to retrieve publication
   b. Confirm that the publication is correct and it is about the protein of interest
   c. Select what topics the paper is about
   d. Add short statements about protein name, function, disease, or other, as described in the publication
   e. Submit
5. Reply to review questions, if any
6. After review, check your publication on the website in next release
Figure 1-From publication to UniProtKB entry.
A sample blank submission form can be found here: https://community.uniprot.org/bbsub/sampleform.html
This is how your publication will be displayed on ...
written 6 weeks ago by Living in an Ivory Basement by Titus Brown
Today at lab meeting, I wanted to brainstorm about how to give good online talks, because I'm giving a few remote talks in the next month. Tracy suggested that perhaps I should demonstrate a bad talk first, just to get everyone on the same page. So I did! Direct (YouTube link) ...enjoy? It's short, and not TOO painful if you show up with low expectations! First, let me say that we were tremendously ...inspired by Greg Wilson's How to Teach Badly and How to Teach Badly (part 2)! So here's what I did -- I put together a few slides on some stuff that I'd been working on recently, so it would look reasonable. My initial screen opened with a private Twitter message up, to mimic inadvertent content sharing :). I started out with "I didn't have a lot of time to prepare for this meeting so apologies for some of the slides." My slide theme was very hard to read - bad fonts and colors. A few slides in I went with "I know we're all busy on time so I'm going to be brief. I'll just skip some of the background and through these first slides quickly." On the first slide with an image, I had Taylor Reiter break in to ask a question, and I shut her down with "Just hold questions, I'll get to them at the end if we have time." All of my slide content was just ...terrible. I am especially "proud" of the ...
written 8 weeks ago by Omics! Omics! by Keith Robinson
One of the most truly useless pieces of information lodged in my brain is my zodiac sign; not once in my life have I had any interest in it. But, given the available draws, it isn't too bad, as it's also the name of perhaps the most underappreciated engineering project of the second half of the 20th Century: Project Gemini.
written 8 weeks ago by KidsGenomics
The identification of novel disease genes sometimes overshadows another crucial form of genomic discovery: expanding the phenotype associated with known disease genes. This is especially important in the era of pervasive clinical genetic testing. Exome sequencing, which interrogates all ~20,000 protein-coding genes simultaneously, is rapidly becoming a frontline diagnostic test for patients with rare genetic […] The post The wide phenotypic spectrum of BICD2 variants in dominant SMA appeared first on KidsGenomics.
written 9 weeks ago by Omics! Omics! by Keith Robinson
My piece on the near amnesia in U.S. culture of the 1918-19 Influenza pandemic provoked a number of helpful comments, emails and conversations. While I would stand behind the statement that it left a light footprint, there are a number of interesting cases, some of which I would never have found by conventional means. Sometimes the collective wisdom of the internet is best for uncovering things, even when you're married to someone who catalogs books for a living.
written 10 weeks ago by Inside UniProt
Enzymes are essential for many biological processes. Without them, common tasks such as digesting food or replicating DNA would not be possible. In recent years, and in part triggered by the expansion of the analysis and annotation of complete genomes, it has become apparent that several enzyme families in a wide range of species contain members that look like enzymes but fail to behave like enzymes. For example, in human, between 5 and 10% of the members of several of these families are such enzyme-like proteins. Whilst these proteins have sequences and 3D structure features similar to active enzymes, they tend to lack essential amino acid residues such as those involved in catalytic reactions and/or binding substrates, making them incapable of catalysing chemical reactions. Based on these characteristics, scientists decided to call them pseudoenzymes. Why are genes coding for pseudoenzymes maintained in the genome? It turns out that, despite their lack of enzymatic activity, this group of proteins carries out essential functions in cells. For example, they help assemble signalling cascades by acting as scaffolds, they regulate the activity of other enzymes and ensure that proteins are localized to the right cellular compartment. Consequently, they have become potential targets for the design of therapeutic treatments. To support the growing interest in pseudoenzyme biology, UniProt recently revisited this important group of proteins. In collaboration with the pseudoenzyme community, we implemented changes to enhance their identification and discoverability. The outcome of this project was published in two articles in Science Signaling and The FEBS Journal. ...
This past fall there was a rumor that QIAGEN was being pursued by an acquirer, with the initial tip being scientific conglomerate ThermoFisher but then other possibilities floated by. QIAGEN was seen as ripe for such an action as their long-time CEO had stepped down. QIAGEN made a very public announcement that they would continue independently under their new CEO, but that is no longer the case: ThermoFisher will acquire them, pending regulatory approvals, for something around $11.5B.
AGBT ended over a week ago and I've been procrastinating ever since in going through notes and writing up companies. First few days I had the excuse of family time on beautiful Sanibel Island to the north, but since Monday other than obsessing about COVID-19 (and cancelling travel plans) I have no excuses. First up, the microfluidic library prep company Miroculus, based on my notes from talking to their Chief Commercial Officer, Adam Lowe.
The still growing COVID-19 pandemic has reminded me of a question I've batted in my head a few times. In 1918 and 1919 a global influenza pandemic killed on the order of 50 million people worldwide. The scale of the jump in flu deaths in the U.S. can be seen in the below plot. That's more than the number of civilians and military personnel estimated to have been killed during World War I. Yet despite this, it would seem that there has been very little impact on culture (at least the culture I am aware of).
written 11 weeks ago by Living in an Ivory Basement by Titus Brown
This winter quarter I taught my usual graduate-level introductory bioinformatics lab at UC Davis, GGG 201(b), for the fourth time. The course lectures are given by Megan Dennis and Fereydoun Hormozdiari, and I do a largely separate lab that aims to teach the basics of practical variant calling, de novo assembly, and RNAseq differential expression. I also co-developed and co-taught a new course, GGG 298 / Tools for Data Intensive Research, with Shannon Joslin, a graduate student here in Genetics & Genomics who (among other things) took GGG 201(b) the first time I offered it. GGG 298 is a series of ten half-day workshops where we teach shell, conda, snakemake, git, RMarkdown, etc - you can see the syllabus for GGG 298 here. This time around, I did a complete redesign of the GGG 201(b) lab (see syllabus) to focus on using snakemake workflows. I'm 80% happy with how it went - there's some overall fine tuning to be done, and snakemake has some corners that need more explaining than other corners, but I think the basic concepts got through to a lot of the students. I also think I'm finally teaching people something they really need to know, which is how to build, automate, place controls on, and execute complex bioinformatics workflows. I was traveling the week before last, so I asked Taylor Reiter and Tessa Pierce to do the first RNAseq lecture for the class (week 8!) As part of their brilliant RNAseq materials for the class (snakemake! ...
written 12 weeks ago by Ensembl Blog
We know that installing the VEP is not always trivial – there are dependencies and modules that you may or may not have already, and your existing setup may require different module versions. It’s also designed for a Linux system and installing on, for example, Windows, can be complex. To get around this, the VEP […]
written 3 months ago by Ensembl Blog
Can you believe it? Our next release will be Ensembl 100! We are planning to release it along with Ensembl Genomes 47 in late March 2020. As with all releases, please note that these are intentions and are not guaranteed to make it into the releases. Major Data Updates Update of Homo sapiens (Human) gene […]
At some fancy restaurants one can get a "deconstructed dish". As I understand it, as I don't frequent such restaurants, a deconstructed BLT would have the bread, bacon, lettuce and tomato each as their own individual item, but prepared in a novel way which highlights the strengths of each ingredient. When I got a preview last night of Rade Drmanac's closing AGBT talk on achieving a $100 human genome (reagents price only), that was the vision I had: Drmanac and his team have created their Tx system by deconstructing the optical high throughput sequencing-by-synthesis instrument.
Having summarized MGI's announcement they are launching into the U.S. market this spring and started digging into the performance characteristics of MGI's instrument lineup, let us now turn to their BioRxiv pre-print on the CoolMPS chemistry, as it has many useful technical details.
Friday morning I got excited because a preprint showed up at BioRxiv detailing the CoolMPS sequencing technology from MGI (aka BGI aka Complete Genomics). First announced in Fall 2018, this approach sounded, well, cool. Using fluorescently labeled antibodies specific to each reversible terminator seemed like a crazy pipe dream. So getting a good look at it in a manuscript is an event! But then Friday afternoon MGI had a second big pre-AGBT reveal: launch of their sequencing systems in the U.S. later this year. Below is a quick run-down of the sequencer announcement; the pre-print has many details I'm still parsing.
AGBT looms ahead of me next week which serves as impetus to let fly an idea I've had simmering for a while: to look at sequencing startups by a particular type of information they choose to reveal. I'm not expecting any big announcements at AGBT from this space, though would be thrilled to be surprised. But there is the risk of getting contaminated with some on-the-sly scuttlebutt, so better to get this done now. By the way, in the full disclosure category, I have consulted for a few companies here and have NDAs either on my own or via employers; everything here is based on public information.
News about the Ensembl Project and its genome browser
287 posts, last updated 1 day ago
bioinformatics education, metagenomics assembly, python programming
314 posts, last updated 21 days ago
News and commentary from the UniProt developers
44 posts, last updated 6 weeks ago
A computational biologist's personal views on new technologies & publications on genomics & proteomics and their impact on drug discovery
318 posts, last updated 8 weeks ago
KidsGenomics focuses on genetic diseases that affect children, including rare inherited disorders and pediatric cancers.
14 posts, last updated 8 weeks ago
bioinformatics, genomes, biology etc. "I don't mean to sound angry and cynical, but I am, so that's how it comes across"
50 posts, last updated 3 months ago
Reviews and commentary on computational biology
109 posts, last updated 4 months ago
My name is Guillaume Filion. I am a scientist who loves biology and mathematics. As of late I also got into computers and the Internet. I intend my blog to be recreational, and neither academic nor educational. I hope you will find some of the posts inspiring for your own reflection.
33 posts, last updated 4 months ago
A news portal with postings about genomics resources, genomics news and research, science and more.
395 posts, last updated 7 months ago
Notes from the life of a computational biologist
121 posts, last updated 7 months ago
My worklog on bioinformatics, science and research. Small tasks and cute tricks included :)
35 posts, last updated 8 months ago
A wet lab biologist's bioinformatics notes. Mostly about Linux, R, Python, reproducible research, open science and NGS. I am into data science! I am working on glioblastoma (a terrible brain cancer) genomics at MD Anderson Cancer Center. Disclaimer: For posts that I copied from other places, credits go to the original authors.
53 posts, last updated 8 months ago
Bioinformatics tips, tricks, tools and commentary - all with a microbiological NGS bent. Authored by Dr Torsten Seemann from Melbourne, Australia.
31 posts, last updated 8 months ago
Bioinformatics tips, tricks, tools and commentary with a microbial genomics bent. Written by Torsten Seemann from Melbourne, Australia.
27 posts, last updated 8 months ago
Weblog on Bioinformatics, Genome Science and Next Generation Sequencing
74 posts, last updated 13 months ago
bioinformatics, biopython, genomic analysis
39 posts, last updated 15 months ago
Making Sense of Next-Gen Sequencing Data
89 posts, last updated 16 months ago
Biology, sequencing, bioinformatics and more
33 posts, last updated 18 months ago
Tips && tricks from a cluster of bioinformaticians
38 posts, last updated 20 months ago
miRBase news and views
15 posts, last updated 24 months ago
Medical genomics in the post-genome era
71 posts, last updated 2.2 years ago
Assistant professor of computer science at University of Iceland.
9 posts, last updated 2.3 years ago
This blog is written by Giovanni M. Dall’Olio, a research associate in Francesca Ciccarelli’s Cancer Evolutionary Genomics group at King’s College London. My primary interests are in the systems biology of cancer and in identifying new potential drug targets for this disease.
13 posts, last updated 2.8 years ago
Getting Things Done in Genetics & Bioinformatics Research
60 posts, last updated 3.3 years ago
virology, bioinformatics, genetics, science, java
65 posts, last updated 3.4 years ago
Thoughts and opinions from the associate director of the EMBL-European Bioinformatics Institute
57 posts, last updated 3.4 years ago
Frontier in Bioinformatics
185 posts, last updated 4.1 years ago
bioinformatics, genomes, biology etc. "I don't mean to sound angry and cynical, but I am, so that's how it comes across"
78 posts, last updated 4.3 years ago
21 posts, last updated 5.5 years ago
A group blog providing expert, independent commentary on the personal genomics industry.
34 posts, last updated 5.6 years ago
The Knight Lab at Yale University » Blog
Bioinformatics and Genetics
3 posts, last updated 5.8 years ago
25 posts, last updated 6.5 years ago
Mostly bioinformatics, NGS, and cat litter box reviews
25 posts, last updated 6.7 years ago
Powered by Biostar version 2.3.0