Blog posts collected by the Biostar aggregator. To follow, subscribe to the planet feed.
written 1 day ago by Omics! Omics! by Keith Robinson
In the closing talk of the pre-London Calling workshop, Hans Jansen closed his presentation with the question of whether sequence assembly would become obsolete at some future date. This was meant to be an aspirational vision for a distant timepoint, but one correspondent on Twitter saw it as hype. I got into a bit of a discussion, constrained by the dreaded 140-character limit, which ended up largely illustrating that I have a somewhat more restricted definition of assembly than some people. I'm going to explore this and you can judge for yourself. Read more »
written 1 day ago by Diving into Genetics and Genomics
Downloading data

Using TCGAbiolinks, I downloaded RNAseq data for LUAD and LUSC

    library(TCGAbiolinks)
    library(SummarizedExperiment)
    # query_rna_LUAD.hg38 <- GDCquery(project = "TCGA-LUAD", data.category = "Transcriptome Profiling",
    #                                 data.type = "Gene Expression Quantification",
    #                                 workflow.type = "HTSeq - Counts")
    #
    #
    # query_rna_LUSC.hg38 <- GDCquery(project = "TCGA-LUSC", data.category = "Transcriptome Profiling",
    #                                 data.type = "Gene Expression Quantification",
    #                                 workflow.type = "HTSeq - Counts")
    #
    # GDCdownload(query_rna_LUAD.hg38, method = "client")
    # GDCdownload(query_rna_LUSC.hg38, method = "client")
    #
    # LUAD_rna_data <- GDCprepare(query_rna_LUAD.hg38)
    # LUSC_rna_data <- GDCprepare(query_rna_LUSC.hg38)

    # I have saved both R objects into disk
    load("~/projects/mix_histology/data/TCGA_rna/TCGA_lung_rna.rda")

    # a RangedSummarizedExperiment object
    LUSC_rna_data
    ## class: RangedSummarizedExperiment
    ## dim: 57035 551
    ## metadata(0):
    ## assays(1): HTSeq - Counts
    ## rownames(57035): ENSG00000000003 ENSG00000000005 ...
    ##   ENSG00000281912 ENSG00000281920
    ## rowData names(3): ensembl_gene_id external_gene_name
    ##   original_ensembl_gene_id
    ## colnames(551): TCGA-77-8009-01A-11R-2187-07
    ##   TCGA-34-5239-01A-21R-1820-07 ... TCGA-NK-A7XE-01A-12R-A405-07
    ##   TCGA-43-6773-11A-01R-1949-07
    ## colData names(69): patient barcode ...
    ##   subtype_Homozygous.Deletions subtype_Expression.Subtype

The problem is with the metadata

    LUSC_coldata <- colData(LUSC_rna_data)
    LUAD_coldata <- colData(LUAD_rna_data)

We will see the different representations of NAs

    table(LUAD_coldata$subtype_Smoking.Status, useNA = "ifany")
    ##
    ##      Current reformed smoker for > 15 years
    ##                                          78
    ## Current reformed smoker for < or = 15 years
    ##                                          77
    ##                              Current smoker
    ##                                          47
    ##                         Lifelong Non-smoker
    ##                                          32
    ##                             [Not Available]
    ##                                          11
    ##                                        <NA>
    ##                                         349

    table(LUSC_coldata$subtype_Smoking.Status, useNA = "ifany")
    ##
    ##      Current reformed smoker for > 15 years
    ##                                          51
    ## Current reformed smoker for < or = 15 years
    ##                                          87
    ##                              Current smoker
    ##                                          28
    ##                         Lifelong Non-smoker
    ##                                           7
    ##                                         N/A
    ##                                           6
    ##                                        <NA>
    ##                                         372

We see that NAs are represented either as a real <NA>, as [Not Available] or as N/A. The first thing to do is to tidy the metadata, changing all NAs to <NA>:

I will use stringr from the ...
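The tidy-up step described in this excerpt (mapping every placeholder spelling to one real missing value) can be sketched independently of R. What follows is a minimal Python sketch and not the original post's stringr code: the function name `normalize_missing`, the `MISSING_TOKENS` set and the sample list are all hypothetical, and the set assumes the only placeholders are the two visible in the tables above.

```python
# Placeholder strings that stand in for a missing value.
# Assumption: only the two spellings shown in the tables above occur.
MISSING_TOKENS = {"[Not Available]", "N/A"}


def normalize_missing(values):
    """Return a new list in which placeholder strings are replaced by None."""
    cleaned = []
    for value in values:
        # Real missing values (None) and placeholder strings both become None.
        if value is None or value.strip() in MISSING_TOKENS:
            cleaned.append(None)
        else:
            cleaned.append(value)
    return cleaned


# Hypothetical sample that mimics the mixed representations in the metadata.
smoking_status = [
    "Current smoker",
    "[Not Available]",
    "N/A",
    None,
    "Lifelong Non-smoker",
]
tidy_status = normalize_missing(smoking_status)
```

Keeping the placeholder spellings in one set means a newly discovered variant only needs to be added in one place.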
written 2 days ago by thoughts about ...
I recently started developing an R package with a focus on Bioconductor. I have a lot of experience using R, but package development in this language is new to me. I ran into a very strange problem: the function's code worked well when tested outside the package, but calling the same function from the prebuilt package led to an error. The problem was the use of the "intersect" operation. By default it was taken from the sets library, while in my case the GRanges version was required. I spent some time figuring this out, but fixed it thanks to this post. The issue was related to additionally marking the function's origin in DESCRIPTION. Basically, an additional tag was required in the Roxygen function annotation:

#' @importMethodsFrom GenomicRanges intersect

Also, one thing I had to manage was frequently reloading the library to run tests. The following command works well:

detach("package:InTAD", unload = TRUE)

Of course, testing can be organized more easily with testthat, but initial coding only needs a simple reload. These are only minor issues, but closer to the Bioconductor submission attempt I will try to give more of an overview.
written 4 days ago by The OpenHelix Blog
This week, quite a range. From crops in silico, to antique tumor samples to assess cancer genomes. Even older: assessing ancestral variants to battle today’s viruses. That’s right–we are standing on the past and in the future. Deadly tropical snails. At NCBI–the end of BLink. And other useful things. Welcome to our Friday feature link collection: […]
written 4 days ago by Ensembl Blog
We’re really excited to be a part of the ESHG conference again, this time in Copenhagen from 27th-30th May. We can’t wait to see all the great science that’s going to be presented, but here’s a guide to the talks, workshops ... Continue reading: What we’re up to at ESHG 2017 →
written 5 days ago by Omics! Omics! by Keith Robinson
Okay, I'm desperately behind on writing up the external science from London Calling. Not helpful that I claimed I would not only do so, but in multiple installments. A number of the plenaries focused on large genome assembly, so that's what I'll tackle now -- plus a few other bits. See also my Storify summaries, which include other reports on the conference. Also check out my storifies on the SMRT Leiden conference, which ran at the beginning of the same week and discussed many similar topics. Read more »
written 6 days ago by The OpenHelix Blog
This week’s video tip is…well…atypical. It’s not about a software tool, per se. Or, is it…?
written 8 days ago by Omics! Omics! by Keith Robinson
Jonathan Jacobs posted his annual reminder that the Sequencing, Finishing and Analysis in the Future Meeting (SFAF) will be this week. Alas, that meeting hasn't had many more tweeters in the past than Jonathan, but perhaps this year there will be more. There's a glut of genomics conferences to track, compile tweets and opine on -- besides London Calling, there's been SMRT Leiden and Biology of Genomes, all in the span of two weeks! This post is going to be a bit short on actual writing and more to just flag some talks at SFAF that grabbed my attention. What I realized is that the talks at SFAF illustrate that a number of technologies I consider effectively dead retain significant attention.

#ImBiased, but… Best conf. of 2017: #SFAF2017 #infectiousdisease #inherited #disease #agrigenomics #human #genomics https://t.co/yTu2MxKc41 pic.twitter.com/FCoSmTp6an - Jonathan Jacobs (@bioinformer) May 10, 2017

Read more »
written 9 days ago by The Grand Locus
This post is part of a series of tutorials on indexing methods based on the Burrows-Wheeler transform. The first part describes the theoretical background, the second part shows a naive C implementation, and this part shows a more advanced implementation with compression. The code is written in a very naive style, so you should not use it as a reference for good C code. Once again, the purpose is to highlight the mechanisms of the algorithm, disregarding all other considerations. That said, the code runs, so it may be used as a skeleton for your own projects. The code is available for download as a GitHub gist. As in the second part, I recommend playing with the variables, and debugging it with gdb to see what happens step by step.

Constructing the suffix array

First you should get familiar with the first two parts of the tutorial in order to follow the logic of the code below. The file learn_bwt_indexing_compression.c does the same thing as in the second part. The input, the output and the logical flow are the same, but the file is different in many details. We start with the definition of the occ_t...

Read more on the blog: A tutorial on Burrows-Wheeler indexing methods (3)
written 9 days ago by Living in an Ivory Basement by Titus Brown
This blog post stems from notes I made for a 12 minute talk at the Oregon State Microbiome Initiative, which followed from some previous thinking about data integration on my part -- in particular, Physics ain't biology (and vice versa) and What to do with lots of (sequencing) data. My talk slides from OSU are here if you're interested. Thanks to Andy Cameron for his detailed pre-publication peer review - any mistakes remaining are of course mine, not his ;). Note: During the events below, I was just a graduate student. So my perspective is probably pretty limited. But this is what I saw and remember! My graduate work was in Eric Davidson's lab, where we studied early development in the sea urchin. Eric had always been very interested in gene expression, and over the preceding decade or two (1980s and onwards) had invested heavily in genomic technologies. This included lots of cDNA macroarrays and BAC libraries, as well as (eventually) the sea urchin genome project. The sea urchin is a great system for studying early development! You can get literally billions of synchronously developing embryos by fertilizing all the eggs simultaneously; the developing embryo is crystal clear and large enough to be examined using a dissecting scope; sea urchins are available world-wide; early development is mostly invariant with respect to cell lineage (although that comes with a lot of caveats); and sea urchin embryos have been studied since the 1800s, so there was a lot of background literature on ...
written 11 days ago by Ensembl Blog
This is the second instalment of our monthly posts introducing a member of the Ensembl team, and what they do in Ensembl. This time, it’s Will McLaren, who works in the Variation team. What is your job in Ensembl? I’m the principal developer in ... Continue reading: Getting to know us: Will from Variation →
written 11 days ago by The OpenHelix Blog
This week, DNA was indicted and decades later led to a conviction. Genomes of birch trees and shape-shifting butterflies. And the most interesting stuff to me is non-human, but dbSNP will stop accepting non-human species info. Sigh. Well, I do think alternative splicing is interesting too, and we have some of that this week. Human […]
written 13 days ago by Omics! Omics! by Keith Robinson
London Calling 2017 came to a close last Friday. Any excuses of jet lag or nights running up ONT's bar tab won't hold up much longer, so time to finish this post (I really did start the night after Clive's talk!). I'm going to largely divide coverage on the dividing line of who presented: today's piece on Oxford Nanopore presentations, particularly Clive Brown's, and in the near future at least one focusing on the science users presented. For other summaries of the action, I've created a storify of just blog posts and similar summaries of the meeting, as there were a great number (and I am on the hunt for additional ones I've missed). Read more »
written 16 days ago by The Grand Locus
written 18 days ago by Bioinformatics I/O
Kraken is a really good k-mer based classification tool. I frequently use this tool for viral signal detection in metagenomic samples. A number of useful scripts, such as those for updating Kraken databases, are provided with the tool. Since the NCBI updated the FTP website structure and decided to phase out GenBank Identifiers (GIs), the default Kraken database update scripts […]
written 18 days ago by The OpenHelix Blog
This week is a pretty eclectic set of things. Ancestry and African Americans, oral history of genomics researchers, tools for various types of analyses, and another genome that created a caffeine pathway that’s so crucial to me. The most unusual item is probably that Microbiome board game. We live in interesting times. Welcome to our Friday […]
written 19 days ago by Omics! Omics! by Keith Robinson
On Wednesday I attended the London Calling pre-conference workshop, an add-on for those wishing for help getting started with MinION sequencing. Judging from who I spoke to, many participants were utterly new to nanopore sequencing and more than a few were like me in that they had tried the platform and wanted to do better. My colleague has gotten some very good results recently, which has re-fired my determination to get good at that myself. Below are some limited notes I took that may be of general interest. Large portions of the workshop will go largely uncovered, as I focused on what was surprising or new. Read more »
written 19 days ago by Next Gen Seek
London Calling, the two-day annual conference on all things Nanopore, is happening right now. Among the interesting talks, Clive Brown, Oxford Nanopore’s CTO, gave a talk titled “Some mundane and incremental types”. The title is the winner for the most understated title in an ultra-hyped world. Thanks to tonnes of live tweeters from “London Calling […]
written 21 days ago by Omics! Omics! by Keith Robinson
Oxford Nanopore's London Calling confab runs Thursday and Friday, with a training workshop on Wednesday. I'll be there -- who can resist a conference nearly at the Tower of London? -- and will also be testing whether my personal "field of nanopore sequencing suppression" can defeat ONT's best trainers. Here's some preview of what I'll be particularly looking for, though being surprised will be lots of fun too. Much more fun than reading (the wrong) patents! Read more »
written 22 days ago by MassGenomics by Dan Koboldt
Exome sequencing continues to displace traditional panel-based approaches to genetic diagnosis. The maturity and efficiency of current exome kits, together with the ever-moving target of bona fide disease genes, make a strong argument for sequencing the exons of all genes, rather than the ones currently known to be associated with a specific phenotype. Indeed, many clinical laboratories […]
written 22 days ago by Omics! Omics! by Keith Robinson
Oxford Nanopore has launched lawsuits in the UK and Germany against Pacific Biosciences, alleging infringement of a European patent licensed from Daniel Branton's lab at Harvard, EP1192453, which is apparently exclusively licensed to Oxford. When I wrote about Pacific Biosciences' first lawsuit against Oxford Nanopore late last year I titled it "PacBio's Quixotic Patent Litigation", as it appeared that Oxford could easily dodge the lawsuit by abandoning the 2D sequencing technology, which Oxford is in the process of doing. I've swapped in "enigmatic" for this title, as I'm not even sure what aspect of PacBio is allegedly infringing the patent. Read more »
written 25 days ago by Opinionomics by Mick Watson
As many of you will be aware, I like to post some R code, and I especially like to post base R versions of ggplot2 things! Well, these amazing boxplots turned up on GitHub – go and check them out! So I did my own version in base R – check out the code here and […]
written 25 days ago by The OpenHelix Blog
This week has a duplication event–I was on the road to the March for Science last week, and wasn’t able to post the SNPpets. But I stood for plant science at the march (see photo below). It was raining cats and dogs–and this week the snips also include cat and dog microbiomes. It has the peep […]
written 26 days ago by Omics! Omics! by Keith Robinson
A pretty common question over on Quora is something along the lines of "how do I learn bioinformatics". Great question! Tonight I'm going to outline a project which I think would make a good first bioinformatics project. It is rich in content and keys off an interesting new non-computational result. And since I've left graffiti on multiple Quora threads that I would write something like this in the immediate future, here it is! Read more »
written 27 days ago by The OpenHelix Blog
For all the years we’ve been out doing training on the UCSC Genome Browser tools, we could watch the evolution of the needs of the researchers and the corresponding features of the UCSC Genome Browser site. At first, people just needed access to the public data. But then they needed ways to add their own […]
A computational biologist's personal views on new technologies & publications on genomics & proteomics and their impact on drug discovery
199 posts, last updated 1 day ago
A wet lab biologist's bioinformatics notes. Mostly about Linux, R, Python, reproducible research, open science and NGS. I am into data science! I am working on glioblastoma (a terrible brain cancer) genomics at MD Anderson Cancer Center. Disclaimer: For posts that I copied from other places, credits go to the original authors.
35 posts, last updated 1 day ago
My worklog on bioinformatics, science and research. Small tasks and cute tricks included :)
34 posts, last updated 2 days ago
A news portal with postings about genomics resources, genomics news and research, science and more.
374 posts, last updated 4 days ago
News about the Ensembl Project and its genome browser
113 posts, last updated 4 days ago
bioinformatics education, metagenomics assembly, python programming
194 posts, last updated 9 days ago
My name is Guillaume Filion. I am a scientist who loves biology and mathematics. As of late I also got into computers and the Internet. I intend my blog to be recreational, and neither academic nor educational. I hope you will find some of the posts inspiring for your own reflection.
27 posts, last updated 9 days ago
Tips && tricks from a cluster of bioinformaticians
12 posts, last updated 18 days ago
Making Sense of Next-Gen Sequencing Data
83 posts, last updated 19 days ago
Medical genomics in the post-genome era
63 posts, last updated 22 days ago
bioinformatics, genomes, biology etc. "I don't mean to sound angry and cynical, but I am, so that's how it comes across"
39 posts, last updated 25 days ago
Weblog on Bioinformatics, Genome Science and Next Generation Sequencing
63 posts, last updated 5 weeks ago
News and commentary from the UniProt developers
27 posts, last updated 6 weeks ago
Reviews and commentary on computational biology
76 posts, last updated 6 weeks ago
Notes from the life of a computational biologist
84 posts, last updated 9 weeks ago
Biology, sequencing, bioinformatics and more
31 posts, last updated 10 weeks ago
Getting Things Done in Genetics & Bioinformatics Research
60 posts, last updated 3 months ago
virology, bioinformatics, genetics, science, java
65 posts, last updated 4 months ago
Thoughts and opinions from the associate director of the EMBL-European Bioinformatics Institute
57 posts, last updated 5 months ago
This blog is written by Giovanni M. Dall’Olio, a research associate in Francesca Ciccarelli’s Cancer Evolutionary Genomics group at King’s College London. My primary interests are in the systems biology of cancer and in identifying new potential drug targets for this disease.
12 posts, last updated 5 months ago
Bioinformatics tips, tricks, tools and commentary with a microbial genomics bent. Written by Torsten Seemann from Melbourne, Australia.
25 posts, last updated 8 months ago
Bioinformatics tips, tricks, tools and commentary - all with a microbiological NGS bent. Authored by Dr Torsten Seemann from Melbourne, Australia.
29 posts, last updated 8 months ago
Assistant professor of computer science at University of Iceland.
8 posts, last updated 11 months ago
Frontier in Bioinformatics
185 posts, last updated 12 months ago
bioinformatics, biopython, genomic analysis
38 posts, last updated 13 months ago
bioinformatics, genomes, biology etc. "I don't mean to sound angry and cynical, but I am, so that's how it comes across"
78 posts, last updated 16 months ago
21 posts, last updated 2.4 years ago
A group blog providing expert, independent commentary on the personal genomics industry.
34 posts, last updated 2.6 years ago
The Knight Lab at Yale University » Blog
Bioinformatics and Genetics
3 posts, last updated 2.8 years ago
miRBase news and views
14 posts, last updated 2.8 years ago
25 posts, last updated 3.5 years ago
Mostly bioinformatics, NGS, and cat litter box reviews
25 posts, last updated 3.7 years ago