Blog posts collected by the Biostar aggregator. To follow, subscribe to the planet feed.
<prev • 2,051 results • page 1 of 83 • next >

use bash associate array to generate commands

written 1 day ago by Diving into Genetics and Genomics

The problem

I am running PyClone to determine the clonality of our collected tumor samples. It needs tumor purity (estimated with Sequenza) as input. I have 24 samples, and I want to generate PyClone commands for all of them.

The solution

I usually generate commands with bash and then use another bash wrapper to generate PBS files on the HPC (thanks to @SBAmin). Now, for each patient, I have two samples. How should I generate the commands? I am still better at R and Unix than Python, so I used an associative array in bash.

First, generate a file containing the tumor purity for each tumor:

    head -6 all_tumor_purity_no_header.txt
    0.69 Pa25T1
    0.26 Pa25T2
    0.49 Pa26T1
    0.37 Pa26T2
    0.9 Pa27T1
    0.92 Pa27T2

This bash script uses an associative array to hold the tumor purity:

    #! /bin/bash
    set -euo pipefail

    ## build the array to contain the tumor purity, like a python dict
    ## have to declare by -A
    declare -A cols

    while read -r purity sample
    do
        cols[$sample]=$purity
    done < all_purity_no_header.txt

    echo ${cols[@]}

    ## generate commands
    for i in Pa{25..37}
    do
        echo PyClone run_analysis_pipeline --in_file ${i}T1_pyclone.tsv ${i}T2_pyclone.tsv --tumour_contents ${cols[${i}T1]} ${cols[${i}T2]} --samples ${i}T1 ${i}T2 --density pyclone_binomial --working_dir ${i}T_pyclone_analysis --min_cluster_size 2 --seed 123 --num_iters 50000 > ${i}_pyclone_commands.txt
    done

    chmod u+x

You get:

    cat *commands.txt
    PyClone run_analysis_pipeline --in_file Pa25T1_pyclone.tsv Pa25T2_pyclone.tsv --tumour_contents 0.69 0.26 --samples Pa25T1 Pa25T2 --density pyclone_binomial --working_dir Pa25T_pyclone_analysis --min_cluster_size 2 --seed 123 --num_iters 50000
    PyClone run_analysis_pipeline --in_file Pa26T1_pyclone.tsv Pa26T2_pyclone.tsv --tumour_contents 0.49 0.37 --samples Pa26T1 Pa26T2 --density pyclone_binomial --working_dir Pa26T_pyclone_analysis --min_cluster_size 2 --seed 123 --num_iters 50000
    PyClone run_analysis_pipeline --in_file Pa27T1_pyclone.tsv Pa27T2_pyclone.tsv --tumour_contents 0.9 0.92 --samples Pa27T1 Pa27T2 --density pyclone_binomial --working_dir Pa27T_pyclone_analysis --min_cluster_size 2 --seed 123 --num_iters 50000
    ....

Then: use the makemsub ...
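The excerpt looks up purities in a bash associative array while running under `set -u`, so a patient with no recorded purity would abort the loop. A minimal sketch of a guarded lookup might look like the following; the names `purity_of`, `load_purity`, `lookup_purity` and `pair_purities` are hypothetical and not taken from the original post, and the input is assumed to be whitespace-separated "purity sample" pairs as in the `head -6` output.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical global map: sample name -> tumor purity (needs bash >= 4).
declare -A purity_of

# Read "purity sample" pairs from the file given as $1.
load_purity() {
    local purity sample
    while read -r purity sample; do
        if [[ -n "$sample" ]]; then
            purity_of["$sample"]=$purity
        fi
    done < "$1"
}

# Print the purity for one sample, or return 1 if none was recorded.
lookup_purity() {
    local sample=$1
    if [[ -z "${purity_of[$sample]+x}" ]]; then
        echo "no purity recorded for $sample" >&2
        return 1
    fi
    printf '%s\n' "${purity_of[$sample]}"
}

# For each patient ID, print "patient purity_T1 purity_T2",
# skipping patients that lack either sample.
pair_purities() {
    local patient t1 t2
    for patient in "$@"; do
        if t1=$(lookup_purity "${patient}T1" 2>/dev/null) &&
           t2=$(lookup_purity "${patient}T2" 2>/dev/null); then
            echo "$patient $t1 $t2"
        else
            echo "skipping $patient: missing purity" >&2
        fi
    done
}
```

After `load_purity all_purity_no_header.txt`, a call such as `pair_purities Pa{25..37}` would list only the patients with both purities, and each line could then feed the command template.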

Friday SNPpets

written 1 day ago by The OpenHelix Blog

This week’s SNPpets include 2 weeks of tidbits. I was at the AAAS meeting last week. It’s not the best meeting for software–but I find myself currently seeking the thoughts of wise people in science communication and science policy on the state of play in the US. I went to so many CRISPR talks, Fake News talks, […]

Request for Compute Infrastructure to Support the Data Intensive Biology Summer Institute for Sequence Analysis at UC Davis

written 6 days ago by Living in an Ivory Basement by Titus Brown

Note: we were just awarded this allocation on Jetstream for DIBSI. Huzzah!

Abstract: Large datasets have become routine in biology. However, performing a computational analysis of a large dataset can be overwhelming, especially for novices. From June 18 to July 21, 2017 (30 days), the Lab for Data Intensive Biology will be running several different computational training events at the University of California, Davis for 100 people and 25 instructors. In addition, there will be a week-long instructor training in how to reuse our materials, and focused workshops, such as: GWAS for veterinary animals, shotgun environmental -omics, binder, non-model RNAseq, introduction to Python, and lesson development for undergraduates. The materials for the workshop were previously developed and tested by approximately 200 students on Amazon Web Services cloud compute services at Michigan State University's Kellogg Biological Station from 2010 to 2016, with support from the USDA and NIH. Materials are and will continue to be CC-BY, with scripts and associated code under BSD; the material will be adapted for Jetstream cloud usage and made available for future use.

Keywords: Sequencing, Bioinformatics, Training
Principal investigator: C. Titus Brown
Field of science: Genomics

Resource Justification: We are requesting 100 m.medium instances with 6 cores, 16 GB RAM, and 130 GB VM space each for each instructor and student for 4 weeks. The total request is for 432,000 service units (6 cores * 24 hrs/day * 30 days * 100 people). To accommodate large size data files, an additional 100 GB of storage volumes ...
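The service-unit total in the resource justification is plain multiplication, and a small sketch makes it easy to try other class sizes or durations. It assumes that one service unit corresponds to one core-hour, which is what the excerpt's own arithmetic implies; the function name `service_units` is hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# service_units CORES HOURS_PER_DAY DAYS PEOPLE
# Assumption: one service unit equals one core-hour.
service_units() {
    echo $(( $1 * $2 * $3 * $4 ))
}

# The figures quoted in the request: 6 cores, 24 hrs/day, 30 days, 100 people.
service_units 6 24 30 100   # prints 432000
```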

#AGBT17 Tweet Archive is Up!

written 7 days ago by Omics! Omics! by Keith Robinson

I've used my scheme for collecting and organizing tweets to capture most of the feed from this week's AGBT17 conference. I still need to pore over these in detail, so I won't try to distill out many thoughts (other than that single-cell sequencing is clearly in an exponential growth phase!). Read more »

Free online course – mark II

written 8 days ago by Ensembl Blog

Following the success of last year’s course, we’re pleased to announce a second Free Ensembl Webinar Course. This course allows you to learn about Ensembl for free from the comfort of your own office (or bed, no-one’s judging you), with … Continue reading Free online course – mark II →

Twitter Coverage of the Lorne Genome Conference 2017

written 9 days ago by What You're Doing Is Rather Desperate by Neil Saunders

Things to know about Lorne in the state of Victoria, Australia: It’s situated on the Great Ocean Road, a major visitor attraction and a great way to see the scenic coastline of the region. It’s home to a number of life science conferences, including Lorne Genome 2017. This week’s project then: use R to analyse … Continue reading Twitter Coverage of the Lorne Genome Conference 2017

My thoughts for "Imagining Tomorrow's University"

written 11 days ago by Living in an Ivory Basement by Titus Brown

So I've been invited to Imagining Tomorrow's University, and they have this series of questions they'd like me to answer. (Note that you can follow the conversation at #TomorrowsUni on Twitter.) Conveniently I already answered many of these questions in my "What is Open Science?" blog post. I've copy/pasted from that for the first two answers.

Q: What is your two sentence definition of open science (or open research)?
A: Open science is the philosophical perspective that sharing is good and that barriers to sharing should be lowered as much as possible. The practice of open science is concerned with the details of how to lower or erase the technical, social, and cultural barriers to sharing.

Q: Why is open science important for transforming research and learning?
A: The potential value of open science should be immediately obvious: easier and faster access to ideas, methods, and data should drive science forward faster! But open science can also aid with reproducibility and replication, decrease the effects of economic inequality in the sciences by liberating ideas from subscription paywalls, and provide reusable materials for teaching and training.

Q: How can open science increase the societal impact of university research?
A: I have two answers. First, if open science accelerates research progress, then that increases the societal impact intrinsically. Second, serendipity will strike. Most of my "wins" from open science have been unexpected - people using our research products in ways I never could have predicted or intended. This is really only possible ...

Bagging Novel Enzymes Via Mass Spec Metabolomics

written 12 days ago by Omics! Omics! by Keith Robinson

Obtaining a complete genome sequence for a bacterium or archaeon is essentially a solved problem, if you can culture the bug. Grow up biomass, purify the DNA and then use PacBio alone or a combination of long reads (PacBio or Oxford Nanopore) and short reads. These should yield a closed genome with a very low error rate. A few bugs spit at you by repeatedly failing PacBio sequencing or having some monster prophage or other repeat that is longer than the read lengths, but these are very rare. With advances in metagenomics techniques, the solving of uncultured genomes is becoming increasingly easy, and many of these remarks also apply to fungi and other eukaryotic microorganisms. Once you have the sequence, the lack of introns in bacteria and archaea makes gene prediction almost trivial, and you now have a parts list for the organism. But is that a useful parts list? A new paper in Nature Methods makes some progress in improving the utility of those parts lists, though we are still far from actually fully understanding an organism given its genome. Read more »

My aperiodic rhombic bathroom

written 14 days ago by Bits of DNA by Lior Pachter

In 1997 physicist Roger Penrose sued Kimberly-Clark Corporation for infringing on his “Penrose patent” with their Kleenex-Quilted toilet paper. He won the lawsuit but fortunately for lavaphiles the patent has expired leaving much room for aperiodic creativity in the bathroom. Math is involved in many aspects of house design (two years ago I wrote about how math is even […]

Friday SNPpets

written 15 days ago by The OpenHelix Blog

This week’s SNPpets include the highly topical human migration issues–those that weren’t prohibited by the White House, at least. Speaking of charged issues–the NAS releases the CRISPR-human editing report next week. Also this week–human food crop resources and papers. Quinoa! But one that’s truly key for science is the coffee genome, of course. I’ll bet a […]

On the passing of Hans Rosling

written 16 days ago by What You're Doing Is Rather Desperate by Neil Saunders

It would be remiss not to mention briefly the passing of Hans Rosling. Data needs storytellers and the world needs advocates for evidence-based decision making. We have lost one of the best. For some insights into the man and his interesting (and at times challenging) life, I highly recommend this news feature. You can enjoy … Continue reading On the passing of Hans Rosling

offline plotly Gantt plots using Python/pandas

written 18 days ago by Kevin's GATTACA World

modified from to do offline and outside of ipython

Fired up! Blogging resumes soon.

written 18 days ago by The OpenHelix Blog

If you were a regular reader of the blog, you may have noticed the post last summer about my blogging sabbatical. I had volunteered to be the “solar coach” for my community, helping my neighbors to learn more about the possibility of having solar power on their homes. It was a fantastic experience. I did […]

Instructor training for teaching computational workshops

written 18 days ago by Living in an Ivory Basement by Titus Brown

As part of our Summer Institute in Data Intensive Biology, we will be running a week-long instructor training from June 18 to June 25 at the University of California, Davis. The instructor training will include the following:

Hands-on training in good pedagogical practice!
Become a certified Software/Data Carpentry instructor!
Learn to repurpose and remix online training materials for your own needs!

This workshop is intended for people interested in teaching, reusing and repurposing the Software Carpentry, Data Carpentry, or Analyzing Next-Generation Sequencing Data materials. We envision this course being most useful to current teaching-intensive faculty, future teachers and trainers, and core facilities that are developing training materials. The workshop fee will be $350 for the week. Applications will close March 17th. Please see for more information, and contact if you have questions or suggestions.

--titus

Hyetographs, hydrographs and highcharter

written 19 days ago by What You're Doing Is Rather Desperate by Neil Saunders

Dual y-axes: yes or no? What about if one of them is also reversed, i.e. values increase from the top of the chart to the bottom? Judging by this StackOverflow question, hydrologists are fond of both of these things. It asks whether ggplot2 can be used to generate a “rainfall hyetograph and streamflow hydrograph”, which … Continue reading Hyetographs, hydrographs and highcharter

2017 - A two-week summer workshop on high-throughput sequencing data analysis!

written 19 days ago by Living in an Ivory Basement by Titus Brown

I am pleased to announce that we will be running a two-week summer workshop on analyzing high-throughput sequencing data! This workshop will run from June 26-July 8th, 2017, and it is a continuation of the two-week NGS workshop run at Michigan State University since 2010. (You can read about the 2015 workshop here.) With our move to UC Davis, this year we will be able to take 3-4x as many applicants as in previous years!

ANGUS: Analyzing High Throughput Sequencing Data
June 26-July 8, 2017
University of California, Davis

Zero-entry - no experience required or expected!
Hands-on training in using the UNIX command line to analyze your sequencing data.
Friendly, helpful instructors and TAs!
Summer sequencing camp - meet and talk science with great people!
Now in its eighth year!

The workshop fee will be $500 for the two weeks, and on-campus room and board is available for $500/week. Applications will close March 17th. International and industry applicants are more than welcome! Please see for more information, and contact if you have questions or suggestions.

--titus

Nice graphic? Are they taking the p…

written 21 days ago by What You're Doing Is Rather Desperate by Neil Saunders

Yes, it started with a tweet: Nice graphic on urine components via — Metabolomics (@metabolomics) January 31, 2017 By what measure is this a “nice graphic”? First, the JPEG itself is low-quality. Second, it contains spelling and numerical errors (more on that later). And third…do I have to spell this out…those are 3D … Continue reading Nice graphic? Are they taking the p…

Could Hermione Tackle MinION Yield Variability?

written 22 days ago by Omics! Omics! by Keith Robinson

A bit of a foray into Oxford Nanopore land again. By replacing a bench bumbler with someone competent, we've seen some success with our MinION at Starbase. Highly variable yields though. I've done some looking and discovered this isn't a unique experience. And now Oxford is suggesting that software upgrades alone will give MinION about another 50% boost in yield; it will be interesting to see what this does for variability. Finally, I have a notion of some of the sources of variability and an idea for a troubleshooting tool. Read more »

What’s coming in Ensembl release 88

written 22 days ago by Ensembl Blog

Ensembl 88 is scheduled for release in March 2017. Read on for more information. Highlights include: Updated assemblies, gene sets and annotations: Human: the human gene set will be updated to … Continue reading What’s coming in Ensembl release 88 →

Illumina Drops NeoPrep

written 23 days ago by Omics! Omics! by Keith Robinson

At the 2015 AGBT meeting, Illumina launched the NeoPrep, a ~$40K instrument to automate the preparation of up to 16 sequencing libraries at a time, using a technology called electrowetting microfluidics. Now news comes that Illumina is dropping the NeoPrep, halting sales immediately and allowing existing users about a year of reagents. What happened and how does it impact genomics? Read more »

The real meaning of spurious correlations

written 23 days ago by What You're Doing Is Rather Desperate by Neil Saunders

Like many data nerds, I’m a big fan of Tyler Vigen’s Spurious Correlations, a humorous illustration of the old adage “correlation does not equal causation”. Technically, I suppose it should be called “spurious interpretations” since the correlations themselves are quite real, but then good marketing is everything. There is, however, a more formal definition of … Continue reading The real meaning of spurious correlations

Taking steps (in XML)

written 24 days ago by What You're Doing Is Rather Desperate by Neil Saunders

So the votes are in: Your established blog is mostly about your work. Your work changes. Do you continue at the current blog or start a new one? — Neil Saunders (@neilfws) January 23, 2017 I thank you, kind readers. So here’s the plan: (1) keep blogging here as frequently as possible (perhaps monthly), (2) … Continue reading Taking steps (in XML)

Staying Current in Bioinformatics & Genomics: 2017 Edition

written 24 days ago by Getting Genetics Done by Stephen Turner

A while back I wrote this post about how I stay current in bioinformatics & genomics. That was nearly five years ago. A lot has changed since then. A few links are dead. Some of the blogs or Twitter accounts I mentioned have shifted focus or haven’t been updated in years (guilty as charged). The way we consume media has evolved: Google thought they could kill off RSS (long live RSS!), there are many new literature alert services, preprints have really taken off in this field, and many more scientists are engaging via social media than before.

People still frequently ask me how I stay current and keep a finger on the pulse of the field. I’m not claiming to be able to do this well; that’s a near-impossible task for anyone. Five years later and I still run our bioinformatics core, and I’m still mostly focused on applied methodology and study design rather than any particular phenotype, model system, disease, or specific method. It helps me to know that transcript-level estimates improve gene-level inferences from RNA-seq data, and that there’s software to help me do this, but the details underlying kmer shredding vs pseudoalignment to a transcriptome de Bruijn graph aren’t as important to me as knowing that there’s a software implementation that’s well documented, actively supported, and performs well in fair benchmarks. As such, most of what I pay attention to is applied/methods-focused.

What follows is a scattershot, noncomprehensive guide to the people, blogs, news outlets, journals, and ...

How to enable the scroll mode for tmux

written 24 days ago by Diving into Genetics and Genomics

tmux config file

You can copy to your home. Changed the key binding from control + b to control + a, if you are familiar with the screen shortcuts.

control + a + c will create a new window.
control + a + Space will move to previous window.
control + a + n will move to next window.
control + a + ? will show you all the shortcuts.

scroll mode

One problem with screen or tmux is that you have to press control + a + [ to enter the copy mode, and control + a + ] to paste it. I want to just use the mouse to scroll up and down and copy/paste. this long thread github issue:

The solution that worked for me:

Download the Tmux Plugin Manager:

    git clone ~/.tmux/plugins/tpm

Put this at the bottom of .tmux.conf:

    # List of plugins
    set -g @plugin 'tmux-plugins/tpm'
    set -g @plugin 'tmux-plugins/tmux-sensible'
    # Other examples:
    # set -g @plugin 'github_username/plugin_name'
    # set -g @plugin ''
    # set -g @plugin ''
    # Initialize TMUX plugin manager (keep this line at the very bottom of tmux.conf)
    run '~/.tmux/plugins/tpm/tpm'

Now install tmux-better-mouse-mode your .tmux.conf.

To enable mouse-mode in tmux 2.1+, put the following line in your ~/.tmux.conf:

    set-option -g mouse on

Then add the following line to your .tmux.conf file:

    set -g @plugin 'nhdaly/tmux-better-mouse-mode'

Install it:

    # start a new session
    tmux
    # install plugin

`control + a + I` (capital) to install all the plugins.

Now if you scroll up with your mouse, you will enter into copy mode automatically, and when you scroll down to the end of the current screen, ...
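The config comment in the excerpt says the TPM `run` line must stay at the very bottom of the file, which is easy to break when appending plugin lines later. One way to check this, sketched here as an assumption rather than anything from the original post, is a small bash function (hypothetical name `tpm_run_is_last`) that looks at the last line that is neither blank nor a comment.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Succeed only if the last non-blank, non-comment line of the given
# tmux config is the TPM initialization line quoted in the post.
tpm_run_is_last() {
    local file=$1 last
    # Drop blank lines and comment lines, then keep the final survivor.
    last=$(grep -vE '^[[:space:]]*(#|$)' "$file" | tail -n 1 || true)
    [[ "$last" == "run '~/.tmux/plugins/tpm/tpm'" ]]
}
```

For example, `tpm_run_is_last ~/.tmux.conf || echo "move the tpm run line to the bottom"` would flag a config where a plugin line was appended after the `run` line.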

On The International Nature of American Biotech

written 25 days ago by Omics! Omics! by Keith Robinson

I'll spend two hours in project meetings tomorrow. Around the table will be a group of scientists who are all at the top of the game and among the best in the world at what they do. We will be trying to push forward new antibiotics to save lives. Yes, we are also trying to be rewarded monetarily with it, but we all share a mission to improve humanity by finding new drugs for important medical needs. Read more »

Planet Feeds

The OpenHelix Blog
A news portal with postings about genomics resources, genomics news and research, science and more.
360 posts, last updated 1 day ago
Diving into Genetics and Genomics
A wet lab biologist' bioinformatic notes. Mostly is about Linux, R, python, reproducible research, open science and NGS. I am into data science! I am working on glioblastoma (a terrible brain cancer) genomics at MD Anderson cancer center. Disclaimer: For posts that I copied from other places, credits go to the original authors.
31 posts, last updated 1 day ago
Living in an Ivory Basement by Titus Brown
bioinformatics education, metagenomics assembly, python programming
188 posts, last updated 6 days ago
Omics! Omics! by Keith Robinson
A computational biologist's personal views on new technologies & publications on genomics & proteomics and their impact on drug discovery
177 posts, last updated 7 days ago
Ensembl Blog
News about the Ensembl Project and its genome browser
104 posts, last updated 8 days ago
What You're Doing Is Rather Desperate by Neil Saunders
Notes from the life of a computational biologist
81 posts, last updated 9 days ago
Bits of DNA by Lior Pachter
Reviews and commentary on computational biology
74 posts, last updated 14 days ago
Kevin's GATTACA World
Weblog on Bioinformatics, Genome Science and Next Generation Sequencing
61 posts, last updated 18 days ago
Getting Genetics Done by Stephen Turner
Getting Things Done in Genetics & Bioinformatics Research
60 posts, last updated 24 days ago
MassGenomics by Dan Koboldt
Medical genomics in the post-genome era
58 posts, last updated 29 days ago
The Grand Locus
My name is Guillaume Filion. I am a scientist who loves biology and mathematics. As of late I also got into computers and the Internet. I intend my blog to be recreational, and not academic nor educational. I wish you will find some of the posts inspiring for your own reflection.
24 posts, last updated 29 days ago
Next Gen Seek
Making Sense of Next-Gen Sequencing Data
79 posts, last updated 4 weeks ago
YOKOFAKUN by Pierre Lindenbaum
virology, bioinformatics, genetics, science, java
65 posts, last updated 5 weeks ago
Opinionomics by Mick Watson
bioinformatics, genomes, biology etc. "I don't mean to sound angry and cynical, but I am, so that's how it comes across"
38 posts, last updated 6 weeks ago
Bioinformatician at large by Ewan Birney
Thoughts and opinions from the associate director of the EMBL-European Bioinformatics Institute
57 posts, last updated 9 weeks ago
thoughts about ...
My worklog on bioinformatics, science and research. Small tasks and cute tricks included :)
32 posts, last updated 9 weeks ago
This blog is written by Giovanni M. Dall’Olio, a research associate at the Cancer Evolutionary Genomics‘s group of Francesca Ciccarelli at the King’s College of London. My primary interests are in the system biology of cancer and in identifying new potential drug targets for this disease.
12 posts, last updated 11 weeks ago
Inside UniProt
News and commentary from the UniProt developers
26 posts, last updated 3 months ago
In between lines of code by Lex Nederbragt
Biology, sequencing, bioinformatics and more
29 posts, last updated 4 months ago
The Genome Factory
Bioinformatics tips, tricks, tools and commentary - all with a microbiological NGS bent. Authored by Dr Torsten Seemann from Melbourne, Australia.
29 posts, last updated 5 months ago
The Genome Factory
Bioinformatics tips, tricks, tools and commentary with a microbial genomics bent. Written by Torsten Seemann from Melbourne, Australia.
25 posts, last updated 5 months ago
Bioinformatics I/O
Tips && tricks from a cluster of bioinformaticians
10 posts, last updated 6 months ago
Bits of Bioinformatics by Páll Melsted
Assistant professor of computer science at University of Iceland.
8 posts, last updated 8 months ago
- Bioinformatics by Manoj Samanta
Frontier in Bioinformatics
185 posts, last updated 9 months ago
Blue Collar Bioinformatics by Brad Chapman
bioinformatics, biopython, genomic analysis
38 posts, last updated 10 months ago
opiniomics by Mick Watson
bioinformatics, genomes, biology etc. "I don't mean to sound angry and cynical, but I am, so that's how it comes across"
78 posts, last updated 13 months ago
Bergman Lab
21 posts, last updated 2.2 years ago
Genomes Unzipped
A group blog providing expert, independent commentary on the personal genomics industry.
34 posts, last updated 2.3 years ago
miRBase blog
miRBase news and views
14 posts, last updated 2.6 years ago
Bio and Geo Informatics by Brent Pedersen
Genomics Programming
25 posts, last updated 3.2 years ago
Jermdemo Raised to the Law by Jeremy Leipzig
Mostly bioinformatics, NGS, and cat litter box reviews
25 posts, last updated 3.4 years ago

