Today's cover story is of special interest to those of us scientists trained to look for answers encoded in the genome. Obi Griffith and Malachi Griffith were born at the same time to the same mother and from the very beginning were considered fraternal (dizygotic, non-identical) twins. They just appeared to be too different. It was only later, much later, that they found out the truth. In Obi's words:
"It is true that we only discovered we were identical twins about three years ago while I was a postdoc at Lawrence Berkeley Lab and Malachi was a postdoc at The Genome Institute. We both signed up for 23andMe hoping to discuss differences and then found out there weren't any! We thought there must be some mistake and ordered an independent zygosity test to confirm. Ironically we did a project on the rate of zygosity misdiagnosis in undergrad but still never guessed that we were identical (monozygotic). We had always been told we were fraternal (dizygotic) and that we are "so different". As a twin, when you say you are fraternal, people immediately start looking for the differences. If the hospital had gotten it correct we probably would have spent our whole lives being told "you are so identical". haha. Really interesting to think what psychological effects that would have had"
Both Obi and Malachi have picked the same career path, and today both work at the Genome Institute at Washington University School of Medicine, one of the powerhouses for genomic research in the United States. They pursue active research on the application of bioinformatics to clinical genomics and (besides their many other accomplishments) are the creators of DGIdb, the Drug-Gene Interaction database (Nature Methods, 2013).
Both are early adopters of and active contributors to Biostars under the user names Obi Griffith and Malachi Griffith, and they have authored some of the most viewed content on the site.
Obi and Malachi Griffith of DGIdb
What hardware do you use?
Obi: I work on a MacBook, plus an Android Nexus phone and an Android Samsung tablet while mobile. At home I use an iMac for web development, have Dell Ubuntu boxes for web serving and testing, and a Kindle for reading. I even have a Windows 7 box for legacy projects, consulting (and Skyrim - shhh). At work I use a Mac Pro and a Dell Ubuntu box. Code development is done equally on Mac/Ubuntu. Most analysis tasks are run on a large cluster. I don't like to commit to any one hardware or OS provider.
Malachi: Mac Laptop Pro, MacPro Desktop Server, a model Dell Ubuntu linux workstation, a high performance cluster of Ubuntu blades for all heavy computing (~4000 cores).
What is your text editor?
Obi: Vi/Vim - mainly because just about anywhere you log into, there it is.
Malachi: vi (vim)
What software do you use for your work?
Obi: Unfortunately a lot of my time is now spent reading email in Thunderbird/Gmail or writing papers/grants in Word or Google Docs. That makes me sound really old. But when I find time to write code, it is primarily in R/Bioconductor for analysis, Perl for pipeline development as part of our infrastructure (the Genome Modeling System) and some Ruby/Rails for web tool development. Analysis involves running many tools (mostly Linux command line) such as aligners (BWA, Bowtie/TopHat), variant callers (Strelka, VarScan, SomaticSniper, MuTect, Pindel, etc.), expression analysis (Cufflinks, HTSeq/edgeR), fusion detection (ChimeraScan, Integrate), sequence utilities (bedtools, samtools, Picard), browsers (IGV, UCSC, Ensembl), R/Bioconductor packages (survival, randomForest, biomaRt, etc.) and more.
Malachi: Genome Modeling System, BWA, BWA-MEM, Picard, GATK, bedtools, sambamba, samtools, joinx, bam-readcount, SomaticSniper, VarScan2, Strelka, MuTect, Pindel, BreakDancer, lumpy-sv, cn.mops.
What do you use to create plots and charts?
Obi: Almost exclusively R/Bioconductor packages and a lot of ggplot.
Malachi: R basic plotting, R ggplot2
What do you consider the best language to do bioinformatics with?
Obi: This really depends on what you are trying to do. For analysis and applied bioinformatics it's whatever language has the best existing libraries for your research question or data type. For me that happens to be Perl/R. I don't think they are amazingly designed or documented. But I think programming language snobbery is silly. Whatever gets the job done.
Malachi: Whatever helps you accomplish your goals. The most suitable language will arguably vary according to the task at hand, though pursuing this argument is a low-value activity. Our organization uses C/C++ to develop or modify certain tools, Java for others. We use Ruby and Rails for web applications. Some analysts use Python, Bash, Awk, Perl, etc. The 'glue' that holds our pipelines together is Perl. A lot of scripting is also done with Perl. If we were starting over today, we might choose Python for this purpose.
What bioinformatics tools/software do not get enough recognition?
Obi: So many. I want to say ... all of them. To give just a couple of examples:
- BioGPS is an incredibly useful web resource for exploring a gene or gene list and is not as widely recognized as it should be. It came from the same lab that also spearheaded the stubbing in of genes into Wikipedia. Do we ever take that for granted!
- Where would we be without samtools and bedtools for basic SAM/BED operations?
- So many R/Bioconductor packages.
- The Tuxedo suite makes RNA-seq analysis accessible to even relatively inexpert analysts and sets a reliability/development standard that few other bioinformatics tools can match.
- Finally, BWA is so widely used and the foundation of so many genomic studies. And, yet its latest iteration (BWA-MEM) was famously rejected by the journal Bioinformatics. This should have been a Nature Methods paper. The most derivative tripe on the wet lab side sometimes gets published in Science, Nature and Cell. And, yet a solidly written, supported, documented, and well-performing piece of software that is provided free and open-source and practically forms the cornerstone of genomics has a hard time finding a home. I don't want to say it is as important as PCR or anything. But come on! We need to start recognizing these efforts at a level appropriate to their importance for the community. It drives me crazy.
Malachi: joinx, strelka
See all posts in this series: https://www.biostars.org/t/uses-this/
To be notified of new posts in the series, follow the first post: Jim Robinson of the Integrative Genomics Viewer (IGV) uses this