Question: What Kind Of Bioinformatics Tutorials Would You Like To See Online?
16
7.9 years ago by
Daniel Swan13k
Aberdeen, UK
Daniel Swan13k wrote:

This is a two-part question, so bear with me!

I work on Knowledgeblog, which is a lightweight publication system for scientific code, data, and results, based around WordPress and extended by an ecosystem of off-the-shelf and custom plugins.

We're currently putting together a 'writeathon' to provide some bioinformatics tutorial material on a Knowledgeblog. What topics do people think would be good to cover?

We're looking for tutorials that might be good for all levels - computer scientists interested in learning some biology, biologists getting interested in bioinformatics, and of course tutorials aimed at bioinformaticians by bioinformaticians.

The second part of the question is more of a call to arms. We have a travel budget, and would be happy to spend some of this encouraging people to come to Newcastle for a day (Tuesday 21st June) to write away with us. Obviously this is more likely to occur if you're in the UK, but close international travel could also be supported in a limited number of cases.

All tutorials will be given a citable DOI, and no promises, but we will go for PubMed inclusion if we get enough content. You could also contribute remotely on the day, should travel be impossible but you still want to get some content up!

Suggestions for tutorial topics under this question would be great; votes will allow us to work out what topics we cover and who we invite! If you're interested in joining us in Newcastle at the end of June then please drop me an email directly (d.c.swan@ncl.ac.uk).

For examples of existing Knowledgeblogs you can have a look at Ontogenesis and Taverna kblogs.

tutorial education • 4.5k views
ADD COMMENTlink modified 7.9 years ago by Gareth Palidwor1.6k • written 7.9 years ago by Daniel Swan13k
2

community wiki ?

ADD REPLYlink written 7.9 years ago by Pierre Lindenbaum118k

Will authors be able to edit the tutorials after the review?

ADD REPLYlink written 7.9 years ago by Jan Kosinski1.6k

Good initiative and best of luck! To add to Jan's question: will authors be able to edit tutorials that they have not written themselves? This is vital IMHO.

ADD REPLYlink written 7.9 years ago by Michael Schubert6.9k

Jan, very good question - the question of whether an article is canonical is important. The way we work this right now is that if new versions are edited, the old versions remain on the site, linked to at the bottom of the article.

ADD REPLYlink written 7.9 years ago by Daniel Swan13k

Michael, it doesn't work so much as a wiki. Articles can of course have multiple authors, but I don't think we envisage people changing other people's articles! The idea would be to have more of a post-publication review, in the comments or via trackbacks/pingbacks to other blog discussions, that the author could address at some point.

ADD REPLYlink written 7.9 years ago by Daniel Swan13k

Good luck with this, Daniel. Are all images and text under a Creative Commons (or similar) licence? It would be nice to be able to use material from the tutorials in both workshops and seminars without breaking copyright. On a related note, do you have a recommended image resolution for the wiki, or should the images link to a higher resolution version? This would be ideal for their inclusion in other seminars.

ADD REPLYlink written 7.9 years ago by Alastair Kerr5.2k

Alastair, good point. I think we all feel an appropriate CC licence should be in place for this, but there is no decision on this yet. I guess the image resolution depends on how you author the tutorial. If the images are embedded in a Word document and then posted, I suspect they would remain at 'Word' resolution. If you were to edit the post in the WordPress interface, you would be able to exercise more control over the formatting. We would support both endeavours, but the idea of Knowledgeblog was to allow people to post articles to the system using whatever their current toolchain is.

ADD REPLYlink written 7.9 years ago by Daniel Swan13k

Regarding whether it should be a wiki: definitely not! I might want to publish a tutorial on solving problem X using tool Y, and I don't want others editing it to use tool Z because the community believes tool Z is better. They should write their own tutorial on using tool Z.

ADD REPLYlink written 7.9 years ago by Jan Kosinski1.6k

Jan, this is what we envisage as well. Wikis are great, but not for what we're trying to do :)

ADD REPLYlink written 7.9 years ago by Daniel Swan13k
13
7.9 years ago by
Manhattan, NY
Khader Shameer18k wrote:

Excellent effort, Daniel! Best wishes in advance.

I would start with a section on statistics, followed by in-depth tutorials. The statistical concepts would then serve as reference material for the various tutorial sections.

I think it will be interesting to see the tutorials organized by biological data / experiments.

For example:

Genome sequence:

  • Sequence similarity search
  • NGS/WES (QC, alignment, variant calling, annotation)
  • Phylogeny

Gene expression:

  • Mining public data resources for expression data pertaining to specific cellular events
  • Analysis of gene expression data using BioConductor packages

GWAS:

  • Background on Statistical Genetics
  • PLINK
  • dbGaP
  • Visualization tools

Protein sequence:

  • Homology
  • Domain/Motif assignment
  • Analysis of unassigned regions
  • Sequence classification (family, super family, fold level)

Protein Structure:

  • Modeling
  • Structure analysis (Hydrogen bond, solvent accessibility, disulphide bonds, higher order interactions)
  • Structure classification
  • Quality assessment of protein structures

Protein-protein interaction:

  • Databases
  • Visualization of PPI (Cytoscape, BioLayout Express 3D etc)
  • Reasoning over the data

Others:

  • Machine learning (Discuss various aspect of soft computing algorithms using published datasets)
  • Data integration and Data mining topics
ADD COMMENTlink written 7.9 years ago by Khader Shameer18k
2

Looks like a fabulous beginning for an advanced course in bioinformatics!

ADD REPLYlink written 7.9 years ago by Larry_Parnell16k

Thanks Larry. Do you think we could really organize such a course that spans genome and proteome? EMBO is doing a great job by providing grants for teaching; is there anything similar in the US?

ADD REPLYlink written 7.9 years ago by Khader Shameer18k

Thanks Khader, some good suggestions there and at least some areas we have some expertise in that we could leverage locally.

ADD REPLYlink written 7.9 years ago by Daniel Swan13k

Thanks Daniel. Please let me know if I can contribute one or two tutorials. I will be happy to be a part of it!

ADD REPLYlink written 7.9 years ago by Khader Shameer18k
9
7.9 years ago by
Dave Clements610
Dave Clements610 wrote:

A few approaches to consider:

  1. For software installation/configuration tutorials, I recommend the approach used in the GMOD Tutorials. These include starting virtual system images (these use VMware), sample data, and step-by-step instructions. Most of these came out of the annual GMOD courses and reflect exactly what was covered in the course. One drawback of having a starting system image is that those images get stale and need to be refreshed periodically (at GMOD this happens once a year). The instructors create these tutorials in this format for the course.
  2. For using software, short video tutorials work very well. The Galaxy Project puts out wildly popular quickies, video tutorials that highlight how to do specific tasks in Galaxy. These only require a few minutes from the user (but take a long time to make).
  3. Finally, I also like the OpenHelix approach. OpenHelix creates comprehensive hour-long video- and slide-based tutorials that include worked examples. These take an enormous amount of time to make, but excel at being thorough and clear.
ADD COMMENTlink written 7.9 years ago by Dave Clements610
1

I have a lot of respect for GMOD, but I feel like providing ready-to-use virtual instances leaves beginners helpless when they inevitably need to install dependencies and muck with their PATH to get something working. This is something I've seen first hand.

ADD REPLYlink written 7.9 years ago by Jeremy Leipzig18k

OpenHelix is a great resource. It's just a shame not all of the tutorials are free :( The Galaxy webcasts are also excellent.

ADD REPLYlink written 7.9 years ago by Pi510

Dave, we've used VMs for tutorials before for our Master's course, so it's not an alien idea to us. I think the idea of more screencast-style tutorials is something we had not necessarily considered but perhaps should.

ADD REPLYlink written 7.9 years ago by Daniel Swan13k

At Ensembl we also have quite a few short video tutorials, focusing on specific tasks in Ensembl and BioMart. These are made using Camtasia (http://en.wikipedia.org/wiki/Camtasia_Studio) and are made available through YouTube (http://www.ensembl.org/info/website/tutorials/index.html). They seem to be rather popular, but take quite a lot of time to make...

ADD REPLYlink written 7.9 years ago by Bert Overduin3.6k

Jeremy, I agree that starting with ready-made virtual systems can leave users frustrated when they get outside the safety of that system. You can set "traps" in your teaching examples and then talk about things like checking logs, the screen command and so on, but that won't be comprehensive. I don't have a good idea on how to teach system debugging skills (in any depth) and bioinformatics tools in a short course.
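
One small building block for that kind of debugging lesson is a check for which tools are actually visible on the PATH before a pipeline is run. The following is only a minimal sketch in Python: the tool names in `required` are illustrative assumptions rather than anything prescribed in this thread, and `find_missing_tools` is a hypothetical helper name.

```python
import shutil


def find_missing_tools(tool_names):
    """Return the tools that cannot be found on the current PATH."""
    return [name for name in tool_names if shutil.which(name) is None]


# Illustrative list only; substitute whatever the tutorial actually needs.
required = ["blastn", "samtools", "Rscript"]

missing = find_missing_tools(required)
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All required tools were found.")
```

A check like this does not teach debugging in any depth, but it gives beginners a concrete first question to ask when a command fails outside a prepared virtual image.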

ADD REPLYlink written 7.9 years ago by Dave Clements610
6
7.9 years ago by
Gareth Palidwor1.6k
Ottawa
Gareth Palidwor1.6k wrote:

On a more advanced level I'd like to see:

- Multiple testing corrections
- Getting started with MEDLINE text mining
- Building bioinformatics web apps backed by SQL
- Integrating multiple large data sets
- Bioinformatics projects: structure and lifecycle
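
As an illustration of what a multiple testing corrections tutorial might open with, here is a minimal sketch of the Benjamini-Hochberg adjustment in plain Python. The p-values are made up for the example, and `benjamini_hochberg` is a hypothetical helper name; a real analysis would more likely use an established statistics package.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values, e.g. from four independent tests.
p_values = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(p_values)
print(adjusted)
```

With four tests, the smallest p-value (0.01) is multiplied by 4/1, and the largest is left unchanged because its rank equals the number of tests.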

[Edit] An additional one I thought of this morning was "databases in bioinformatics". In my experience, bioinformatics people use text files or SQL databases for data persistence and access, and not a lot else. A tutorial outlining the other options (Berkeley DB, key-value stores, Lucene, object serialization, object-oriented databases, etc.) with examples for each may give even experienced bioinformatics developers some new tools to work with.
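
To give a flavour of one of those options, here is a minimal sketch of a persistent key-value store using Python's standard `shelve` module, which combines a dbm-style on-disk store with object serialization. The sequence identifiers and annotation fields are made up for the example.

```python
import os
import shelve
import tempfile

# Made-up records: sequence identifiers mapped to small annotation dicts.
records = {
    "seq1": {"length": 1200, "organism": "E. coli"},
    "seq2": {"length": 845, "organism": "S. cerevisiae"},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    db_path = os.path.join(tmp_dir, "annotations")

    # Write: each value is pickled and stored on disk under its key.
    with shelve.open(db_path) as db:
        for seq_id, annotation in records.items():
            db[seq_id] = annotation

    # Read back in a separate session, looking up one key directly.
    with shelve.open(db_path) as db:
        seq2_length = db["seq2"]["length"]
        stored_ids = sorted(db.keys())

print(seq2_length, stored_ids)
```

The appeal over a flat text file is direct lookup by key without parsing the whole file; the trade-off is that the on-disk format is Python-specific.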

ADD COMMENTlink modified 7.9 years ago • written 7.9 years ago by Gareth Palidwor1.6k

I'm pretty sure we're going to hit Integration as a topic anyway, but that's a good list. I might get one of our stats lecturers in to cover MTC, as I think it's a topic only ever mentioned 'in passing' with datasets!

ADD REPLYlink written 7.9 years ago by Daniel Swan13k
5
7.9 years ago by
Larry_Parnell16k
Boston, MA USA
Larry_Parnell16k wrote:

My suggestion is not a topic but an approach. The tutorial certainly should be hands-on - there is no doubt about that - but it should go further and offer an interactive feature or critique/accolades from the tutorial leader or writer. A tutorial is about learning and bioinformatics is best taught in a more interactive style than by data dump/slide dump/read the notes on your own time.

ADD COMMENTlink written 7.9 years ago by Larry_Parnell16k

Agreed; my preferred approach is a standard data set and a progressive series of analyses applied to it, each building on the previous.

ADD REPLYlink written 7.9 years ago by Gareth Palidwor1.6k

Larry, you're right. I think there's a lot of scope for critique in something like this, which is often lacking from the format.

ADD REPLYlink written 7.9 years ago by Daniel Swan13k
4
7.9 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum118k wrote:

my wishes :-)

  • how to write a plugin for Taverna2
  • how to "something-bio" using "language-1" when your favorite language is "language-2"
  • the internals of NCBI blast
  • biostatistics for dummies
  • ...
ADD COMMENTlink written 7.9 years ago by Pierre Lindenbaum118k

How to write a Taverna plugin is in the 2x user manual, but I can't point you to a link as the Taverna web server is down for 2 days.

ADD REPLYlink written 7.9 years ago by Pi510

@pi, the documentation for T2 is, from my point of view, incomplete & unreadable.

ADD REPLYlink written 7.9 years ago by Pierre Lindenbaum118k

Love the cross-language idea :)

ADD REPLYlink written 7.9 years ago by Daniel Swan13k

We've already got a Knowledgeblog for Taverna (taverna.knowledgeblog.org). If anyone wants to write a "how to write a plugin" tutorial, this would be a good place to add it.

ADD REPLYlink written 7.9 years ago by Daniel Swan13k

We've already got a Knowledgeblog for Taverna (taverna.knowledgeblog.org). If anyone wants to write a "how to write a plugin" tutorial, this would be a good place to add it.

ADD REPLYlink written 7.9 years ago by phillord0

There is a tutorial on writing plugins for Taverna 2 at http://www.mygrid.org.uk/dev/wiki/display/developer/Creating+plugins+for+Taverna+2

ADD REPLYlink written 7.9 years ago by Alaninmcr0

@alaninmcr, thanks! This tutorial looks far more complete than the last time I saw it. (I removed my previous comment about it.)

ADD REPLYlink written 7.9 years ago by Pierre Lindenbaum118k
3
7.9 years ago by
Gareth Palidwor1.6k
Ottawa
Gareth Palidwor1.6k wrote:

I prefer task-oriented tutorials that use a standard data set to demonstrate a bunch of standard analyses. I do a lot of bioinformatics consulting for scientists and grad students, and much of the work is just variations on the same tasks, for example:

  • Microarray data

    • Quality analysis
    • Normalization
    • Annotation
    • Fold change analysis
    • Gene Ontology enrichment analysis
  • ChIP Seq

    • Quality analysis
    • Peak identification
    • Peak annotation (association with genes)

Scripts in Perl and R are helpful, but I've found TM4 MeV to be particularly useful for non-programmers dealing with microarray data.
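
To make the fold change step concrete, here is a minimal sketch in Python rather than Perl or R. The gene names and intensities are invented for the example, the two-fold cutoff is an arbitrary assumption, and a real microarray analysis would add normalization and a proper statistical test.

```python
import math
from statistics import mean

# Invented normalized intensities: gene -> (control replicates, treated replicates).
expression = {
    "geneA": ([100.0, 120.0, 110.0], [400.0, 480.0, 440.0]),
    "geneB": ([250.0, 240.0, 260.0], [245.0, 255.0, 250.0]),
    "geneC": ([800.0, 820.0, 780.0], [200.0, 210.0, 190.0]),
}


def log2_fold_change(control, treated):
    """log2 of the ratio of mean treated to mean control intensity."""
    return math.log2(mean(treated) / mean(control))


fold_changes = {
    gene: log2_fold_change(control, treated)
    for gene, (control, treated) in expression.items()
}

# Flag genes whose expression changed at least two-fold in either direction.
changed = sorted(gene for gene, lfc in fold_changes.items() if abs(lfc) >= 1.0)
print(changed)
```

Working on the log2 scale makes up- and down-regulation symmetric: a four-fold increase is +2 and a four-fold decrease is -2.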

I've worked on a few tutorials similar to what you describe; the Affymetrix one (http://www.stemcore.ca/projects/SCNcourse) is getting rather old (it doesn't handle the exon/gene chips), and the ChIP Seq one (http://regulome.ca/2010workshop) should be updated as well.

ADD COMMENTlink written 7.9 years ago by Gareth Palidwor1.6k

My background is array data, so that's definitely along the lines of the kind of tutorials I was going to try and get written myself.

The ChIP-Seq work would be interesting. I've done a bit of this recently, and the QA/PI stage would be of great interest.

ADD REPLYlink written 7.9 years ago by Daniel Swan13k

I always do a QA step first; not much point in proceeding with analysis of crappy data.

ADD REPLYlink written 7.9 years ago by Gareth Palidwor1.6k
3
7.9 years ago by
Genotepes940
Nantes (France)
Genotepes940 wrote:

Most of the interesting things have been listed and covered.

Would add:

Annotation tools for GWAS results - database and visualisation scripts

Coalescent models

Haplotype and imputation analyses

These are more tutorials centered on problems to solve than on a particular language or database.

Christian

ADD COMMENTlink written 7.9 years ago by Genotepes940

GWAS is an interesting topic, but one we'd need someone to come in and do! Suggestions? Volunteers? :)

ADD REPLYlink written 7.9 years ago by Daniel Swan13k

I could write one or two things, although there are researchers who are more experienced, who are native English speakers, and who are in the UK and even in Newcastle (Heather Cordell, if I remember).

But I can definitely write short notes on some issues around GWAS.

ADD REPLYlink written 7.9 years ago by Genotepes940
3
7.9 years ago by
Hranjeev1.5k
Malaysia
Hranjeev1.5k wrote:

As for the tutorials, I'd like to see lots of existing papers reverse engineered with their sample datasets, so that we can walk through them step by step and know that we got them right. This is much like a set of questions, except that the answers are worked out for you. It would probably be much like a journal club, only online.

The tutorials could also have pointers/links to background reading material, such as fundamental facts or structured review articles.

ADD COMMENTlink modified 7.9 years ago • written 7.9 years ago by Hranjeev1.5k
1

A great example of this is: Sémon, M., Lobry, J.R., Duret, L. (2006) No Evidence for Tissue-Specific Adaptation of Synonymous Codon Usage in Humans. Molecular Biology and Evolution, 23:523-529, which has online data sets with interactive web-based R (!) so you can reproduce their analysis completely (http://pbil.univ-lyon1.fr/datasets/SemonLobryDuret2005/).

Jean Lobry does a lot of this sort of thing, check the "online reproducibility" links: http://pbil.univ-lyon1.fr/members/lobry/

ADD REPLYlink written 7.9 years ago by Gareth Palidwor1.6k

Hranjeev, this is something we've been thinking of doing with Knowledgeblog anyway. The thinking at the moment is more of an 'enhanced paper' where data and code are embedded into the article and can be 'read' by R, so that the work can be recapitulated on the fly and checked that what is published is indeed correct. Nice to see someone is in line with our thinking!
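
The core of that check can be sketched in a few lines. This is only an illustration in Python rather than R, and it is not how Knowledgeblog itself works: the `article` bundle, its field names, and the tolerance are all hypothetical, standing in for data embedded alongside a reported result.

```python
from statistics import mean

# Hypothetical "enhanced paper" bundle: embedded data plus the value the text reports.
article = {
    "data": [4.1, 3.9, 4.3, 4.0, 4.2],
    "reported_mean": 4.1,
    "tolerance": 0.01,
}


def check_reported_mean(bundle):
    """Recompute the mean from the embedded data and compare it with the reported value."""
    recomputed = mean(bundle["data"])
    consistent = abs(recomputed - bundle["reported_mean"]) <= bundle["tolerance"]
    return consistent, recomputed


is_consistent, recomputed = check_reported_mean(article)
print(is_consistent, recomputed)
```

The same pattern extends to any statistic: rerun the embedded code on the embedded data and compare against what the article states.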

ADD REPLYlink written 7.9 years ago by Daniel Swan13k

Great thinking! Since we are in the internet age, I'm glad that someone is actually considering taking it beyond the traditional publishing mode. I'm excited to see how your concept flourishes. I'm also following the academia.edu site, but I don't see it as an interactive avenue just yet. Sometimes authors don't 'feel' any tangible credit for sharing their comments or even taking part in a public peer-review process. I hope this can be different with Knowledgeblog. Good luck!

ADD REPLYlink written 7.9 years ago by Hranjeev1.5k

That was an excellent resource gawp. Something new to me and Jean Lobry is really doing a good job there.

ADD REPLYlink written 7.9 years ago by Hranjeev1.5k
2
7.9 years ago by
Lythimus200
Lythimus200 wrote:

I think Khader Shameer covered the spectrum fairly well. What I would personally like to see, though, is a primer on converting a command-line pipeline into Galaxy. I'm becoming a fan of it, but I'm having issues with some of the advanced features and frankly can't find all the information I'd like about Galaxy's capabilities: for example, whether the load balancing (Torque/PBS, I believe) is customizable, or whether it does such a good job that I wouldn't need to mark tasks as disk-, RAM-, or CPU-intensive.

I believe there's a fair-sized market for this, and that it would ultimately make people's workflows more accessible, not to mention strengthen the Galaxy framework as people add more tools and datatypes to it.

ADD COMMENTlink written 7.9 years ago by Lythimus200

Great idea - I was going through the Galaxy docs for this last week actually, and you're right I think this would have very broad appeal.

ADD REPLYlink written 7.9 years ago by Daniel Swan13k
0
7.9 years ago by
Nataly0
Nataly0 wrote:

As a biologist doing genetics, you will make my day, Thx

ADD COMMENTlink written 7.9 years ago by Nataly0
2

Nataly, try to use the "add comments" function under the Question for comments like these in the future...

ADD REPLYlink written 7.9 years ago by Casey Bergman18k