Question

What Kind Of Bioinformatics Tutorials Would You Like To See Online?

16

13.5 years ago

User 59 13k

This is a two-part question, so bear with me!

I work on Knowledgeblog, which is a lightweight publication system for scientific code, data, and results based around WordPress and extended by an ecosystem of off-the-shelf and custom plugins.

We're currently putting together a 'writeathon' to provide some bioinformatics tutorial material on a Knowledgeblog. What topics do people think would be good to cover?

We're looking for tutorials that might be good for all levels - computer scientists interested in learning some biology, biologists getting interested in bioinformatics, and of course tutorials aimed at bioinformaticians by bioinformaticians.

The second part of the question is more of a call to arms. We have a travel budget, and would be happy to spend some of this encouraging people to come to Newcastle for a day (Tuesday 21st June) to write away with us. Obviously this is more likely to occur if you're in the UK, but close international travel could also be supported in a limited number of cases.

All tutorials will be given a citable DOI, and no promises, but we will go for PubMed inclusion if we get enough content. You could also contribute remotely on the day, should travel be impossible but you still want to get some content up!

Suggestions for tutorial topics under this question would be great; votes will allow us to work out what topics we cover and who we invite! If you're interested in joining us in Newcastle at the end of June, then please drop me an email directly (d.c.swan@ncl.ac.uk).

For examples of existing Knowledgeblogs you can have a look at Ontogenesis and Taverna kblogs.

education • 9.2k views

ADD COMMENT • link updated 20 months ago by Ram 44k • written 13.5 years ago by User 59 13k

2

community wiki ?

ADD REPLY • link 13.5 years ago by Pierre Lindenbaum 164k

0

Will authors be able to edit the tutorials after the review?

ADD REPLY • link 13.5 years ago by Jan Kosinski ★ 1.6k

0

Good initiative and best of luck! To add to Jan's question: will authors be able to edit tutorials that they have not written themselves? This is vital IMHO.

ADD REPLY • link 13.5 years ago by Michael Schubert ★ 7.1k

0

Jan, very good question - the question of whether an article is canonical is important. The way we work this right now is that if new versions are edited, the old versions remain on the site, linked to at the bottom of the article.

ADD REPLY • link 13.5 years ago by User 59 13k

0

Michael, it doesn't work so much as a wiki. Articles can of course have multiple authors, but I don't think we envisage people changing other people's articles! The idea would be to have more of a post-publication review, in the comments or via trackbacks/pingbacks to other blog discussions, that the author could address at some point.

ADD REPLY • link 13.5 years ago by User 59 13k

0

Good luck with this, Daniel. Are all images and text under a Creative Commons (or similar) licence? It would be nice to be able to use material from the tutorials in both workshops and seminars without breaking copyright. On a related note, do you have a recommended image resolution for the wiki, or should the images link to a higher-resolution version? This would be ideal for their inclusion in other seminars.

ADD REPLY • link 13.5 years ago by Alastair Kerr 5.3k

0

Alastair, good point. I think we all feel an appropriate CC licence should be in place for this, but there is no decision on this yet. I guess the image resolution depends on how you author the tutorial. If the images are embedded in a Word document and then posted, I suspect they would remain at 'Word' resolution. If you were to edit the post in the WordPress interface, you would be able to exercise more control over the formatting. We would support both endeavours, but the idea of Knowledgeblog was to allow people to post articles to the system using whatever their current toolchain is.

ADD REPLY • link 13.5 years ago by User 59 13k

0

Regarding whether it should be a wiki: definitely not! I might want to publish a tutorial on solving problem X using tool Y; I don't want others editing it to use tool Z because the community believes tool Z is better. They should write their own tutorial on using tool Z.

ADD REPLY • link 13.5 years ago by Jan Kosinski ★ 1.6k

0

Jan, this is what we envisage as well. Wikis are great, but not for what we're trying to do :)

ADD REPLY • link 13.5 years ago by User 59 13k

score 13 · Answer 1 · 2011-04-19

13

13.5 years ago

Khader Shameer 18k

Excellent effort, Daniel! Best wishes in advance.

I would start with a section on statistics, followed by the in-depth tutorials. The statistical concepts would then serve as reference material for the various tutorial sections.

I think it will be interesting to see the tutorials organized by biological data / experiments.

For example:

Genome sequence:

Sequence similarity search
NGS/WES (QC, alignment, variant calling, annotation)
Phylogeny

Gene expression:

Mining public data resources for expression data pertaining to specific cellular events
Analysis of gene expression data using Bioconductor packages

GWAS:

Background on Statistical Genetics
PLINK
dbGaP
Visualization tools

Protein sequence:

Homology
Domain/Motif assignment
Analysis of unassigned regions
Sequence classification (family, super family, fold level)

Protein Structure:

Modeling
Structure analysis (Hydrogen bond, solvent accessibility, disulphide bonds, higher order interactions)
Structure classification
Quality assessment of protein structures

Protein-protein interaction:

Databases
Visualization of PPI (Cytoscape, BioLayout Express 3D etc)
Reasoning over the data

Others:

Machine learning (discuss various aspects of soft computing algorithms using published datasets)
Data integration and Data mining topics

ADD COMMENT • link 13.5 years ago by Khader Shameer 18k

2

Looks like a fabulous beginning for an advanced course in bioinformatics!

ADD REPLY • link 13.5 years ago by Larry_Parnell 16k

0

Thanks, Larry. Do you think we could really organize such a course that spans both genome and proteome? EMBO is doing a great job by providing grants for teaching; is there anything similar in the US?

ADD REPLY • link 13.5 years ago by Khader Shameer 18k

0

Thanks Khader, some good suggestions there and at least some areas we have some expertise in that we could leverage locally.

ADD REPLY • link 13.5 years ago by User 59 13k

0

Thanks, Daniel. Please let me know if I can contribute one or two tutorials. I will be happy to be a part of it!

ADD REPLY • link 13.5 years ago by Khader Shameer 18k

score 9 · Answer 2 · 2011-04-19

9

13.5 years ago

Dave Clements ▴ 610

A few approaches to consider:

- For software installation/configuration tutorials, I recommend the approach used in the GMOD Tutorials. These include starting virtual system images (these use VMware), sample data, and step-by-step instructions. Most of these came out of the annual GMOD courses and reflect exactly what was covered in the course. One drawback of having a starting system image is that those images get stale and need to be refreshed periodically (at GMOD this happens once a year). The instructors create these tutorials in this format for the course.
- For using software, short video tutorials work very well. The Galaxy Project puts out wildly popular quickies: video tutorials that highlight how to do specific tasks in Galaxy. These only require a few minutes from the user (but take a long time to make).
- Finally, I also like the OpenHelix approach. OpenHelix creates comprehensive hour-long video- and slide-based tutorials that include worked examples. These take an enormous amount of time to make, but excel at being thorough and clear.

ADD COMMENT • link 13.5 years ago by Dave Clements ▴ 610

1

I have a lot of respect for GMOD, but I feel that providing ready-to-use virtual instances leaves beginners helpless when they inevitably need to install dependencies and muck with their PATH to get something working. This is something I've seen first-hand.

ADD REPLY • link 13.5 years ago by Jeremy Leipzig 22k

0

OpenHelix is a great resource. It's just a shame not all of the tutorials are free :( The Galaxy webcasts are also excellent.

ADD REPLY • link 13.5 years ago by Pi ▴ 520

0

Dave, we've used VMs for tutorials before on our Master's course, so it's not an alien idea to us. I think more screencast-style tutorials are something we had not necessarily considered but perhaps should.

ADD REPLY • link 13.5 years ago by User 59 13k

0

At Ensembl we also have quite a few short video tutorials focusing on specific tasks in Ensembl and BioMart. These are made using Camtasia (http://en.wikipedia.org/wiki/Camtasia_Studio). They are made available through YouTube (http://www.ensembl.org/info/website/tutorials/index.html). They seem to be rather popular, but take quite a lot of time to make...

ADD REPLY • link 13.5 years ago by Bert Overduin ★ 3.7k

0

Jeremy, I agree that starting with ready-made virtual systems can leave users frustrated when they get outside the safety of that system. You can set "traps" in your teaching examples and then talk about things like checking logs, the screen command and so on, but that won't be comprehensive. I don't have a good idea on how to teach system debugging skills (in any depth) and bioinformatics tools in a short course.

ADD REPLY • link 13.5 years ago by Dave Clements ▴ 610

score 6 · Answer 3 · 2011-04-19

On a more advanced level I'd like to see:

- Multiple testing corrections 
- Getting started with medline text mining
- Building bioinformatics web apps backended by SQL
- Integrating multiple large data sets
- Bioinformatics projects: structure and lifecycle

[Edit] An additional one I thought of this morning was "databases in bioinformatics". In my experience, bioinformatics people use text files or SQL databases for data persistence and access, and not a lot else. A tutorial outlining the other options (Berkeley DB, key-value stores, Lucene, object serialization, object-oriented databases, etc.) with examples for each may give even experienced bioinformatics developers some new tools to work with.
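
To give a flavour of what such a tutorial might show, here is a minimal sketch in Python (standard library only) that stores the same toy records three ways: in a key-value store (`shelve`), by object serialization (`pickle`), and in a SQL database (`sqlite3`) for comparison. The gene names, lengths, and file names are made up for illustration; Berkeley DB, Lucene, and object-oriented databases would need third-party packages and are not shown.

```python
import os
import pickle
import shelve
import sqlite3
import tempfile

# Hypothetical toy records: gene name -> sequence length (made-up values).
records = {"geneA": 1200, "geneB": 845, "geneC": 2310}


def demo_persistence(workdir):
    """Store the records three ways and return what each store reads back."""
    results = {}

    # 1. Key-value store: shelve keeps pickled values in a dbm-style file.
    shelf_path = os.path.join(workdir, "genes_shelf")
    with shelve.open(shelf_path) as shelf:
        for name, length in records.items():
            shelf[name] = length
    with shelve.open(shelf_path) as shelf:
        results["shelve"] = dict(shelf)

    # 2. Object serialization: dump the whole dictionary in one go.
    pickle_path = os.path.join(workdir, "genes.pickle")
    with open(pickle_path, "wb") as handle:
        pickle.dump(records, handle)
    with open(pickle_path, "rb") as handle:
        results["pickle"] = pickle.load(handle)

    # 3. SQL database, the familiar baseline, for comparison.
    conn = sqlite3.connect(os.path.join(workdir, "genes.sqlite"))
    conn.execute("CREATE TABLE genes (name TEXT PRIMARY KEY, length INTEGER)")
    conn.executemany("INSERT INTO genes VALUES (?, ?)", list(records.items()))
    conn.commit()
    results["sqlite"] = dict(conn.execute("SELECT name, length FROM genes"))
    conn.close()

    return results


with tempfile.TemporaryDirectory() as workdir:
    readback = demo_persistence(workdir)
```

All three stores return the same dictionary here; the trade-offs (lookup by key only, loading everything at once, or querying with SQL) are what a fuller tutorial would need to discuss.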

score 5 · Answer 4 · 2011-04-19

5

13.5 years ago

Larry_Parnell 16k

My suggestion is not a topic but an approach. The tutorial certainly should be hands-on - there is no doubt about that - but it should go further and offer an interactive feature or critique/accolades from the tutorial leader or writer. A tutorial is about learning and bioinformatics is best taught in a more interactive style than by data dump/slide dump/read the notes on your own time.

ADD COMMENT • link 13.5 years ago by Larry_Parnell 16k

0

Agreed; my preferred approach is a standard data set and a progressive series of analyses applied to it, each building on the previous one.

ADD REPLY • link 13.5 years ago by Gareth Palidwor ★ 1.6k

0

Larry, you're right. I think there's a lot of scope for critique in something like this, which is often lacking from the format.

ADD REPLY • link 13.5 years ago by User 59 13k

score 4 · Answer 5 · 2011-04-19

4

13.5 years ago

Pierre Lindenbaum 164k

my wishes :-)

how to write a plugin for Taverna2
how to "something-bio" using "language-1" when your favorite language is "language-2"
the internals of NCBI BLAST
biostatistics for dummies
...

ADD COMMENT • link 13.5 years ago by Pierre Lindenbaum 164k

0

How to write a Taverna plugin is in the 2.x user manual, but I can't point you to a link as the Taverna web server is down for 2 days.

ADD REPLY • link 13.5 years ago by Pi ▴ 520

0

@pi, the documentation for T2 is, from my point of view, incomplete and unreadable.

ADD REPLY • link 13.5 years ago by Pierre Lindenbaum 164k

0

Love the cross-language idea :)

ADD REPLY • link 13.5 years ago by User 59 13k

0

We've already got a Knowledgeblog for Taverna (taverna.knowledgeblog.org). If anyone wants to write a "how to write a plugin" tutorial, this would be a good place to add it.

ADD REPLY • link 13.5 years ago by User 59 13k

0

We've already got a Knowledgeblog for Taverna (taverna.knowledgeblog.org). If anyone wants to write a "how to write a plugin" tutorial, this would be a good place to add it.

ADD REPLY • link 13.5 years ago by phillord • 0

0

There is a tutorial on writing plugins for Taverna 2 here.

ADD REPLY • link updated 5.1 years ago by Ram 44k • written 13.5 years ago by Alaninmcr • 0

0

@alaninmcr, thanks! This tutorial looks far more complete than the last time I saw it. (I removed my previous comment about it.)

ADD REPLY • link 13.5 years ago by Pierre Lindenbaum 164k

score 3 · Answer 6 · 2011-04-19

3

13.5 years ago

Gareth Palidwor ★ 1.6k

I prefer task-oriented tutorials that use a standard data set to demonstrate a set of standard analyses. I do a lot of bioinformatics consulting for scientists and grad students, and much of the work is just variations on the same tasks, for example:

Microarray data
- Quality analysis
- Normalization
- Annotation
- Fold change analysis
- Gene Ontology enrichment analysis
ChIP-seq
- Quality analysis
- Peak identification
- Peak annotation (association with genes)

Scripts in Perl and R are helpful, but I've found TM4 MeV to be particularly useful for non-programmers dealing with microarray data.
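
As a rough illustration of the fold change step in the list above, here is a minimal sketch in Python (rather than Perl or R). The probe names and intensities are made up, the values are assumed to be already normalized and on a linear scale, and the threshold of 1.0 (a two-fold change) is an arbitrary choice. A real analysis would also need replicate-aware statistics, for example the moderated tests in Bioconductor's limma package.

```python
import math
from statistics import mean

# Hypothetical, made-up normalized intensities (linear scale) per probe.
expression = {
    "probe_1": {"control": [100.0, 110.0, 90.0], "treated": [400.0, 390.0, 410.0]},
    "probe_2": {"control": [50.0, 55.0, 45.0], "treated": [50.0, 48.0, 52.0]},
    "probe_3": {"control": [800.0, 820.0, 780.0], "treated": [200.0, 190.0, 210.0]},
}


def log2_fold_change(control, treated):
    """Return log2 of the ratio of group means (treated over control)."""
    return math.log2(mean(treated) / mean(control))


def changed_probes(data, threshold=1.0):
    """Return probes whose absolute log2 fold change reaches the threshold."""
    hits = {}
    for probe, groups in data.items():
        lfc = log2_fold_change(groups["control"], groups["treated"])
        if abs(lfc) >= threshold:
            hits[probe] = lfc
    return hits


hits = changed_probes(expression)
for probe, lfc in sorted(hits.items()):
    print(f"{probe}\t{lfc:+.2f}")
```

With these toy values, probe_1 is flagged as up-regulated and probe_3 as down-regulated, while probe_2 is unchanged and is filtered out.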

I've worked on a few tutorials similar to what you describe; the Affymetrix one (http://www.stemcore.ca/projects/SCNcourse) is getting rather old (it doesn't handle the exon/gene chips), and the ChIP-seq one (http://regulome.ca/2010workshop) should be updated as well.

ADD COMMENT • link 13.5 years ago by Gareth Palidwor ★ 1.6k

0

My background is array data, so that's definitely along the lines of the kind of tutorials I was going to try and get written myself.

The ChIP-seq work would be interesting. I've done a bit of this recently, and the QA/PI stage would be of great interest.

ADD REPLY • link 13.5 years ago by User 59 13k

0

I always do a QA step first; not much point in proceeding with analysis of crappy data.

ADD REPLY • link 13.5 years ago by Gareth Palidwor ★ 1.6k

score 3 · Answer 7 · 2011-04-19

3

13.5 years ago

Genotepes ▴ 950

Most of the interesting things have already been listed and covered.

I would add:

Annotation tools for GWAS results - database and visualisation scripts

Coalescent models

Haplotype and imputation analyses

These are tutorials centered on problems to solve rather than focused on a language or a database.

Christian

ADD COMMENT • link 13.5 years ago by Genotepes ▴ 950

0

GWAS is an interesting topic, but one we'd need someone to come in and do! Suggestions? Volunteers? :)

ADD REPLY • link 13.5 years ago by User 59 13k

0

I could write one or two things, although there are researchers who are more experienced and are native English speakers (and more of them in the UK, and even in Newcastle - Heather Cordell, if I remember correctly).

But I can definitely write short notes on some issues around GWAS.

ADD REPLY • link 13.5 years ago by Genotepes ▴ 950

score 3 · Answer 8 · 2011-04-20

3

13.5 years ago

Hranjeev ★ 1.5k

As for the tutorials, I'd like to see lots of existing papers reverse-engineered with their sample datasets, so that we can walk through them step by step and know that we got them right. This is much like a set of questions, except that the answers are worked out for you. It would probably be much like a journal club, only online.

The tutorials could also have pointers/links to background reading material, such as fundamental facts, structured review articles, or something along those lines.

ADD COMMENT • link 13.5 years ago by Hranjeev ★ 1.5k

1

A great example of this is: Sémon, M., Lobry, J.R., Duret, L. (2006) No Evidence for Tissue-Specific Adaptation of Synonymous Codon Usage in Humans. Molecular Biology and Evolution, 23:523-529, which has online data sets with interactive web-based R (!) so you can reproduce their analysis completely (http://pbil.univ-lyon1.fr/datasets/SemonLobryDuret2005/).

Jean Lobry does a lot of this sort of thing; check the "online reproducibility" links: http://pbil.univ-lyon1.fr/members/lobry/

ADD REPLY • link updated 5.1 years ago by Ram 44k • written 13.5 years ago by Gareth Palidwor ★ 1.6k

0

HRanjeev, this is something we've been thinking of doing with Knowledgeblog anyway. The thinking at the moment is more of an 'enhanced paper', where data and code are embedded in the article and can be 'read' by R, so that the work can be recapitulated on the fly and what is published can be checked to be correct. Nice to see someone is in line with our thinking!
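
To make the idea concrete, here is a minimal sketch of that kind of check, written in Python rather than R and not tied to anything Knowledgeblog actually implements. The `article` structure, its field names, and the numbers are all hypothetical; the point is only that values reported in the text are recomputed from the embedded data and compared.

```python
from statistics import mean

# Hypothetical "enhanced paper": embedded data plus the values reported in
# the text. The reported difference is deliberately wrong for the demo.
article = {
    "data": {"group_a": [2.0, 4.0, 6.0], "group_b": [1.0, 2.0, 3.0]},
    "reported": {"mean_a": 4.0, "mean_b": 2.0, "difference": 2.5},
}


def recompute(data):
    """Re-run the (toy) analysis from the embedded data."""
    mean_a = mean(data["group_a"])
    mean_b = mean(data["group_b"])
    return {"mean_a": mean_a, "mean_b": mean_b, "difference": mean_a - mean_b}


def check_article(paper, tolerance=1e-9):
    """Return, per reported value, whether it matches the recomputed one."""
    fresh = recompute(paper["data"])
    return {
        name: abs(fresh[name] - value) <= tolerance
        for name, value in paper["reported"].items()
    }


report = check_article(article)
for name, ok in report.items():
    print(f"{name}: {'matches' if ok else 'DOES NOT MATCH'}")
```

Here the two means match, and the deliberately wrong difference is flagged.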

ADD REPLY • link 13.5 years ago by User 59 13k

0

Great thinking! Since we are in the internet age, I'm glad that someone is actually considering taking it beyond the traditional publishing mode. I'm excited to see how your concept flourishes. I'm also following the academia.edu site, but I don't see it as an interactive avenue just yet. Sometimes authors don't 'feel' any tangible credit for sharing their comments or even working in a public peer-review process. I hope this can be different with Knowledgeblog. Good luck!

ADD REPLY • link 13.5 years ago by Hranjeev ★ 1.5k

0

That was an excellent resource, gawp. It's something new to me, and Jean Lobry is really doing a good job there.

ADD REPLY • link 13.5 years ago by Hranjeev ★ 1.5k

score 2 · Answer 9 · 2011-04-19

I think Khader Shameer covered the spectrum fairly well. What I would personally like to see, though, is a primer on converting a command-line-based pipeline into Galaxy. I'm becoming a fan of it, but I'm having some issues with some of the advanced features and frankly can't find all the information I'd like about Galaxy's capabilities, such as whether the load balancing (Torque/PBS, I believe) is customizable, or whether it does such a good job that I wouldn't need to mark tasks as disk-, RAM-, or CPU-intensive.

I believe there's a fair-sized market for this, and that it would ultimately render people's workflows more accessible, not to mention strengthen the Galaxy framework as people add more tools and datatypes to it.

score 0 · Answer 10 · 2011-04-20

0

13.5 years ago

Nataly • 0

As a biologist doing genetics, you will make my day, Thx

ADD COMMENT • link 13.5 years ago by Nataly • 0

2

Nataly, try to use the "add comments" function under the Question for comments like these in the future...

ADD REPLY • link 13.5 years ago by Casey Bergman 18k