Question: Why Should I Use Galaxy?
27
5.2 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum99k wrote:

Don't get me wrong, I'm strongly convinced that Galaxy is a powerful tool to manipulate data interactively.

But as a trained bioinformatician with deep knowledge of the Linux command line, do you consider it only a tool for the "biologist/end-user", or do you actually use Galaxy for your everyday data manipulations (and if so, why)?

We're about to install Galaxy here, and some colleagues believe that it will be used from 'A' to 'Z' for our NGS analyses. Why not? But why should I spend time in front of my web browser when I can write a shell script (and save it in git)?

Pierre

UPDATE 2017-04

galaxy workflow • 11k views
modified 5 months ago • written 5.2 years ago by Pierre Lindenbaum99k
4

People always ask me if I use Galaxy. I cannot see a reason why a bioinformatics practitioner would need Galaxy. I would love to know how I am wrong.

written 5.2 years ago by Zev.Kronenberg11k
2

I agree with Zev. We have a Galaxy installation, but I cannot really think of anyone who is actually using it. Last time I checked there was still no versioning support, which I think is quite important when creating pipelines/doing analysis. Maintaining your workflows within Galaxy is also tricky, whereas archiving scripts in a repository seems almost natural.

written 5.2 years ago by Joachim2.8k
4

I've been maintaining a local galaxy-dist installation for our lab, and while I agree with all the points about reproducible and sensible workflows, the one thing that really bothers me about Galaxy is that every input and output is a file. There is really no way to process data as a stream - e.g. piping the output of one process into the next part of the pipeline. For big data processing this creates unnecessary delays and large datasets that must be cleaned up periodically.
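
To make the contrast concrete, here is a minimal sketch of the streaming style described above, where one process writes straight into the next without an intermediate file. It is not a Galaxy feature; `run_streamed`, `PRODUCER` and `CONSUMER` are hypothetical names, and the two tiny Python one-liners only stand in for real tools (say, an aligner feeding a sorter).

```python
import subprocess
import sys


def run_streamed(producer_cmd, consumer_cmd):
    """Pipe the producer's stdout into the consumer's stdin, with no file in between."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        consumer_cmd, stdin=producer.stdout, stdout=subprocess.PIPE
    )
    # Close our copy of the pipe so the producer is notified if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output.decode()


# Hypothetical stand-ins for real command-line tools.
PRODUCER = [sys.executable, "-c", "for i in range(5): print(i)"]
CONSUMER = [sys.executable, "-c", "import sys; print(sum(int(x) for x in sys.stdin))"]

if __name__ == "__main__":
    print(run_streamed(PRODUCER, CONSUMER).strip())
```

With file-based steps, the producer's full output would first be written to disk and only then read by the consumer, which is where the extra delay and the leftover datasets come from.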

written 5.2 years ago by Matt Shirley7.9k
2

Reproducibility does not come without cost. Like many others, I also use Galaxy for teaching. Students benchmark different segments of Galaxy and learn how to design an efficient processing pipeline for large data from scratch. The course is called "Architecture of large bioinformatics systems", but maybe I should rename it to "Tradeoffs of reproducibility".

written 5.2 years ago by Pawel Szczesny3.2k

Hello Pierre, I want to detect lncRNAs from some RNA-seq data (FASTQ format). As you know, this takes several steps: aligning reads, providing a GTF file, annotation, and differential expression. How can Galaxy help me? Please give me some pointers, because I am new to this web tool.

Because I don't know anything about Linux, Galaxy is a good fit for me.

Thank you for your help.

modified 14 months ago • written 14 months ago by Edalat20
19
5.2 years ago by
Rochester, NY USA
Alex Paciorkowski3.3k wrote:

Our group has implemented Galaxy for two reasons:

1) We are hoping to improve the documentation of our workflows, so that it is clear who ran which script, which version of that script, and what the output was. This (we hope) should improve reproducibility, and we anticipate it will help address reviewer comments that bemoan the "data was analyzed using local scripts" sentence many of us may have written into our methods. This is probably the main reason a bioinformatics group might begin using Galaxy. Of course, there are other solutions to this; Galaxy is just one of them.

2) It allows non-computational investigators in the group to begin learning NGS data analysis without first taking 4-6 months to learn Linux at the command line. But not all groups will have this need; we happen to. So it will probably be group-dependent.

And a third reason, I have to add: the Galaxy development team is an excellent group of professionals to work with.

modified 5.2 years ago • written 5.2 years ago by Alex Paciorkowski3.3k

Good point on documentation.

written 5.2 years ago by Zev.Kronenberg11k
11
5.2 years ago by
Pawel Szczesny3.2k
Poland
Pawel Szczesny3.2k wrote:

Do you remember the "Works on my machine" certification program [1]? Galaxy provides an environment for reproducibility of a workflow. If I take your shell script, I need to make sure I have the same libraries, environment variables, paths, etc. But I can take your Galaxy workflow, assuming you have a reasonably standard Galaxy instance, and expect it to run on my data without thinking about compatibility.
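
To illustrate what a bare shell script leaves unrecorded, here is a minimal sketch that writes down the pieces mentioned above (interpreter, platform, tool paths, environment variables) next to a result. This is not how Galaxy does it; `snapshot_environment` is a hypothetical helper, and the tool and variable names are only examples.

```python
import json
import os
import platform
import shutil


def snapshot_environment(tools, env_vars):
    """Collect the details a script silently depends on, as a JSON-friendly dict."""
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        # Full path of each tool as resolved on PATH, or None if it is not installed.
        "tools": {name: shutil.which(name) for name in tools},
        # Value of each environment variable, or None if it is unset.
        "env": {name: os.environ.get(name) for name in env_vars},
    }


# Example tool and variable names; replace with whatever the script actually calls.
snapshot = snapshot_environment(
    tools=["samtools", "bwa"],
    env_vars=["PATH", "LD_LIBRARY_PATH"],
)

if __name__ == "__main__":
    print(json.dumps(snapshot, indent=2, sort_keys=True))
```

Saving such a snapshot alongside the output does not make a script portable, but it at least shows which differences to look for when it fails on another machine.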

What about an attempt to reproduce a workflow from a 10-year-old paper? Even if you have the source code, there is almost a 100% chance that it is not going to work. Galaxy is a substitute for a virtual machine.

[1] http://www.codinghorror.com/blog/2007/03/the-works-on-my-machine-certification-program.html

written 5.2 years ago by Pawel Szczesny3.2k
2

This is an honest question: What do you think the chances are that a Galaxy pipeline from today will work again in 10 years? Who knows what the computing (desktop/laptop) world will look like in 10 years?

written 4.0 years ago by Eric Normandeau9.6k
10
5.2 years ago by
Chris Evelo9.9k
Maastricht, The Netherlands
Chris Evelo9.9k wrote:

I think the main reason for a power user like yourself to use it would be that you could actually contribute to Galaxy itself. That way your power tools could become available to all these non-power users, and you could help to clean up that overfilled Tool Shed by removing things you see in there that are not as good as other things. (According to ISMB talks, there are literally thousands of tools in that shed.)

written 5.2 years ago by Chris Evelo9.9k
6
5.2 years ago by
Chris Miller18k
Washington University in St. Louis, MO
Chris Miller18k wrote:

Command-line savvy power users are not its target audience, and I don't see any reason why you and I should be using it. Just like most GUIs, it fills an important niche for non-computational types, but is limiting to a power user.

written 5.2 years ago by Chris Miller18k
6
5.2 years ago by
Manu Prestat3.8k
Marseille, France
Manu Prestat3.8k wrote:

In spite of my CLI knowledge, I use and appreciate Galaxy (the JGI implementation), mainly for the workflows. As a more general thought, I really do think that being a bash and command-line expert is a real advantage as a bioinformaticist; however, when something good happens in the GUI world, I won't throw it away. For instance, who still uses pine or lynx? ;-)

written 5.2 years ago by Manu Prestat3.8k

Lynx? Real hackers use curl ;-)

written 5.2 years ago by Pierre Lindenbaum99k
4

lynx predates curl by ~5 years :-)

written 5.2 years ago by asjo120

Lynx was my browser of choice (well there really wasn't a choice) for years

written 5.2 years ago by Mndoci1.2k
4
5.2 years ago by
Casey Bergman17k
Athens, GA, USA
Casey Bergman17k wrote:

For me the main reason to #usegalaxy is in terms of training students and dealing with collaborators. First, Galaxy allows you to teach bioinformatics separately from computing (e.g. UNIX, programming). Second, Galaxy allows you to reduce the time-consuming back-and-forth of communicating methods/results between you and students/collaborators. I estimate that upwards of 60% of a project can be simply communicating methods and results; sharing histories reduces this drastically. Third, by constraining what a user can do, it prevents a naive user from tearing down a machine/cluster, so you spend less time f-ing around on system administration with a newbie, which puts many of them off from being self-sufficient. Fourth, it establishes a best-practice environment for learning bioinformatics, e.g. Galaxy allows you to teach concepts of workflows easily, as well as code organization/testing/documentation when developing local tools. Finally, by having new users start with Galaxy and learn its limits for themselves, they quickly become motivated to take their training wheels off and learn how to ride for themselves. In sum, Galaxy is worth adopting so that biologists without computational training can help themselves, which gives you more time to focus on the real science, rather than being a bioinfortechnician. As I say to many people: "Use galaxy, it's out of this world"

written 5.2 years ago by Casey Bergman17k
1
4.0 years ago by
Alastair Kerr5.2k
The University of Edinburgh, UK
Alastair Kerr5.2k wrote:

Other points on this thread are key, but I'll add my experience as a core facility manager. Apart from the obvious advantage of giving inexperienced computer users a GUI environment, an insta-GUI for new command-line scripts and programs (one that automatically plays nicely with other programs) enables rapid prototyping. Users can thus play with options that let them import results easily into other programs and workflows, which may even be set up by the core beforehand.

The library feature is also useful: it allows data to be shared with selected groups.

modified 4.0 years ago • written 4.0 years ago by Alastair Kerr5.2k
0
4.0 years ago by
mikhail.shugay3.2k
Czech Republic, Brno, CEITEC
mikhail.shugay3.2k wrote:

For me, Galaxy is mainly used for some manual jobs, like intersecting my regions of interest with genome tracks from UCSC. In bioinformatics you really need to keep control of your data by manually looking into it from time to time; that's why GUI tools are useful. I do this during downstream functional analysis, and I believe it is the easiest way in most cases. However, I do not use it to pipe data from raw reads to filtering/mapping and so on (even though there is a nice workflow system implemented in Galaxy), as I feel running shell scripts is more stable.
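
For readers who have not done this kind of job by hand, here is a minimal sketch of what intersecting regions of interest with a genome track amounts to. It is not Galaxy's or UCSC's implementation; `intersect` and the sample coordinates are hypothetical, and intervals are assumed to be half-open (chrom, start, end) tuples as in BED files.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chrom, start, end), half-open as in BED


def intersect(regions: List[Interval], track: List[Interval]) -> List[Interval]:
    """Return the overlapping part of every region/track pair on the same chromosome."""
    hits = []
    for chrom_a, start_a, end_a in regions:
        for chrom_b, start_b, end_b in track:
            if chrom_a != chrom_b:
                continue
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            if start < end:  # empty or touching intervals do not count as overlap
                hits.append((chrom_a, start, end))
    return sorted(hits)


# Made-up coordinates, for illustration only.
regions = [("chr1", 100, 200), ("chr2", 50, 80)]
track = [("chr1", 150, 250), ("chr1", 300, 400), ("chr2", 10, 60)]

if __name__ == "__main__":
    for hit in intersect(regions, track):
        print(*hit, sep="\t")
```

The nested loop compares every pair, which is fine for a handful of regions; for whole-genome tracks a sorted sweep or an interval tree would be the usual choice.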

written 4.0 years ago by mikhail.shugay3.2k
0
4.0 years ago by
Tonyzeng250
Tonyzeng250 wrote:

I did benefit from Galaxy: it let me train without any bioinformatics experience or basic computer skills at all, especially when I had no access to a high-powered computer, only my personal computer. At least I got a general picture of the whole pipeline in my head, from read quality control at the start through to SNP calling.

Now that I have access to a powerful computer and know some Linux operations and sequencing software, that earlier training with Galaxy really gives me some idea of what to do in the next steps when following protocols provided by many people in the sequencing community, and makes it clear to me what is right or wrong in the output files, like...

written 4.0 years ago by Tonyzeng250
0
7 months ago by
ropolocan270
Canada
ropolocan270 wrote:

In my experience with some instances of Galaxy, while using it you will often find interesting tools that you want to adopt for your command-line workflows. In those cases, you can go to the Tool Shed and clone the repository with Mercurial. The tool in question might be a script that has been wrapped for use in Galaxy, in which case the repository will also have an XML wrapper file. Hopefully, the script is well documented so that you can use it locally.

written 7 months ago by ropolocan270