Question: Building A Webpage For Accessing Rna-Seq Expression Data
6
2.0 years ago by
Arun1.9k
Germany
Arun1.9k wrote:

Hello all,
I would like some hints on how to go about this task. I'd like to design a webpage/site that looks like this.
I have RNA-Seq data with replicates for WT and mutants, along with RPKM expression values. I also wrote an R script that generates the shape and shades it according to the expression values. The part I am not sure about is the web design itself, though I'd love to learn how to do it. I know basic web design, just HTML with CSS, but I haven't used scripting and would love to use this opportunity to learn it.

Here are some of the questions I have:
1) Does this require JavaScript and/or PHP/MySQL? If so, could you please elaborate as well?
2) Is it possible to configure an R script to run based on client-side parameters, so that it is executed on the fly on the cluster and the output PDF/image file is returned for display?
3) Is it better to have the files ready and just query the image, or to create the required image dynamically as explained in 2)?
4) If I have left out anything or is unclear or you have other pointers, I'd love to hear them.

Thank you very much!
Arun

ADD COMMENTlink modified 2.0 years ago by Vitis1.3k • written 2.0 years ago by Arun1.9k
1

The source code for the eFP Browser is available on SourceForge, Arun. http://sourceforge.net/projects/efpbrowser/. Not sure if this helps you after 8 months!

ADD REPLYlink written 16 months ago by nicholas.provart10

Is this for a hobby project or are you planning to build a resource?

ADD REPLYlink written 2.0 years ago by Mndoci1.1k

It is supposed to be a resource ultimately (it's not an absolute necessity for the people who hold the data at the moment). It is basically up to me and my capabilities for now, as they are not planning to realize it yet. So I have time to play around.

ADD REPLYlink written 2.0 years ago by Arun1.9k
1

If this is going to be a resource, I would recommend that in the long term you think about dedicated software resources (the resource could be yourself). A resource is a lot more than JavaScript, and you need to think about things like user authentication, data management, and other policies as well.

ADD REPLYlink written 2.0 years ago by Mndoci1.1k
2
2.0 years ago by
Rochester, MN
Khader Shameer14k wrote:

1) Does this require JavaScript and/or PHP/MySQL? If so, could you please elaborate as well?

Yes. JavaScript improves the usability of the UI, and PHP/MySQL handles the interaction between the UI (front end) and the database (back end). Look at this discussion for various aspects of web programming for bioinformatics.

2) Is it possible to configure an R script to run based on client-side parameters, so that it is executed on the fly on the cluster and the output PDF/image file is returned for display?

Yes, this is possible. Check this discussion for different approaches to integrating R into a web application.

3) Is it better to have the files ready and just query the image or to create the required image dynamically as explained in 2) ?

This depends on your requirements. You may have to talk to the biologists about how the images will be used and assess whether it is computationally worthwhile to store them in advance or to create them as required.

4) If I have left out anything or is unclear or you have other pointers, I'd love to hear them.

It is not clear what type of data you are going to store in the database. You may want to look into database solutions such as NoSQL stores, which can handle the scale of RNA-seq data, and a modern web-application framework in your language of choice rather than creating an app from scratch.

ADD COMMENTlink modified 2.0 years ago • written 2.0 years ago by Khader Shameer14k

That's great! Thanks for providing me quite a bit of info to get started. I'll come back with more specific questions if I run into trouble. I'll wait a while before accepting this as an answer.

ADD REPLYlink written 2.0 years ago by Arun1.9k
2
2.0 years ago by
Damian Kao10k
UK
Damian Kao10k wrote:

3) Is it better to have the files ready and just query the image or to create the required image dynamically as explained in 2)?

  • I would go with client-side rendering if you are confident in your JavaScript skills. Have the back end send data as JSON or XML, and use JavaScript to render your data as SVG. I recommend looking at d3.js.
  • There are tons of tricks you can use to make your client-side rendering more efficient. The most basic one is to tile your tracks like Google Maps.
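
As a rough illustration of the "back end sends JSON" idea, here is a minimal Python sketch of the payload a server might return for one gene. The function name `expression_payload`, the field names, and the RPKM values are all made up for this example; a real site would read the values from its own data store and serve the string through whatever web framework it uses.

```python
import json
from statistics import mean


def expression_payload(gene_id, rpkm_by_condition):
    """Build a JSON string with replicate RPKM values and their mean per condition."""
    conditions = []
    for condition, replicates in sorted(rpkm_by_condition.items()):
        conditions.append({
            "condition": condition,
            "replicates": list(replicates),
            "mean_rpkm": mean(replicates),
        })
    return json.dumps({"gene": gene_id, "conditions": conditions})


if __name__ == "__main__":
    # Hypothetical values, purely for illustration.
    print(expression_payload("GENE_A", {"WT": [10.0, 12.0], "mutant": [3.0, 5.0]}))
```

The client-side code (for example d3.js) would then fetch this JSON and map `mean_rpkm` to a fill colour in the SVG.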

4) If I have left out anything or is unclear or you have other pointers, I'd love to hear them.

  • Learning client-side coding with JavaScript/HTML/CSS will help you a lot. It lets you balance the computational load between the server and the client and makes your applications richer in functionality.
  • Writing efficient JavaScript is not as simple as you might think. I would read up on topics like closures, scoping and design patterns in JavaScript. I personally think it's a very misunderstood language (also my favorite).
  • If you do use a lot of JavaScript for rendering, try not to use too many libraries (especially jQuery). Most libraries are not optimized for displaying large amounts of data.
  • Go for simple. Don't make the user go through 3 pages full of menus and options before they see their data.
  • Do not over-engineer your back end. Do you really need MySQL? Think about what you want to present. Maybe go for a NoSQL solution?
  • Think user-centric. This part probably requires some experience in working with bioinformatics data. Don't just throw in features because you think they are cool. Think about how useful they are. Don't waste your time or clutter up your UI with features you don't need.
ADD COMMENTlink written 2.0 years ago by Damian Kao10k

Thank you for your tips! It will definitely save a lot of time for me.

ADD REPLYlink written 2.0 years ago by Arun1.9k
1
2.0 years ago by
Istvan Albert ♦♦ 39k
University Park, USA
Istvan Albert ♦♦ 39k wrote:

What you are linking to is a web application (a computer program that is accessible through a web server) with many layers of complexity. As you suspect, building a web application like this needs multiple skills, from programming in a general-purpose language to JavaScript, databases and more.

ADD COMMENTlink written 2.0 years ago by Istvan Albert ♦♦ 39k

I'm a bioinformatician with programming experience in C/C++, Perl, Python, R, etc., so learning scripting languages or MySQL is not an issue. However, what I'd like to know is the role each of these languages plays and the structure by which they are linked, so I can plan and read accordingly. Probably I am being too vague...!?

ADD REPLYlink written 2.0 years ago by Arun1.9k
1

The best way to learn what is necessary is to look at a modern web app platform like Django or Ruby on Rails.

ADD REPLYlink written 2.0 years ago by Istvan Albert ♦♦ 39k
3

Or Catalyst (http://www.catalystframework.org/) if the questioner knows Perl...

ADD REPLYlink written 2.0 years ago by Alex Paciorkowski3.0k
1
2.0 years ago by
Fidel540
Germany
Fidel540 wrote:

1) Does this require JavaScript and/or PHP/MySQL? If so, could you please elaborate as well? 2) Is it possible to configure an R script to run based on client-side parameters, so that it is executed on the fly on the cluster and the output PDF/image file is returned for display?

Most likely you will need another programming language besides R to run a web server that shows what you want. This language will be in charge of interpreting the user request, processing it and returning an output. Here you can see how to call R from PHP. Besides PHP, perhaps the best choice is Django. It is based on Python, and from Python you can embed R code using rpy2. Here is the repository for a Django site that demonstrates the use of rpy. I have not tried it, but it could be a good start for you.
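
A simpler alternative to embedding R (not the rpy2 approach mentioned above) is to have the server-side language launch the existing R script as a separate process. The sketch below assumes R's `Rscript` executable is on the PATH and that a hypothetical script, here called `plot_expression.R`, reads a gene identifier and an output path via `commandArgs(trailingOnly = TRUE)`. The function names and the identifier check are illustrative only.

```python
import subprocess
from pathlib import Path


def build_rscript_command(script_path, gene_id, out_path):
    """Return the argument list for a non-interactive Rscript call."""
    # A list of arguments (no shell string) keeps user input away from the shell.
    return ["Rscript", "--vanilla", str(script_path), gene_id, str(out_path)]


def render_expression_image(script_path, gene_id, out_dir, timeout_s=60):
    """Run the R script and return the path it is expected to write the image to."""
    # Reject anything that does not look like a plain identifier before running R.
    if not gene_id.replace("_", "").replace(".", "").isalnum():
        raise ValueError(f"unexpected gene identifier: {gene_id!r}")
    out_path = Path(out_dir) / f"{gene_id}.png"
    subprocess.run(
        build_rscript_command(script_path, gene_id, out_path),
        check=True,          # raise CalledProcessError if R exits with an error
        timeout=timeout_s,   # do not let a stuck job block the web request forever
    )
    return out_path
```

A web handler would call `render_expression_image("plot_expression.R", gene_id, "images")` and send the resulting file back to the browser. On a cluster, the `subprocess.run` call would be replaced by a job submission.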

JavaScript and MySQL are not obligatory. You can use JavaScript to enhance the usability, as already mentioned, but I only recommend it after you have a working prototype. A relational database could be useful for caching the generated images. But again, you may only want to do this once you have a working prototype.

3) Is it better to have the files ready and just query the image or to create the required image dynamically as explained in 2)?

A good strategy is to generate the image the first time a user asks for it and then save it, so that the next time a user makes the same request you have the image ready. However, if generating the image is quite fast, you can simply skip the overhead of saving it.

You could use MySQL to store the images, but also a simple folder where each image has a unique name will work. In general, the design that you choose depends on the volume of data being generated / processed, the memory and storage space that you have and the speed in which the servers can respond.
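
The "generate once, then reuse" strategy with a simple folder of uniquely named images could be sketched as follows. This is only an illustration: `get_or_create_image`, `cache_key` and the `render` callback are hypothetical names, and the unique file name is derived here from a hash of the request parameters.

```python
import hashlib
import json
from pathlib import Path


def cache_key(params):
    """Derive a stable file name from the request parameters."""
    # sort_keys makes the name independent of the order the parameters arrive in.
    canonical = json.dumps(params, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest() + ".png"


def get_or_create_image(cache_dir, params, render):
    """Return the cached image path, calling render(params, path) only on a miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / cache_key(params)
    if not path.exists():
        # Write to a temporary name first so a half-written file is never served.
        tmp = path.with_suffix(".tmp")
        render(params, tmp)
        tmp.replace(path)
    return path
```

Here `render` would be whatever produces the image (for instance a call into R). The sketch ignores cache eviction and concurrent requests for the same image, which matter once traffic grows.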

4) If I have left out anything or is unclear or you have other pointers, I'd love to hear them.

Start using the model-view-controller pattern from the beginning. It will help you get cleaner, better code. Django is based on this pattern.

ADD COMMENTlink written 2.0 years ago by Fidel540
1

MySQL is a poor choice for storing images. You could store references to the images in there, but the images themselves are better kept in an object store, and you should cache them if possible (as recommended). I agree that if image generation is fast, you should create images on the fly, but if you get traffic, that becomes an interesting distributed-processing problem that needs to be thought through.

ADD REPLYlink written 2.0 years ago by Mndoci1.1k
1
2.0 years ago by
London, UK
Giovanni M Dall'Olio18k wrote:

Another website from which you can take inspiration is the ENCODE RNA-seq dashboard.

I am not sure, but I think it is built using the Google Charts API. If you are new to web development and data visualization, it may be a good way to start. If you still prefer R, have a look at the APIs from OpenCPU.

ADD COMMENTlink modified 2.0 years ago • written 2.0 years ago by Giovanni M Dall'Olio18k
1
2.0 years ago by
Vitis1.3k
New York
Vitis1.3k wrote:

I was involved in the data end of this resource website. The website itself was modified from the Arabidopsis eFP browser, which was developed at the University of Toronto (http://bar.utoronto.ca/efp/cgi-bin/efpWeb.cgi). It was largely built with Python and Perl with MySQL in the back end. It was quite involved to get it running, but I can try to address several of your questions.

This is built with Python and Perl CGI with MySQL in the back end. For the details, you can pick up the manual and start reading it. I think the eFP browser is completely open. The major chunk of code came from the eFP package, and I believe most of it is Python.

I have a simpler version of this without the fancy graphics. I made it with Perl CGI, MySQL and the GD graphics library. All the data is stored in MySQL, and when a query comes in, the script grabs the values from the database and graphs them on the fly. I think GD could certainly be replaced with R's graphing functionality, but for simple graphs GD is much more efficient.
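
The "grab the values from the database when a query comes in" step might look roughly like this in Python. Note the substitutions: the sketch uses the standard-library `sqlite3` module with an in-memory database as a stand-in for MySQL, and the table layout, column names and values are invented for the example rather than taken from the eFP browser.

```python
import sqlite3


def load_demo_db():
    """Create an in-memory database with a hypothetical expression table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE expression (gene TEXT, sample TEXT, rpkm REAL)")
    conn.executemany(
        "INSERT INTO expression VALUES (?, ?, ?)",
        [
            ("GENE_A", "WT_rep1", 10.0),
            ("GENE_A", "WT_rep2", 12.0),
            ("GENE_A", "mut_rep1", 3.0),
            ("GENE_B", "WT_rep1", 7.5),
        ],
    )
    return conn


def fetch_expression(conn, gene):
    """Return (sample, rpkm) rows for one gene, ordered by sample name."""
    # The ? placeholder lets the driver handle quoting of the user-supplied gene name.
    cursor = conn.execute(
        "SELECT sample, rpkm FROM expression WHERE gene = ? ORDER BY sample",
        (gene,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    for sample, rpkm in fetch_expression(load_demo_db(), "GENE_A"):
        print(sample, rpkm)
```

The rows returned by `fetch_expression` are what would be handed to the graphing step (GD, R, or a client-side library).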

ADD COMMENTlink written 2.0 years ago by Vitis1.3k
Powered by Biostar version 2.0.0