Building a Webpage for Accessing RNA-Seq Expression Data
6
12.0 years ago
Arun 2.4k

Hello all,
I would like some hints on how to go about this task. I'd like to design a webpage/site that looks like this.
I have RNA-Seq data with replicates for WT and mutants, along with RPKM expression values. I also wrote an R script that generates the shape and shades it according to the expression values. The part I am not sure about is the web design itself, but I'd love to learn it. I know basic web design, just HTML with CSS. I haven't used scripting and would love to use this opportunity to learn it.

Here are some of the questions I have:
1) Does this require JavaScript and/or PHP/MySQL? If so, could you please elaborate on that as well?
2) Is it possible to run an R script based on client-side parameters, so that it is executed on the fly on the cluster and the output PDF/image file is returned for display?
3) Is it better to have the files ready and just query the image, or to create the required image dynamically as explained in 2)?
4) If I have left anything out, if anything is unclear, or if you have other pointers, I'd love to hear them.

Thank you very much!
Arun

rna-seq gene-expression • 7.3k views
ADD COMMENT
1

The source code for the eFP Browser is available on SourceForge, Arun. http://sourceforge.net/projects/efpbrowser/. Not sure if this helps you after 8 months!

ADD REPLY
0

Is this for a hobby project or are you planning to build a resource?

ADD REPLY
0

It is supposed to be a resource ultimately (it's not an absolute necessity for the people who hold the data at the moment). For now it is basically up to me and my capabilities, as they are not planning to realize it yet. So I have time to play around.

ADD REPLY
1

If this is going to be a resource, I would recommend that in the long term you think about dedicated software resources (the resource could be yourself). A resource is a lot more than JavaScript: you need to think about things like user authentication, data management, and other policies as well.

ADD REPLY
2
12.0 years ago

1) Does this require JavaScript and/or PHP/MySQL? If so, could you please elaborate on that as well?

Yes. JavaScript improves the usability of the UI, and PHP/MySQL handles the interaction between the UI (front end) and the database (back end). Look at this discussion for various aspects of web programming for bioinformatics.

2) Is it possible to run an R script based on client-side parameters, so that it is executed on the fly on the cluster and the output PDF/image file is returned for display?

Yes, this is possible. Check this discussion for different approaches to integrating R into a web application.

3) Is it better to have the files ready and just query the image or to create the required image dynamically as explained in 2) ?

This depends on your requirements. You may have to talk to the biologists about how the images will be used and assess whether it is computationally worthwhile to store them in advance or to create them as required.

4) If I have left anything out, if anything is unclear, or if you have other pointers, I'd love to hear them.

It is not clear what type of data you are going to store in the database. You may want to look at database solutions like NoSQL that can handle the scale of RNA-seq data, and at a modern web-application framework in your language of choice rather than creating an app from scratch.

ADD COMMENT
0

That's great! Thanks for providing me quite a bit of info to get started. I'll come back with more specific questions if I run into trouble. I'll wait a while before accepting this as an answer.

ADD REPLY
2
12.0 years ago

3) Is it better to have the files ready and just query the image or to create the required image dynamically as explained in 2)?

  • I would go with client-side rendering if you are confident in your JavaScript skills. Have the back end send data as JSON or XML, and use JavaScript to render it as SVG. I recommend looking at d3.js.
  • There are tons of tricks you can use to make client-side rendering more efficient. The most basic one is to tile your tracks, as Google Maps does.
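
As a rough illustration of the back-end half of this approach, here is a minimal Python sketch that packages per-sample expression values as JSON for a client-side renderer such as d3.js to consume. The gene ID, sample names, RPKM values, and the `expression_payload` function are all made up for illustration; a real site would read the values from a file or database.

```python
import json

# Hypothetical in-memory data: gene ID -> {sample name: RPKM value}.
# In a real site this would come from a file or a database.
EXPRESSION = {
    "GENE0001": {"WT_rep1": 12.4, "WT_rep2": 11.9, "mut_rep1": 3.1, "mut_rep2": 2.8},
}


def expression_payload(gene_id):
    """Return a JSON string that the browser can parse and draw."""
    samples = EXPRESSION.get(gene_id)
    if samples is None:
        # Report unknown genes in the payload instead of raising.
        return json.dumps({"gene": gene_id, "error": "unknown gene"})
    values = [{"sample": name, "rpkm": rpkm} for name, rpkm in samples.items()]
    return json.dumps({"gene": gene_id, "values": values})


if __name__ == "__main__":
    print(expression_payload("GENE0001"))
```

A list of `{sample, rpkm}` records is used here because that shape is convenient to bind to SVG elements on the client, but any structure the front end agrees on would do.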

4) If I have left anything out, if anything is unclear, or if you have other pointers, I'd love to hear them.

  • Learning client-side coding with JavaScript/HTML/CSS will help you a lot. It lets you split the computational load more evenly between the server and the client and makes your applications richer in functionality.
  • Writing efficient JavaScript is not as simple as you might think. I would read up on topics like closures, scoping, and design patterns in JavaScript. I personally think it's a very misunderstood language (and also my favorite).
  • If you do use a lot of JavaScript for rendering, try not to use too many libraries (especially jQuery). Most libraries are not optimized for displaying large amounts of data.
  • Go for simple. Don't make the user go through 3 pages full of menus and options before they see their data.
  • Do not over-engineer your back end. Do you really need MySQL? Think about what you want to present. Maybe go for a NoSQL solution?
  • Think user-centric. This part probably requires some experience in working with bioinformatics data. Don't just throw in features because you think they are cool. Think about how useful they are. Don't waste your time or clutter up your UI with features you don't need.
ADD COMMENT
0

Thank you for your tips! It will definitely save a lot of time for me.

ADD REPLY
1
12.0 years ago

What you are linking to is a web application (a computer program that is accessible through a web server) with many layers of complexity. As you suspect, building a web application like this requires several different skills, from general programming to JavaScript, databases, and more.

ADD COMMENT
0

I'm a bioinformatician with programming experience in C/C++, Perl, Python, R, etc., so learning scripting languages or MySQL is not an issue. What I'd like to know is the role each of these languages plays and how they are linked together, so I can plan and read accordingly. Perhaps I am being too vague?

ADD REPLY
1

The best way to learn what is necessary is to look at a modern web app platform like Django or Ruby on Rails.

ADD REPLY
3

Or Catalyst (http://www.catalystframework.org/) if the questioner knows Perl...

ADD REPLY
1
12.0 years ago
Fidel ★ 2.0k

1) Does this require JavaScript and/or PHP/MySQL? If so, could you please elaborate on that as well? 2) Is it possible to run an R script based on client-side parameters, so that it is executed on the fly on the cluster and the output PDF/image file is returned for display?

Most likely you will need another programming language besides R to run a web server that shows what you want. This language will be in charge of interpreting the user's request, processing it, and returning an output. Here you can see how to call R from PHP. Besides PHP, perhaps the best choice is Django: it is based on Python, and from Python you can embed R code using rpy2. Here is the repository for a Django site that demonstrates the use of rpy. I have not tried it, but it could be a good start for you.
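
To make the idea of a second language driving R a little more concrete, here is a minimal Python sketch that validates a user-supplied gene ID and calls an R script through `Rscript`. It uses a plain subprocess call rather than rpy2, purely to keep the example dependency-free. The script name `plot_expression.R`, its two command-line arguments, and the gene ID pattern are assumptions for illustration; on the R side the arguments could be read with `commandArgs(trailingOnly = TRUE)`.

```python
import re
import subprocess

# Assumed whitelist pattern for gene IDs; adjust to your own naming scheme.
GENE_ID = re.compile(r"[A-Za-z0-9_.-]{1,40}")


def build_command(gene_id, out_path, script="plot_expression.R"):
    """Build the argument list for Rscript from validated user input."""
    if not GENE_ID.fullmatch(gene_id):
        raise ValueError("invalid gene ID")
    # A list of arguments (no shell) avoids shell injection.
    return ["Rscript", "--vanilla", script, gene_id, out_path]


def render(gene_id, out_path):
    """Run the (hypothetical) R script and return the image path on success."""
    cmd = build_command(gene_id, out_path)
    # check=True raises if R exits with an error; timeout guards against hangs.
    subprocess.run(cmd, check=True, timeout=60)
    return out_path
```

A web framework's request handler would call something like `render(...)` and then send the resulting file back to the browser. Validating the parameters before they reach R matters more than the choice of framework.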

Javascript and MySQL are not obligatory. You can use javascript to enhance the usability as already mentioned but I only recommend it after you have a working prototype. A relational database could be useful for you to cache the generated images. But again, you may only want to do this once you have a working prototype.

3) Is it better to have the files ready and just query the image or to create the required image dynamically as explained in 2)?

A good strategy is to generate the image the first time a user asks for it and then save it, so that the next time a user makes the same request the image is ready. However, if generating the image is quite fast, just avoid the overhead of saving it.

You could use MySQL to store the images, but a simple folder where each image has a unique name will also work. In general, the design you choose depends on the volume of data being generated and processed, the memory and storage space you have, and the speed at which the servers can respond.
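
The generate-once-then-reuse strategy with a plain folder can be sketched roughly as follows. This is only an illustration: the function names, the `condition` parameter, and the `render` callback (which would wrap whatever actually draws the image) are assumptions, not part of any existing tool.

```python
import hashlib
from pathlib import Path


def cache_path(cache_dir, gene_id, condition):
    """Derive a unique, filesystem-safe file name from the request parameters."""
    key = hashlib.sha256(f"{gene_id}|{condition}".encode("utf-8")).hexdigest()
    return Path(cache_dir) / f"{key}.png"


def get_image(cache_dir, gene_id, condition, render):
    """Return the cached image path, calling render(path) only on a cache miss."""
    path = cache_path(cache_dir, gene_id, condition)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        render(path)  # hypothetical callback that writes the image file
    return path
```

Hashing the parameters gives every distinct request its own file name without having to sanitize user input for use in a path. If the underlying data changes, the cache folder has to be cleared, which this sketch does not handle.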

4) If I have left anything out, if anything is unclear, or if you have other pointers, I'd love to hear them.

Start using the model-view-controller pattern from the beginning. It will help you write cleaner, better code. Django is based on this pattern.

ADD COMMENT
1

MySQL is a poor choice for storing images. You could store references to the images there, but the images themselves are better kept in an object store, and you should cache them if possible (as recommended). I agree that if image generation is fast you should create images on the fly, but if you get traffic, that becomes an interesting distributed processing problem that needs to be thought through.

ADD REPLY
1
12.0 years ago

Another website from which you can take inspiration is the ENCODE RNA-seq dashboard.

I am not sure, but I think it is built using the Google Charts APIs. If you are new to web development and data visualization, that may be a good way to start. If you still prefer R, have a look at the APIs from OpenCPU.

ADD COMMENT
1
12.0 years ago
Vitis ★ 2.5k

I was involved in the data end of this resource website. The website itself was modified from the Arabidopsis eFP browser, which was developed at the University of Toronto (http://bar.utoronto.ca/efp/cgi-bin/efpWeb.cgi). It was largely built with Python and Perl, with MySQL in the back end. Getting it running was quite involved, but I will try to address several of your questions.

It is built with Python and Perl CGI, with MySQL in the back end. For the details, you can pick up the manual and start reading it. I think the eFP browser is completely open. The major chunk of the code came from the eFP package, and I believe most of it is Python.

I have a simpler version of this without the fancy graphics. I built it with Perl CGI, MySQL, and the GD graphics library. All the data is stored in MySQL, and when a query comes in, it grabs the values from the database and graphs them on the fly. I think GD could certainly be replaced with R's graphing functionality, but for simple graphs I think GD is much more efficient.
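
The store-then-query-on-request part of that design can be sketched in a few lines of Python. This is not the code described above: it uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs without a server, the table layout and example rows are made up, and the plotting step (GD or R) is left out.

```python
import sqlite3


def make_db():
    """Create an in-memory table of expression values (made-up example rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE expression (gene TEXT, sample TEXT, rpkm REAL)")
    rows = [
        ("GENE0001", "WT_rep1", 12.4),
        ("GENE0001", "mut_rep1", 3.1),
        ("GENE0002", "WT_rep1", 0.7),
    ]
    conn.executemany("INSERT INTO expression VALUES (?, ?, ?)", rows)
    return conn


def values_for_gene(conn, gene):
    """Fetch (sample, rpkm) pairs for one gene using a parameterized query."""
    cur = conn.execute(
        "SELECT sample, rpkm FROM expression WHERE gene = ? ORDER BY sample",
        (gene,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    print(values_for_gene(make_db(), "GENE0001"))
```

The returned pairs are what a graphing routine would receive for each request. The `?` placeholder keeps user-supplied gene names out of the SQL text itself.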

ADD COMMENT
