Public servers for bioinformatics
22 months ago

Hi folks,

Are there any public (or private) servers where anyone can get access to computational resources to run bioinformatics (or any other) analyses?

Imagine logging in to the servers of your research institution or company and getting an interactive terminal session with certain resources, conda envs, a scheduler, etc., but open to anyone.

Services like AWS or Azure sound like the obvious solutions, although I don't know whether they are feasible if you are, say, a student rather than an enterprise.

Any suggestions are more than welcome.

computational resources • 1.2k views

Public as in free (or mostly so)? Other than Galaxy servers (where you will not get terminal access), there may be at most one or two examples, and they may require an application for an account. There is one supported by NSF, but the name escapes me at the moment.

Found the project: https://cyverse.org/

Like the one mentioned below, it would not be appropriate for everyone or for all uses.


thank you!!

22 months ago
jv ★ 1.8k

While not available to everyone, the Open Science Pool may be a good option for many. Interactive computational work is not supported, though; it's all batch job submission via HTCondor. If eligible, researchers can use Open Science Pool resources via OSG Connect, or another access point, for free. If you are not sure whether you qualify, all you need to do is send them a quick email to verify your eligibility:

"Any researcher affiliated with a research project associated with a United States institution (college, university, national laboratory, or non-profit organizations) is an OSG Connect user. Researchers affiliated outside of the U.S. who are collaborating on such a U.S.-based project may be sponsored by someone responsible for the project. Researchers outside of the U.S. are asked to first contact us directly to discuss membership."

For more details see https://opensciencegrid.org/services/open_science_pool.html and https://support.opensciencegrid.org/support/solutions/articles/5000634384
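Since everything on the Open Science Pool goes through batch submission, here is a minimal sketch of what preparing an HTCondor job might look like. It only writes the text of a submit description; the script name `run_analysis.sh`, the input file name, and the resource numbers are placeholders I made up for illustration, not OSG recommendations, so check the OSG documentation for what your access point actually expects.

```python
def render_submit_description(executable, arguments="", cpus=1,
                              memory="2GB", disk="2GB", job_name="job"):
    """Return the text of a basic HTCondor submit description.

    All default values here are illustrative assumptions, not
    requirements of the Open Science Pool.
    """
    lines = [
        f"executable = {executable}",
        f"arguments = {arguments}",
        # $(Cluster) and $(Process) are expanded by HTCondor at submit time,
        # which keeps the output of separate jobs from overwriting each other.
        f"output = {job_name}.$(Cluster).$(Process).out",
        f"error = {job_name}.$(Cluster).$(Process).err",
        f"log = {job_name}.$(Cluster).log",
        f"request_cpus = {cpus}",
        f"request_memory = {memory}",
        f"request_disk = {disk}",
        "queue 1",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical usage: write the file, then run `condor_submit job.sub`
# on the access point.
# with open("job.sub", "w") as handle:
#     handle.write(render_submit_description("run_analysis.sh", "sample1.fastq"))
```

The point is that you describe the job up front (what to run, how much CPU, memory and disk it needs) and hand it to the scheduler, rather than working in a live terminal.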


Sorry, got removed by the spam bot, restored now.


thanks! this looks quite promising!

22 months ago
Wayne ★ 2.0k

MyBinder.org offers limited computational resources without requiring a login. You can use Python, Jupyter, R, terminals, interactive dashboards in Voila or Shiny, shell-based resources, and many other languages and abilities there. A substantial limitation, in place to try to prevent abuse, is that there is no outgoing FTP from the remote machines where your session runs in a container. Sessions on these remote machines are also ephemeral (another means to limit abuse), so make sure you immediately download to your local machine anything useful you generate beyond the included materials. I go into general 'Getting Started' details here and describe the safety net, built in only for Jupyter notebooks, for when your session times out, here.
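Because sessions are ephemeral, one convenient habit is to bundle everything you generated into a single archive that you can download in one go before the session ends. A minimal sketch using only the Python standard library follows; the directory name `results` and the archive name are placeholders of mine, not anything MyBinder prescribes.

```python
import zipfile
from pathlib import Path


def bundle_outputs(output_dir, archive_path):
    """Zip every file under output_dir into archive_path and return its path."""
    output_dir = Path(output_dir)
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(output_dir.rglob("*")):
            # Skip directories, and skip the archive itself in case it is
            # being written inside the directory that is being bundled.
            if path.is_file() and path.resolve() != archive_path.resolve():
                zf.write(path, arcname=path.relative_to(output_dir).as_posix())
    return archive_path


# Hypothetical usage inside a session:
# bundle_outputs("results", "results_bundle.zip")
```

You can then download the one zip file from the file browser of the Jupyter interface instead of fetching files one by one.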

Samples of MyBinder use that I happen to have handy:

  • A snakemake tutorial that uses MyBinder.org to provide RStudio as the GUI, run by Titus Brown.

  • Example I made to illustrate some utility scripts for handling PDBePISA-based data is here. Go there and click launch binder to get started. The idea is that everything is installed and then after working through the examples, users can adapt them right there to analyze the structures of their favorite protein or protein-nucleic acid structures.

  • Metagenome-Atlas Tutorial in Jupyter, from the Metagenome-Atlas Tutorial. After the session launches, click the 'Python' directory in the navigation panel on the left and then open and run the notebooks Annotations.ipynb or Differential_abundance.ipynb.

  • Notebooks demonstrating how Clustergrammer2 can be used to explore datasets in sessions served via MyBinder; go here and look for the 'launch' badges to get started.
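The 'launch binder' badges in examples like these are just links to a MyBinder URL built from the repository location. As a rough sketch, assuming a public GitHub repository and the `v2/gh` URL layout that MyBinder uses for GitHub, a launch link could be assembled as below. The owner and repository names in the usage comment are hypothetical, and the `urlpath` option (for example to open RStudio) only works if the repository's environment actually provides that interface.

```python
from urllib.parse import quote, urlencode


def binder_launch_url(owner, repo, ref="HEAD", urlpath=None):
    """Build a MyBinder launch URL for a public GitHub repository."""
    url = (
        "https://mybinder.org/v2/gh/"
        f"{quote(owner, safe='')}/{quote(repo, safe='')}/{quote(ref, safe='')}"
    )
    if urlpath:
        # Optional: ask the session to open a specific interface or path.
        url += "?" + urlencode({"urlpath": urlpath})
    return url


# Hypothetical usage:
# print(binder_launch_url("example-user", "example-repo"))
# print(binder_launch_url("example-user", "example-repo", urlpath="rstudio"))
```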

Public offerings are going to vary by country and institutional affiliation. I don't know about your country. In the United States, we have CyVerse, which gives resources to scientists who register. You can share data and run a lot of tasks in what is essentially a managed version of Jetstream. I believe CyVerse has some international affiliations as well.

Commercial services are happy to sell you whatever resources you want to buy. A lot of them have free tiers or a trial period that allows you to run things. Some even offer educational grants of compute time to help people run training sessions. In fact, before MyBinder.org and CyVerse, this was how the training that got me started, via what is now the 'Data Intensive Biology Summer Institute', was offered. I have used that route over the years, as sometimes you need more power or flexibility than MyBinder.org or CyVerse provide. Obviously, arranging how to pay for these resources is the limitation. A lot of institutions now have affiliations with various providers, though it often isn't easy if those connections don't already exist. In the United States, NIH now has partnerships with several of the big commercial cloud offerings, see here.


thanks for such an elaborated answer!
