Galaxy local installation - download reference genomes
8.4 years ago

I have installed a local instance of Galaxy and a couple of tools on an Ubuntu server. My goal is to implement the GATK pipeline for variant calling. Running bwa requires a reference genome, so I need to get one from somewhere.

Personally I find the documentation for Galaxy very confusing and "all over the place". However, if I understood things correctly I needed to install the rsync data manager - which I did.

Now, when I click "Local data" under the "Data" heading in the left-hand menu panel, the main panel shows an item, "Reference Genome - fetching", under the "Run Data Manager Tools" section. Clicking it opens a form. I selected hg19 and clicked "Execute".

According to the "Manage Jobs" area, this job has now been running for more than 15 hours.

What is this doing? Is it downloading the reference genome? Where is it downloading to so that I can check the progress? Is it downloading indices too (because I could not find any explanation of how this is done in the docs and videos that I looked at)?

I appreciate any and all help.

Kind regards, Jannnetta

galaxy genome • 3.6k views

Are you behind an HTTP proxy?

8.4 years ago

A reply for most of this question is at Galaxy Biostars:

That reply does not address why the job is running for so long. The hg19 build hosted on the Galaxy rsync server is not appropriate for GATK. Use hg_g1k_v37 instead.

The location where data managers place data is configured per instance in this file: config/galaxy.ini
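
To check where the data is going and whether the download is progressing, something like the following rough sketch might help. It assumes the file has an [app:main] section and that the relevant settings are named galaxy_data_manager_data_path and tool_data_path (with tool-data as the fallback); please verify those names against your own config/galaxy.ini, since they may differ between Galaxy versions. The helper function names are made up for this example.

```python
import configparser
import os


def data_manager_path(ini_path, section="app:main"):
    """Return the directory data managers are assumed to write to.

    Assumption: the setting is 'galaxy_data_manager_data_path', falling
    back to 'tool_data_path', then to 'tool-data'. Check your own config.
    """
    # interpolation=None avoids errors on Paste-style %(here)s values.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read(ini_path)  # a missing file is silently skipped
    if not parser.has_section(section):
        return "tool-data"
    default = parser.get(section, "tool_data_path", fallback="tool-data")
    return parser.get(section, "galaxy_data_manager_data_path", fallback=default)


def directory_size_bytes(root):
    """Sum the sizes of all regular files below root (0 if root is absent)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                # The file may have been moved or removed mid-download.
                pass
    return total


if __name__ == "__main__":
    # Run from the Galaxy root directory; relative paths resolve from there.
    path = data_manager_path("config/galaxy.ini")
    print(path, directory_size_bytes(path), "bytes")
```

Running this a few minutes apart and comparing the byte counts gives a crude indication of whether the job is still writing data. If the number does not change, the job is probably stalled rather than downloading.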

Further comments or help from others, here or there, are welcome.

Jen, Galaxy team

