Galaxy Workflow Management System Customization - Avoiding Duplication Of Files
Asked 13.5 years ago by toni ★ 2.2k

Hi all,

My team is trying to set up an instance of the Galaxy workflow management system that will launch jobs on our local cluster. We are involved in projects dealing with high-throughput sequencing, so we have to manage a LOT of large files (several GB each).

When uploading files into Galaxy, they are automatically copied into a folder named "database/files" and sequentially renamed (dataset_1, dataset_2, dataset_3, etc.). This naming convention is applied regardless of whether a file is an input, intermediate, or output file.

Copying and renaming files this way is too time-consuming and makes us lose our file structure.

Is there a way to avoid this behavior and have Galaxy simply remember the file path instead of keeping its own copy?

If someone here has experience with this tool, any help or useful link would be appreciated.

Cheers,
tony

galaxy next-gen-sequencing • 3.9k views

You should ask this question on one of the galaxy mailing lists available at: http://lists.bx.psu.edu/listinfo


yes, right. I just wanted to have a try here. Thank you.

Answered 13.5 years ago

Yes. Details are on the wiki under the heading 'Upload files from filesystem paths'. Be sure to check No for the question 'Copy data into Galaxy?'.
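For completeness: that upload option is shown to admins when uploading into a Data Library, and linking (rather than copying) usually has to be enabled in the Galaxy server configuration first. Below is a minimal sketch of the relevant setting, assuming a Galaxy release of that era configured through universe_wsgi.ini; check your own version's sample config, since the option name (allow_library_path_paste) and file layout may differ.

    # universe_wsgi.ini (Galaxy server configuration)
    [app:main]
    # Expose the admin-only "Upload files from filesystem paths" form
    # in Data Libraries, which offers the "Copy data into Galaxy?" choice.
    allow_library_path_paste = True

With that enabled and No selected for copying, Galaxy records the original file path instead of writing another dataset_N copy under database/files, so the files stay where they are on your cluster filesystem.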

Thank you. I have been through the wiki many times but was unable to find this page!


Exactly right. You also want to store your files in Data Libraries: http://bitbucket.org/galaxy/galaxy-central/wiki/DataLibraries/Libraries. Users can share files and copy them into their individual histories for processing without duplicating the datasets on disk.

