I am performing GATK analysis on hundreds of WGS samples in indexed BAM format. To speed up processing and reduce wall-clock time, I process the genomes per chromosome. To extract a chromosome I use samtools, and this works fine on an NFS share. However, when I run this analysis on a file store that is not POSIX compatible, I have to download the whole file before splitting it into chromosomes. This is time consuming, because the per-chromosome files are written back to the file store and then downloaded again for processing.
I can download part of a file by specifying an offset and a size. Is it possible to get the location of a chromosome from the index, download only that chromosome, add some magic sauce to create a valid BAM file, and process the chromosome in one go instead of uploading and downloading chunks?
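To illustrate what I mean by getting the location from the index, here is a rough sketch, assuming the standard BAI layout described in the SAM/BAM specification. The function name `reference_byte_ranges` and the constants are my own, not part of samtools or any library. It only computes which compressed byte range covers each reference. It does not yet produce a valid BAM file: as far as I understand, that would still need the header blocks from the start of the file and trimming of the first and last BGZF block.

```python
import struct

BGZF_MAX_BLOCK = 65536  # a BGZF block is at most 64 KiB (per the BAM spec)
METADATA_BIN = 37450    # pseudo-bin that samtools uses for per-reference stats


def reference_byte_ranges(bai: bytes):
    """Return one half-open (start, end) compressed byte range per reference.

    References without any indexed reads yield None. The end is a
    conservative upper bound and may exceed the real file size, so it
    should be clamped before issuing a partial download.
    """
    if bai[:4] != b"BAI\x01":
        raise ValueError("not a BAI index")
    (n_ref,) = struct.unpack_from("<i", bai, 4)
    pos = 8
    ranges = []
    for _ in range(n_ref):
        lo = hi = None
        (n_bin,) = struct.unpack_from("<i", bai, pos)
        pos += 4
        for _ in range(n_bin):
            bin_id, n_chunk = struct.unpack_from("<Ii", bai, pos)
            pos += 8
            for _ in range(n_chunk):
                beg, end = struct.unpack_from("<QQ", bai, pos)
                pos += 16
                if bin_id == METADATA_BIN:
                    continue  # holds statistics, not only read locations
                lo = beg if lo is None else min(lo, beg)
                hi = end if hi is None else max(hi, end)
        # Skip the linear index; it is not needed for a whole-reference range.
        (n_intv,) = struct.unpack_from("<i", bai, pos)
        pos += 4 + 8 * n_intv
        if lo is None:
            ranges.append(None)
        else:
            # A virtual offset is (compressed block offset << 16) | offset
            # inside the uncompressed block. The last block has to be
            # fetched whole, so add the maximum block size to the end.
            ranges.append((lo >> 16, (hi >> 16) + BGZF_MAX_BLOCK))
    return ranges
```

The index in the list corresponds to the order of the `@SQ` lines in the BAM header, so the header is still needed to map a chromosome name to its range.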
Have you tried http://grifi.sourceforge.net?
Mounting GridFTP is an idea for a cloud setup; however, I am using the grid and do not have superuser rights to mount a device. The last visible work on grifi was almost 10 years ago, and I am not sure whether it works with the current Globus software stack on which it depends.