Question: Remotely Access Bigwig File With Python
Ryan Dale (Bethesda, MD) wrote, 6.5 years ago:

I use bx-python for reading local bigWig files, but I don't believe it works on remote files the way bigWigSummary from kentsrc does, e.g.,

bigWigSummary http://file.bigWig chr1 1 100000 10

I suppose one option would be to loosely wrap bigWigSummary, but does anyone know of an existing way of accessing bigWigs remotely with Python?
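A loose wrapper of the kind mentioned here could look like the sketch below. It assumes bigWigSummary is on the PATH, that it prints one whitespace-separated value per bin, and that bins without data are printed as "n/a". The function names are made up for illustration and are not part of bx-python or kentsrc.

```python
import subprocess


def build_command(url, chrom, start, end, n_bins, exe="bigWigSummary"):
    """Return the argument list for one bigWigSummary call."""
    return [exe, url, chrom, str(start), str(end), str(n_bins)]


def parse_summary(text):
    """Convert bigWigSummary output into a list of floats.

    Assumption: bins without data are printed as "n/a"; they become None.
    """
    return [None if tok == "n/a" else float(tok) for tok in text.split()]


def bigwig_summary(url, chrom, start, end, n_bins):
    """Run bigWigSummary and return one value per bin.

    Raises subprocess.CalledProcessError if the tool exits with a
    non-zero status.
    """
    result = subprocess.run(
        build_command(url, chrom, start, end, n_bins),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_summary(result.stdout)
```

With the example above, the call would be bigwig_summary("http://file.bigWig", "chr1", 1, 100000, 10). Whether remote URLs work then depends entirely on the bigWigSummary build, not on the Python side.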


I know this is probably not an option for you: perl bindings do support remote files.

Comment by lh3, 6.5 years ago

Yeah, the goal is to integrate with a lot of other Python code. But thanks for this -- time to read up and see how they did it.

Reply by Ryan Dale, 6.5 years ago
Michael Schubert (Cambridge, UK) wrote, 6.5 years ago:

What you're asking is not exactly possible.

You pass the remote URL to bigWigSummary as a command-line argument. This means:

  • the program needs to be able to interpret the remote URL as such, and
  • even if it does, programs on your computer still can't analyse remote data without at least temporarily fetching it

Possible solutions are:

  • download the file, run your program, and delete the file
  • have bigWigSummary on the remote machine where the data is and control it there (SSH)
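The first option (download, run, delete) can be sketched in Python as a context manager. This is a minimal illustration using only the standard library; the name `downloaded` and the ".bw" suffix are arbitrary choices, and the whole file is fetched, which may be slow for large bigWigs.

```python
import contextlib
import os
import shutil
import tempfile
import urllib.request


@contextlib.contextmanager
def downloaded(url, suffix=".bw"):
    """Download url to a temporary file, yield its path, then delete it."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        # Stream the response to disk instead of holding it in memory.
        with os.fdopen(fd, "wb") as out, urllib.request.urlopen(url) as resp:
            shutil.copyfileobj(resp, out)
        yield path
    finally:
        # Runs on success and on error, so the temporary copy never lingers.
        os.remove(path)
```

Inside a `with downloaded(url) as path:` block, the path can be handed to any reader that expects a local file, such as bx-python.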

The whole point of bigWig is that only a tiny fraction of the remote file, covering the region we are interested in, needs to be downloaded, as long as the FTP/HTTP server supports resuming interrupted transfers. I do not know the bigWig APIs. For BAM/tabix, it is the C API that transparently connects to the remote server when given a file name prefixed with "http://" or "ftp://". The Perl/Python bindings do not need to worry about parsing the URL or establishing the connection.

Comment by lh3, 6.5 years ago (edited)
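Over HTTP, fetching only part of a file is normally done with a Range request, which is the same mechanism servers use to resume interrupted transfers. The sketch below uses only the Python standard library; `range_header` and `fetch_range` are illustrative names, and this is not how any particular bigWig library implements remote access.

```python
import urllib.request


def range_header(offset, length):
    """Build an HTTP Range header for `length` bytes starting at `offset`.

    The end position in a Range header is inclusive.
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def fetch_range(url, offset, length):
    """Fetch one byte range from an HTTP(S) URL.

    A server that honours the Range header answers 206 (Partial Content);
    a server that ignores it typically sends the whole file with 200.
    """
    req = urllib.request.Request(url, headers=range_header(offset, length))
    with urllib.request.urlopen(req) as resp:
        if resp.status != 206:
            raise RuntimeError("server did not honour the Range header")
        return resp.read()
```

For example, fetch_range(url, 0, 64) would ask for only the first 64 bytes of the remote file.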

Interesting point, but then the program would have to guess the byte ranges, download a chunk of the file, see if the required data is in there, and keep searching like this with multiple (possibly hundreds of) requests? Unless there is some specific index that it could fetch first.

Reply by Michael Schubert, 6.5 years ago

For BAM/tabix, users need to download a ~10 MB index file to local disk. Once they have this file, they can access most alignments/records with only one HTTP/FTP request (<1.1 requests on average). The index is carefully tuned to minimize requests. BigBed is more advanced: the index is integrated into the data file, so you do not need to download the entire index. The tradeoff is that it takes more requests (I think about 8 on average) to navigate the index section of the file. BigWig should be similar.

Reply by lh3, 6.5 years ago
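To illustrate the integrated index: a reader can find it from the fixed-size header at the start of the file, so the first small request is enough to learn where to look next. The sketch below assumes the 64-byte header layout and the magic number 0x888FFC26 from the published bigWig/bigBed format description; the dictionary keys are descriptive names chosen here, not an official API.

```python
import struct

BIGWIG_MAGIC = 0x888FFC26  # assumed bigWig signature
HEADER_SIZE = 64           # assumed size of the fixed header in bytes

# Descriptive names for the header fields, in file order (assumed layout).
_FIELDS = (
    "magic", "version", "zoom_levels", "chrom_tree_offset",
    "full_data_offset", "full_index_offset", "field_count",
    "defined_field_count", "auto_sql_offset", "total_summary_offset",
    "uncompress_buf_size", "reserved",
)


def parse_bigwig_header(buf):
    """Parse the first 64 bytes of a bigWig file into a dict.

    The byte order is detected by trying both and checking the magic number.
    """
    if len(buf) < HEADER_SIZE:
        raise ValueError("need at least 64 bytes")
    for order in ("<", ">"):
        values = struct.unpack(order + "IHHQQQHHQQIQ", buf[:HEADER_SIZE])
        if values[0] == BIGWIG_MAGIC:
            return dict(zip(_FIELDS, values))
    raise ValueError("not a bigWig file (magic number mismatch)")
```

The "full_index_offset" value is where the main index starts; walking that index is what accounts for the additional requests mentioned above.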