Thank you for using NGSlib. I develop this package in my spare time, so I am sorry that I cannot maintain and update it regularly.
The previous answers were correct about when to close the BigWigFile, but that is not the cause of your problem. The file handle does not take much memory, and that memory is released when the bw object goes out of scope and is destroyed. See BigWigFile.__exit__().
The answers provided by joe.cornish826 et al. were correct. What you describe is not a problem in the code; it is Python's memory usage strategy. When you load a whole chromosome from a huge file into memory, it occupies a big memory block, and that block stays reserved for Python to reuse later. The memory is released when the Python script finishes. You can find related discussions online. This is a common issue in high-level languages. Most of the time you don't need to care about releasing memory, since Python does it for you, but you may run into problems in some extreme cases.
BigWigFile is actually a wrapper around Kent's UCSC code. It is meant for fast retrieval of items or depth in a short genomic region, not for a whole chromosome (for the whole genome, use bash). If you want to go through the whole file, there are better strategies. The recommended strategies are:
1. If you really want to fetch the whole genome, you may use bash.
>BigWigToWig test.bw | yourscript_to_catch_standard_output_as_input.py
or convert it to a wig file and work on the wig file.
2. Put each fetch into its own script. When the script finishes, Python releases the memory. Do not do multiple large fetches in the same script.
3. Split your chromosome into bins (10K for example) and fetch one bin at a time.
4. Follow the code in external/KentLib/wWigIO/wWigIO.c and find a way to call it from C instead of Python.
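For strategy 1, the script on the receiving end of the pipe only needs to read wig text line by line, so nothing large is ever held in memory. Here is a minimal sketch of such a reader. The function name iter_wig is made up for this example and is not part of NGSlib, and I am assuming the piped text is standard UCSC wiggle (fixedStep or variableStep) or bedGraph-style lines.

```python
def iter_wig(lines):
    """Yield (chrom, start, end, value) tuples, one per data line.

    Coordinates are returned 0-based, half-open. `lines` can be any
    iterable of text lines, for example sys.stdin.
    """
    mode = None   # "fixedStep" or "variableStep" once a declaration is seen
    chrom = None
    span = 1
    step = 1
    pos = 0       # next 0-based start position in fixedStep mode
    for line in lines:
        line = line.strip()
        # Skip blanks, comments and track/browser lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if fields[0] in ("fixedStep", "variableStep"):
            mode = fields[0]
            opts = dict(f.split("=", 1) for f in fields[1:])
            chrom = opts["chrom"]
            span = int(opts.get("span", 1))
            if mode == "fixedStep":
                pos = int(opts["start"]) - 1   # wiggle starts are 1-based
                step = int(opts.get("step", 1))
        elif len(fields) == 4:
            # bedGraph-style line: chrom, start, end, value (already 0-based)
            yield fields[0], int(fields[1]), int(fields[2]), float(fields[3])
        elif mode == "variableStep":
            start = int(fields[0]) - 1
            yield chrom, start, start + span, float(fields[1])
        elif mode == "fixedStep":
            yield chrom, pos, pos + span, float(fields[0])
            pos += step
```

In the piped script you would loop over iter_wig(sys.stdin) and keep only running totals or whatever small summary you need.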
By the way, if you want the score at each position rather than all the items, you may use the pileup function. It returns a numpy array with length = end-start. Remember to split the chromosome into bins; otherwise the problem will still be there.
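Walking a chromosome in bins could look roughly like the sketch below. iter_bins and mean_depth are names invented for this example, and pileup_fn is a stand-in for whatever call you use to get per-base scores (for example a small wrapper around the pileup function); check its real signature before plugging it in. Only one bin of scores is alive at any time.

```python
def iter_bins(chrom_size, bin_size=10000):
    """Yield (start, end) pairs that cover [0, chrom_size) in order."""
    for start in range(0, chrom_size, bin_size):
        yield start, min(start + bin_size, chrom_size)


def mean_depth(pileup_fn, chrom, chrom_size, bin_size=10000):
    """Average per-base score over a chromosome, computed bin by bin.

    pileup_fn(chrom, start, end) is a hypothetical callable that returns
    a sequence of end - start scores (a list or a numpy array both work).
    """
    total = 0.0
    for start, end in iter_bins(chrom_size, bin_size):
        depth = pileup_fn(chrom, start, end)
        total += float(sum(depth))
        # depth is rebound on the next iteration, so the previous bin
        # can be freed instead of piling up in memory.
    return total / chrom_size if chrom_size else 0.0
```

The last bin is simply shorter when the chromosome size is not a multiple of the bin size.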
Hope this helps.
modified 5.9 years ago
5.9 years ago by
tszn1984 • 90