Question: How to fix the error "Too many open files" by changing code in Python 3.7
0
zhangdengwei (60) wrote, 13 months ago:

I am trying to write two files using Python, and I encountered the following error:

OSError: [Errno 24] Too many open files: 'unmatch.site_3.txt'.

What confuses me most is that the code works well when I input two smaller files, 39M and 61M in size, but it dies when the input file grows to 87M. I have looked for solutions, but I have no permission to change the Linux system settings. Is there another way to handle this, for example by modifying my code? Thanks.

with open(offtarget_output, "w") as fo:
    with open(unmatch_output, "w") as fu:
        header = ["Chromosome", "strand", "Coordinate", "off-target site", "Mutation type", "Distance score", "Reads"]
        print(*header, sep="\t", file=fo)
        print(*header, sep="\t", file=fu)
        # Sort by the last element of each value (the read count), descending.
        for key, value in sorted(match_dic.items(), key=lambda item: item[1][-1], reverse=True):
            if value[4] != "*":  # "*" marks an unmatched site
                print(value[0], value[4], key, value[1], value[2], value[3], str(int(value[5])), sep="\t", file=fo)
            else:
                print(value[0], value[4], key, value[1], value[2], value[3], str(int(value[5])), sep="\t", file=fu)
software error • 2.6k views
modified 13 months ago by gb (1.8k) • written 13 months ago by zhangdengwei (60)
1

Another thing you can also do (or rather should do) is combine the two with statements:

with open(offtarget_output, "w") as fo, open(unmatch_output, "w") as fu:
written 13 months ago by gb (1.8k)

It still doesn't work, and I can't find which part of the code is the problem.

written 13 months ago by zhangdengwei (60)

You have to post the full script then. Otherwise it’s just guessing.

written 13 months ago by Michael Dondrup (47k)

Are you the only user of the server? Did you already check the max open files?

written 13 months ago by gb (1.8k)

How do you build match_dic? Do you wrap a loop around the code block above (is the 3 in unmatch.site_3.txt an incremental counter)? If so, either you generate too many file handles without closing them, or something prevents the with statement from closing them; a sketch of that leaky pattern is below.
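A hypothetical sketch of that leak (illustrative names, not the OP's actual code):

handles = []
for i in range(100000):
    # Opened without `with` and never closed; each iteration consumes one descriptor.
    f = open(f"unmatch.site_{i}.txt", "w")
    f.write("...")
    handles.append(f)  # live references keep every file open until the process exits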

modified 13 months ago • written 13 months ago by Carambakaracho (2.2k)

Thanks! I found that the code below works, but I am still not clear about the cause. Should the dictionary not be sorted inside the loop?

# Sort once, up front; note match_dic is now a list of (key, value) pairs, not a dict.
match_dic = sorted(match_dic.items(), key=lambda item: item[1][-1], reverse=True)

with open(offtarget_output, "w") as fo:
    with open(unmatch_output, "w") as fu:
        header = ["Chromosome", "strand", "Coordinate", "off-target site", "Mutation type", "Distance score", "Reads"]
        print(*header, sep="\t", file=fo)
        print(*header, sep="\t", file=fu)

        for key, value in match_dic:
            if value[4] != "*":
                print(value[0], value[4], key, value[1], value[2], value[3], str(int(value[5])), sep="\t", file=fo)
            else:
                print(value[0], value[4], key, value[1], value[2], value[3], str(int(value[5])), sep="\t", file=fu)
modified 13 months ago • written 13 months ago by zhangdengwei (60)

Hi, I am not an expert in Python, but it is clear to me that you are trying to open too many files. Changing ulimits will only conceal the symptoms, if it helps at all, unless your ulimit is set to an unreasonably low value. What I can see is that your code is just an excerpt, so it may well be that you are opening these files in a loop and therefore have many more open files than you expect.

written 13 months ago by Michael Dondrup (47k)

The with statement in Python simplifies exception handling by encapsulating common preparation and clean-up tasks in so-called context managers. This allows common try..except..finally usage patterns to be encapsulated for convenient reuse and reduces the amount of code you need to write for handling different kinds of exceptions. The with statement creates resources within a block. You write your code using the resources within the block. When the block exits, the resources are cleanly released regardless of the outcome of the code in the block (that is, whether the block exits normally or because of an exception). For example:
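The two forms below are equivalent; the with version guarantees close() runs even if an exception is raised (the file name is illustrative):

# Manual cleanup: close() must be guarded so it also runs on error.
f = open("example.txt", "w")
try:
    f.write("hello\n")
finally:
    f.close()

# Same behavior with a context manager: close() is called automatically
# when the block exits, normally or via an exception.
with open("example.txt", "w") as f:
    f.write("hello\n")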

written 10 months ago by quincybatten (0)
0
GokalpC (50), Turkey/Ankara/Intergen, wrote 13 months ago:

You need to increase the open-file limit in

/etc/security/limits.conf

If your distro is systemd-based, there are sometimes other files to edit to get there, such as

/etc/systemd/system.conf
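If you have no permission to edit these system files, note that an unprivileged process can still raise its own soft limit up to (but not beyond) the hard limit from inside Python, via the standard-library resource module (Unix only). A minimal sketch:

import resource

# Current soft and hard limits on open file descriptors for this process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

# Raise the soft limit to the hard limit; no root privileges required.
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))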
written 13 months ago by GokalpC (50)
3

Respectfully, I disagree. From what I read, the OP opens exactly two files; the error might be an indication of some recursion or loop not shown in the code.

modified 13 months ago • written 13 months ago by Carambakaracho (2.2k)

Indeed, if one ever hits these limits, the first thing to do is check the code, not increase the limits.

written 13 months ago by Michael Dondrup (47k)

Agreed, however some algorithms, old software, or software coded by third parties require these adjustments to function properly. In those cases there is really nothing to do but modify the system settings.

One thing I noticed is that the file operations do not close the buffers, so those files could remain open after writing or buffering data. Is there a possibility that each write is flushed and the file is closed after that? A sketch of that idea is below.
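A minimal sketch of that flush-after-each-write idea (illustrative file name; note that flushing only empties the buffer, it does not release the file descriptor):

with open("offtarget.txt", "w") as fo:
    for line in ["a\tb", "c\td"]:
        print(line, file=fo)
        fo.flush()  # push buffered data to the OS after every write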

written 13 months ago by GokalpC (50)

Not sure if I understand you, so please correct me if I am wrong, but that is what the with open part does: with makes sure that when that block of code is done, the file is flushed/cleaned up and closed.

EDIT: source https://www.pythonforbeginners.com/files/with-statement-in-python

modified 13 months ago • written 13 months ago by gb (1.8k)

You are right. However, the file writes are not done through the fo.write() or fu.write() routines. Maybe the print statement does not treat fu and fo as files opened once, but tries to open and write each time this for loop proceeds. I would suggest modifying the file-writing lines like the ones given in the link in your post.

written 13 months ago by GokalpC (50)

Ah, like that! I understand you now, but I don't know the answer. According to this page https://stackoverflow.com/questions/36571560/directing-print-output-to-a-txt-file-in-python-3 I don't think it necessarily causes the OP's problem, but someone else can maybe confirm it.

written 13 months ago by gb (1.8k)
0
gb (1.8k) wrote, 13 months ago:

First use top, htop, or ps. Then, when you know the PID of your process, you can run:

cat /proc/1882/limits

Replace 1882 with your PID.

I think my source for this check/solution was this site:

https://underyx.me/articles/raising-the-maximum-number-of-file-descriptors
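The same limits can also be checked from inside the running Python process (Unix/Linux only), for example:

import os
import resource

# Soft and hard limits on open file descriptors for this process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"max open files: soft={soft} hard={hard}")

# On Linux, count the descriptors currently open via /proc.
print("currently open:", len(os.listdir(f"/proc/{os.getpid()}/fd")))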

written 13 months ago by gb (1.8k)