pybedtools BedTool.intersect() not working on a BedTool created from a DataFrame
2.2 years ago
fr ▴ 150

I am trying to do an intersect using pybedtools. I converted a pandas DataFrame to a BedTool by running my_b = pybedtools.BedTool.from_dataframe(my_b_df). I am then intersecting it with an existing .bed file that was loaded using my_a = pybedtools.BedTool('/my/dir/file.bed'). However, when I run my_a.intersect(my_b, loj=True), I get the following error:

BEDToolsError                             Traceback (most recent call last)
<ipython-input-14-efb6f4f09839> in <module>()
----> 1 my_a.intersect(my_b, loj=True)

~/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pybedtools/ in decorated(self, *args, **kwargs)
    838             # this calls the actual method in the first place; *result* is
    839             # whatever you get back
--> 840             result = method(self, *args, **kwargs)
    842             # add appropriate tags

~/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pybedtools/ in wrapped(self, *args, **kwargs)
    343             stream = call_bedtools(cmds, tmp, stdin=stdin,
    344                                    check_stderr=check_stderr,
--> 345                                    decode_output=decode_output,
    346                                    )

~/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pybedtools/ in call_bedtools(cmds, tmpfn, stdin, check_stderr, decode_output, encode_input)
    432                 sys.stderr.write(stderr)
    433             else:
--> 434                 raise BEDToolsError(subprocess.list2cmdline(cmds), stderr)

Command was:

    bedtools intersect -loj -b /some/dir/pybedtools.4tw53oln.tmp -a /my/dir/file.bed

Error message was:
Error: unable to open file or unable to determine types for file /some/dir/pybedtools.4tw53oln.tmp

- Please ensure that your file is TAB delimited (e.g., cat -t FILE).
- Also ensure that your file has integer chromosome coordinates in the 
  expected columns (e.g., cols 2 and 3 for BED).

Note that everything looks fine when I run head /some/dir/pybedtools.4tw53oln.tmp:

# Output (yes, there is a column containing `True`)
chr1    3250000 3300000 True
chr1    4050000 4100000 True
chr1    4450000 4500000 True
chr1    4500000 4550000 True
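
Since head shows tabs and spaces identically, the two conditions named in the error message (tab delimiters, integer coordinates in columns 2 and 3) could be checked with something like the sketch below. The helper name `check_bed_lines` is hypothetical and not part of pybedtools. It only inspects the text of each line, and the header prefixes it skips are an assumption about what the file might contain.

```python
def check_bed_lines(lines):
    """Return (line_number, problem) tuples for lines that are not
    tab-delimited or lack integer start/end values in columns 2 and 3."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and comment/header lines (assumed prefixes).
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            problems.append((number, "fewer than 3 tab-separated fields"))
            continue
        # Columns 2 and 3 (indices 1 and 2) must be plain integers.
        if not (fields[1].isdigit() and fields[2].isdigit()):
            problems.append((number, "non-integer start or end"))
    return problems


# A line separated by spaces fails; the same line separated by tabs passes.
space_separated = ["chr1    3250000 3300000 True\n"]
tab_separated = ["chr1\t3250000\t3300000\tTrue\n"]
print(check_bed_lines(space_separated))  # [(1, 'fewer than 3 tab-separated fields')]
print(check_bed_lines(tab_separated))    # []
```

To run it on the temporary file, pass an open file handle, e.g. check_bed_lines(open('/some/dir/pybedtools.4tw53oln.tmp')). An empty list means both conditions hold for every data line.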

What am I doing wrong? Or is this a bug?

Thanks a lot in advance

bedtools pybedtools

