I have a dataframe with the coordinates of large mutations (CNVs):
chr  start  end
1    200    1000
1    400    800
1    600    1500
How can I identify zones that overlap and count occurrences?
The result for the dataframe given above would be something like this:
start  end   occurrences
200    800   2
600    1000  2
600    800   3
On the HPC where I work, I have a few Python libraries (e.g. pandas) and Bedtools. How could I do this?
That's what you're looking for!
How would you do that with bedtools intersect?
It's bad practice to ask a question without an output example. How exactly is the output supposed to be formatted? Check bedtools intersect as suggested, especially the counting option -wo. By the way, package managers like conda do not require root access, so you always have the option to install most software you want with that, even on HPCs.
I don't really know what you mean by "without an output example". I have provided the output I would like to achieve. The HPC I am using is the Genomics England Research Environment. On this HPC you cannot install anything that they have not already provided, not even with conda.
I have checked the -wo option, but it mentions that you need two BED files. I only have one, so I am not sure how to do what you mean.
Your partitioning in the example output is incorrect, or at least inconsistent with the starting example dataframe. Nonetheless, BEDOPS bedops --partition is an easy way to do this correctly.
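If neither tool can be installed, the same partition-and-count idea can be sketched in plain Python. This is only a minimal sweep-line sketch, not what bedops does internally. The function name partition_counts is made up for illustration, and coordinates are assumed to be half-open [start, end) as in BED.

```python
from collections import defaultdict


def partition_counts(intervals):
    """Split (chrom, start, end) intervals into disjoint segments,
    each tagged with the number of input intervals covering it.

    Hypothetical helper; assumes half-open [start, end) coordinates.
    """
    # Per chromosome: +1 where an interval starts, -1 where one ends.
    events = defaultdict(lambda: defaultdict(int))
    for chrom, start, end in intervals:
        events[chrom][start] += 1
        events[chrom][end] -= 1

    segments = []
    for chrom, deltas in events.items():
        depth = 0      # number of intervals currently open
        prev = None    # previous breakpoint
        for pos in sorted(deltas):
            # Emit the segment between two consecutive breakpoints
            # if at least one interval covers it.
            if prev is not None and depth > 0:
                segments.append((chrom, prev, pos, depth))
            depth += deltas[pos]
            prev = pos
    return segments


# The three CNVs from the question.
cnvs = [("1", 200, 1000), ("1", 400, 800), ("1", 600, 1500)]

# Keep only segments covered by two or more CNVs.
overlaps = [seg for seg in partition_counts(cnvs) if seg[3] >= 2]
for seg in overlaps:
    print(*seg, sep="\t")
# 1	400	600	2
# 1	600	800	3
# 1	800	1000	2
```

With a pandas dataframe df holding the columns chr, start and end in that order, df.itertuples(index=False, name=None) should yield tuples in the shape the function expects. Adjacent segments with the same count are not merged in this sketch.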