Newbie Question: How to find overlaps across data from multiple cell lines
6.7 years ago
skim ▴ 60

Hello. I have a BED file of about 90 MB, and I need to find the overlaps between multiple BED files (about 800 MB in total), each containing sequences, using Python. I have enough processing power, but I need to simplify this process. I suspected an interval tree would be a good choice and found this: https://pypi.python.org/pypi/intervaltree_bio but I could not get any further.

I have data for about 60 cell lines, each with a BED file of about 6-10 MB. I have a directory containing directories named after the .bed and .pk (peak) files, and each of these directories has one BED file.

Could anyone give me specific advice on how to do this task? Thank you very much.

Main .bed file example queries:

chr20 30053341 30053368 DEFB124 70.6955419 +

chr20 30053397 30053424 DEFB124 63.90851928 +

.pk cell line file example queries:

chr1 713835 714424 chr1.1 1000 . 0.1621 10.6 -1 253

chr1 752775 753050 chr1.2 567 . 0.0365 2.09 -1 124

.bed cell line file example queries:

chr1 91425 91575 id-4576 9

chr1 714005 714155 id-35705 186.000000

ngs overlap CRISPR • 1.9k views

Could anyone give me specific advice on how to do this task?

bedtools intersect
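For readers who want to see what a pairwise intersection computes, here is a minimal pure-Python sketch. It is not how bedtools is implemented; it only illustrates the idea on the example lines from the question. The function names (`parse_bed`, `build_index`, `overlaps`) are made up for this illustration, and it assumes the first three columns are chromosome, 0-based start, and exclusive end, as in standard BED.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def parse_bed(lines):
    """Yield (chrom, start, end) from BED-like lines; extra columns are ignored."""
    for line in lines:
        fields = line.split()
        if len(fields) < 3 or fields[0] in ("track", "browser") or fields[0].startswith("#"):
            continue
        yield fields[0], int(fields[1]), int(fields[2])


def build_index(intervals):
    """Per chromosome: sorted starts, sorted (start, end) pairs, longest interval."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, ivs in by_chrom.items():
        ivs.sort()
        starts = [s for s, _ in ivs]
        max_len = max(e - s for s, e in ivs)
        index[chrom] = (starts, ivs, max_len)
    return index


def overlaps(index, chrom, start, end):
    """Return indexed intervals sharing at least one base with [start, end)."""
    if chrom not in index:
        return []
    starts, ivs, max_len = index[chrom]
    # Any hit must start before the query ends ...
    hi = bisect_left(starts, end)
    # ... and cannot start more than max_len before the query starts.
    lo = bisect_right(starts, start - max_len)
    return [(s, e) for s, e in ivs[lo:hi] if e > start]


# Example lines taken from the question (cell line .bed and .pk files).
cell_bed = ["chr1 91425 91575 id-4576 9", "chr1 714005 714155 id-35705 186.000000"]
cell_pk = ["chr1 713835 714424 chr1.1 1000 . 0.1621 10.6 -1 253",
           "chr1 752775 753050 chr1.2 567 . 0.0365 2.09 -1 124"]

index = build_index(parse_bed(cell_bed))
hits = {(c, s, e): overlaps(index, c, s, e) for c, s, e in parse_bed(cell_pk)}
```

For files of the sizes mentioned, bedtools itself will likely be faster and better tested; the sketch is mainly useful for checking small cases by hand.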


Thank you. A very short yet powerful reply. So I just have to use pybedtools and Python to search the files and feed them to this: https://daler.github.io/pybedtools/autodocs/pybedtools.bedtool.BedTool.intersect.html ?
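The "search the files" step could look roughly like the sketch below, which uses only the standard library. It assumes the layout described in the question (one top-level directory whose sub-directories each hold a .bed file); `find_bed_files` is a hypothetical helper name, not part of pybedtools. Each path it returns could then be handed to whichever intersect tool is used.

```python
from pathlib import Path


def find_bed_files(root):
    """Map each sub-directory name to the .bed files found directly inside it."""
    found = {}
    for subdir in sorted(Path(root).iterdir()):
        if subdir.is_dir():
            beds = sorted(subdir.glob("*.bed"))
            if beds:  # skip directories without any .bed file
                found[subdir.name] = beds
    return found
```

Calling `find_bed_files("some_root_directory")` (the directory name here is a placeholder) returns a dictionary keyed by sub-directory name, so every cell line can be processed in a simple loop.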


Can you give an example of the kind of query you are trying to run between the input BED files? Are you trying to find all elements that are mutually overlapping across a set of N BED files, for instance? In that case a straight-up intersection will not work, because of overlaps within an input (among other things), so a more sophisticated approach is needed.
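One possible reading of "mutually overlapping" is "regions covered by every one of the N inputs". Under that assumption, a rough sketch is to merge each input first, so that overlaps within an input do not inflate the count, and then sweep over the interval boundaries while tracking how many inputs are open. The names `merge` and `common_regions` are illustrative only, and the sketch handles a single chromosome for brevity; it is not necessarily the approach the reply had in mind.

```python
def merge(intervals):
    """Merge overlapping or touching (start, end) pairs within one input."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def common_regions(inputs):
    """Return half-open regions on one chromosome covered by every input."""
    n = len(inputs)
    events = []
    for intervals in inputs:
        for start, end in merge(intervals):
            events.append((start, 1))   # an input opens here
            events.append((end, -1))    # an input closes here
    # At equal positions -1 sorts before +1, which matches half-open ends.
    events.sort()
    regions, depth, region_start = [], 0, None
    for pos, delta in events:
        depth += delta
        if delta == 1 and depth == n:
            region_start = pos
        elif delta == -1 and region_start is not None:
            regions.append((region_start, pos))
            region_start = None
    return regions


# Three toy inputs; the first contains intervals that overlap each other.
a = [(100, 200), (150, 300)]
b = [(180, 250), (290, 400)]
c = [(0, 190), (295, 310)]
shared = common_regions([a, b, c])
```

Because each input is merged beforehand, the running depth can never exceed N, so reaching N means every input covers that position.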


Thank you for your answer, but I finished this task 9 months ago :)
