Question: Newbie Question: How to find overlaps across multiple cell line datasets
24 months ago, skim (30) wrote:

Hello. I have a BED file of about 90 MB, and I need to find its overlaps with multiple BED files (about 800 MB in total), each containing sequences, using Python. I have enough processing power, but I need to simplify this process. I suspected that an interval tree would be a good choice and found this: but I could not get any further.

I have data for about 60 cell lines, each with a BED file of about 6-10 MB. I have a directory containing directories named after the .bed and .pk (peak) files, and each of these directories has one BED file.

Could anyone give me specific advice on how to do this task? Thank you very much.

Main .bed file example queries:

chr20 30053341 30053368 DEFB124 70.6955419 +

chr20 30053397 30053424 DEFB124 63.90851928 +

.pk cell line file example queries:

chr1 713835 714424 chr1.1 1000 . 0.1621 10.6 -1 253

chr1 752775 753050 chr1.2 567 . 0.0365 2.09 -1 124

.bed cell line file example queries:

chr1 91425 91575 id-4576 9

chr1 714005 714155 id-35705 186.000000

ngs crispr overlap • 650 views
modified 13 months ago by Biostar (20) • written 24 months ago by skim (30)
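The three example formats above share the same first three columns (chromosome, start, end), so one reader can handle all of them. The following is only a minimal sketch under that assumption: `parse_bed_lines` is a hypothetical helper name, it splits on any whitespace, and it keeps the fourth column as a name when present. For real files, the list of strings would be replaced by an open file handle.

```python
from collections import defaultdict


def parse_bed_lines(lines):
    """Group (start, end, name) tuples by chromosome from BED-like lines."""
    by_chrom = defaultdict(list)
    for line in lines:
        fields = line.split()
        # Skip blank lines and header/comment lines.
        if len(fields) < 3 or fields[0].startswith(("#", "track", "browser")):
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        name = fields[3] if len(fields) > 3 else "."
        by_chrom[chrom].append((start, end, name))
    # Sort each chromosome's intervals by start so later lookups can bisect.
    for intervals in by_chrom.values():
        intervals.sort()
    return dict(by_chrom)


# Records copied from the examples in the question.
main_bed = parse_bed_lines([
    "chr20 30053341 30053368 DEFB124 70.6955419 +",
    "",
    "chr20 30053397 30053424 DEFB124 63.90851928 +",
])
peaks = parse_bed_lines([
    "chr1 713835 714424 chr1.1 1000 . 0.1621 10.6 -1 253",
    "chr1 752775 753050 chr1.2 567 . 0.0365 2.09 -1 124",
])
print(main_bed["chr20"])
print(peaks["chr1"])
```

Grouping by chromosome first keeps every later overlap test within a single coordinate system, which is what an interval tree or a sorted list needs.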

Is it possible for anyone to give me a specific advice on how to do this task?

bedtools intersect

modified 24 months ago • written 24 months ago by Pierre Lindenbaum (121k)
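On the command line, the suggestion above would typically look like `bedtools intersect -a main.bed -b cellline.bed` (both file names are placeholders). For readers who want to see what such an intersection computes, here is a small pure-Python sketch for one chromosome. It is not how bedtools is implemented; it assumes half-open BED coordinates, and `build_index` and `find_overlaps` are hypothetical names chosen for illustration.

```python
from bisect import bisect_left


def build_index(intervals):
    """Sort half-open (start, end, name) intervals and precompute lookup helpers."""
    ordered = sorted(intervals)
    starts = [iv[0] for iv in ordered]
    # The longest interval bounds how far left of a query an overlap can begin.
    max_len = max((end - start for start, end, _ in ordered), default=0)
    return ordered, starts, max_len


def find_overlaps(index, q_start, q_end):
    """Return indexed intervals that share at least one base with [q_start, q_end)."""
    ordered, starts, max_len = index
    lo = bisect_left(starts, q_start - max_len)  # earliest start that could still reach the query
    hi = bisect_left(starts, q_end)              # first start at or beyond the query end
    return [iv for iv in ordered[lo:hi] if iv[1] > q_start]


# chr1 peaks and chr1 BED records copied from the examples in the question.
peak_index = build_index([
    (713835, 714424, "chr1.1"),
    (752775, 753050, "chr1.2"),
])
print(find_overlaps(peak_index, 714005, 714155))  # record id-35705
print(find_overlaps(peak_index, 91425, 91575))    # record id-4576
```

A sorted list with `bisect` is used instead of an interval tree because the cell line files are static once loaded; a tree mainly pays off when intervals are inserted or removed between queries.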

Thank you. A very short yet powerful reply. So I just have to use pybedtools and Python to search the files and feed them to this: ??

written 24 months ago by skim (30)

Can you give an example of what kind of query you are trying to do between input BED files? Are you trying to find all elements that are mutually overlapping in a set of N BED files, for instance? In that case, a straight-up intersection will not work, because of overlaps within an input, etc., so a more sophisticated approach is needed.

written 13 months ago by Alex Reynolds (28k)
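One way to read the mutual-overlap case described above is "regions covered by at least one interval from every input". A sweep over interval endpoints handles this even when intervals overlap within a single input, because depth is tracked per input. This is only a sketch of that interpretation, not necessarily the approach the reply had in mind: `common_regions` is a hypothetical name, the coordinates below are made up for illustration, and the function works on one chromosome at a time.

```python
from collections import defaultdict


def common_regions(interval_sets):
    """Return half-open (start, end) regions covered by every input set."""
    # position -> input index -> net change in open intervals at that position
    deltas = defaultdict(lambda: defaultdict(int))
    for idx, intervals in enumerate(interval_sets):
        for start, end in intervals:
            deltas[start][idx] += 1
            deltas[end][idx] -= 1

    depth = [0] * len(interval_sets)  # open intervals per input
    covered = 0                       # inputs with at least one open interval
    regions = []
    region_start = None
    for pos in sorted(deltas):
        # Apply every change at this position before inspecting the state,
        # so abutting intervals within one input do not split a region.
        for idx, change in deltas[pos].items():
            before = depth[idx]
            depth[idx] += change
            if before == 0 and depth[idx] > 0:
                covered += 1
            elif before > 0 and depth[idx] == 0:
                covered -= 1
        all_covered = covered == len(interval_sets)
        if all_covered and region_start is None:
            region_start = pos
        elif not all_covered and region_start is not None:
            regions.append((region_start, pos))
            region_start = None
    return regions


# Made-up coordinates; the first input overlaps itself on purpose.
set_a = [(100, 200), (150, 300)]
set_b = [(180, 250), (400, 500)]
set_c = [(120, 190), (240, 260)]
print(common_regions([set_a, set_b, set_c]))
```

To require overlap in only some of the inputs, the `covered == len(interval_sets)` test could be relaxed to a threshold.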

Thank you for your answer, but I finished this task 9 months ago :)

written 13 months ago by skim (30)