Question: Overlapping regions in a BedGraph file
0
gravatar for Sam
15 months ago by
Sam70
United States
Sam70 wrote:

I downloaded a GRO-Seq dataset from http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE38140 and got a bedGraph file that's 4.2gb large.

I wanted to generate heatmaps of sense and antisense strands using deeptools, which requires a bigWig file format. I figured it'd be pretty simple using a bdg2bw script found here  but I recieved this error:

$ /Users/Carlos/Dropbox/bdg2bw Gro-Seq.bdg /Users/Carlos/Desktop/Projects/GenomicDataPublication_Project/Gencode/hg19.chrom.sizes 

Overlapping regions in bedGraph line 44 of Gro-Seq.bdg.clip

Not quite sure what to do to fix this. I know it's pretty simple in concept, just remove overlapping regions. But i'm not sure how exactly to go about doing that or if it would bias the results or something just as terrible, of the heatmap if I did.

Any ideas?

ADD COMMENTlink modified 15 months ago by Devon Ryan64k • written 15 months ago by Sam70
0
gravatar for Devon Ryan
15 months ago by
Devon Ryan64k
Freiburg, Germany
Devon Ryan64k wrote:

I haven't merged it into the master branch yet, but you can make bigWig files with pyBigWig in the WriterIntegration branch. That'd allow you to script something to fix the problematic line(s).

The fix would be something along the lines of:

import pyBigWig
lastLine = None
bw = pyBigWig.open("something.bw", "w")
bw.addHeader(some stuff)
for line in bedGraph.split():
    if lastLine:
        if line[0] == lastLine[0] and line[1] < lastLine[2]:
            #Do something appropriate
    lastLine = line
    bw.addEntries([line[0]], [line[1]], ends=[line[2]], values=[line[3]])
bw.close()

That'd give you a bigWig file and allow you to fix overlapping entries however you'd like.

ADD COMMENTlink modified 15 months ago • written 15 months ago by Devon Ryan64k

This looks promising. I'm not very (or at all) versed at scripting, but i'll give it a whirl and see if I can figure it out.

ADD REPLYlink written 15 months ago by Sam70

Go ahead and post again with what you have if you get stuck.
 

ADD REPLYlink written 15 months ago by Devon Ryan64k

Not surprisingly I got stuck quite quickly. Here's the edited script: (if you can call it that)

import pyBigWig
lastLine = None
bw = pyBigWig.open("Gro-Seq.bdg", "w")
bw.addHeader([("chr1", 249250621), ("chr2", 243199373), ("chr3", 198022430), ("chr4", 191154276), ("chr5", 180915260), ("chr6", 171115067), ("chr7", 159138663), ("chr8", 146364022), ("chr9", 141213431), ("chr10", 135534747), ("chr11", 135006516), ("chr12", 133851895), ("chr13", 115169878), ("chr14", 107349540), ("chr15", 102531392), ("chr16", 90354753), ("chr17", 81195210), ("chr18", 78077248), ("chr19", 59128983), ("chr20", 63025520), ("chr21", 48129895), ("chr22", 51304566), ("chrX", 155270560), ("chrY", 59373566), ("chrM", 16571)], maxZooms=0)
for line in bedGraph.split():
    if lastLine:
        if line[0] == lastLine[0] and line[1] < lastLine[2]:
            fout.write(line)
    lastLine = line
    bw.addEntries([line[0]], [line[1]], ends=[line[2]], values=[line[3]])
bw.close()

The error i'm receiving:

$ python bdgintobw.py

Traceback (most recent call last):

  File "bdgintobw.py", line 4, in <module>

    bw.addHeader([("chr1", 249250621), ("chr2", 243199373), ("chr3", 198022430), ("chr4", 191154276), ("chr5", 180915260), ("chr6", 171115067), ("chr7", 159138663), ("chr8", 146364022), ("chr9", 141213431), ("chr10", 135534747), ("chr11", 135006516), ("chr12", 133851895), ("chr13", 115169878), ("chr14", 107349540), ("chr15", 102531392), ("chr16", 90354753), ("chr17", 81195210), ("chr18", 78077248), ("chr19", 59128983), ("chr20", 63025520), ("chr21", 48129895), ("chr22", 51304566), ("chrX", 155270560), ("chrY", 59373566), ("chrM", 16571)], maxZooms=0)

AttributeError: 'NoneType' object has no attribute 'addHeader'

 

ADD REPLYlink written 15 months ago by Sam70

For whatever reason you couldn't open "Gro-Seq.bdg" for opening (are you using the "WriterIntegration" branch?). Also, I apparently introduced a bug recently where "maxZooms" doesn't work! I'll try to get that fixed today.

ADD REPLYlink written 15 months ago by Devon Ryan64k

maxZooms should now work as either a keyword or optional argument, though obviously if open() returns None then you're having a different issue.
 

ADD REPLYlink written 15 months ago by Devon Ryan64k

Hi, Is there any update about this post? Thanks. T.

ADD REPLYlink written 4 days ago by morovatunc310
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1797 users visited in the last hour