Question: Suggestions For Extracting Data (A Challenge Of Sorts!)
8.3 years ago by Atom Smasher (20):

Hello all,

I have a problem which looks very complicated to me, but I am sure with suggestions from you folks, I'll go in the right direction.

I am still new to Bioinformatics data analysis and I would appreciate it if I could get ideas from you.

Ok... so, here we begin.

I wish to visualize another group's "peak-called" ChIP-Seq data in IGB (Integrated Genome Browser) for comparison with my research group's data. Specifically, I need to load a particular chromosome's "bar" file into IGB for visualization, and that file should come from a run in which peak calling on the chromosome has already been done.

The "bar" files that I am talking about are generated by a program called USEQ while calling peaks. However, the problem is that through Gene Expression Omnibus (GEO), I only have access to the following data from the other group:

1) Their raw data file
2) Their "eland_results.txt" file
3) Their "eland_export.txt" file
4) Their final peaks (bed) file

This other group does not use input files for calling peaks, so they do not have any input data. If they had input data available, I could easily have re-processed their raw file against their input data file with a peak calling program like USEQ, and that would have given me their "peak-called" chromosome "bar" files.

Also, I cannot simply convert their "final peak" files (which are in "bed" format) to "bar" files, as this would give me chromosome "bar" files containing only the peaks. I wish to visualize the whole chromosome, with both "regions having peaks" and "regions not having peaks".

I hope I am not being too ambiguous. How should I go about solving this problem?

Thank you.


I'm not clear why the BED files would not get you most of the way there. These represent the peaks called by the authors, do they not? Could you explain why that would not allow you to 'visualize the whole chromosome with "regions having peaks" and "regions not having peaks"'?

written 8.3 years ago by Sean Davis (26k)
8.3 years ago by Istvan Albert (84k), University Park, USA:

What you are most likely looking for (although it is not clear from your post) is the per-base coverage over the entire genome. This coverage was used to call the peaks, but the peak data itself records only where each peak is, not its actual shape.

Your best bet would be to transform your eland output to SAM format. Most genome browsers can generate the coverage from the SAM file.
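To make "per-base coverage" concrete, here is a minimal sketch in Python of how coverage can be derived from aligned read intervals. It does not parse eland or SAM files: it assumes the alignments for one chromosome have already been reduced to 0-based, half-open (start, end) pairs. The function names `coverage_runs` and `to_bedgraph_lines`, and the bedGraph-style output, are illustrative choices rather than anything prescribed by IGB or USEQ.

```python
def coverage_runs(reads, chrom_length):
    """Return (start, end, depth) runs covering one whole chromosome.

    reads: iterable of (start, end) alignments, 0-based and half-open
           (an assumed, simplified input; not an eland or SAM parser).
    chrom_length: chromosome length in bases (assumed to be positive).
    """
    # Difference array: +1 where a read begins, -1 just past where it ends.
    diff = [0] * (chrom_length + 1)
    for start, end in reads:
        start, end = max(start, 0), min(end, chrom_length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1

    # Walk the chromosome and emit a new run whenever the depth changes.
    runs = []
    depth = diff[0]
    run_start = 0
    for pos in range(1, chrom_length):
        if diff[pos] != 0:
            runs.append((run_start, pos, depth))
            run_start = pos
            depth += diff[pos]
    runs.append((run_start, chrom_length, depth))
    return runs


def to_bedgraph_lines(chrom, runs):
    """Format runs as tab-separated bedGraph-style lines (chrom, start, end, depth)."""
    return [f"{chrom}\t{start}\t{end}\t{depth}" for start, end, depth in runs]


if __name__ == "__main__":
    # Two overlapping toy reads on a 10-base "chromosome".
    toy_reads = [(2, 6), (4, 8)]
    for line in to_bedgraph_lines("chr1", coverage_runs(toy_reads, 10)):
        print(line)
```

Unlike a peaks-only BED file, the result includes zero-depth runs, so regions without peaks are represented as well as the shape of each peak.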
