Question: Gene expression estimates in the Epigenome Roadmap
5.2 years ago by
United States
de.subhajyoti wrote:

Hi, I am using Epigenome Roadmap data, and I can only find *.bed or *.wig files for mRNA-seq and ChIP-seq data (e.g. breast stem cell) for display on the genome browser. How can I get processed RPKM/FPKM expression level estimates and ChIP-seq peak calls?

Of course, one could analyze the *.bed or *.wig files and estimate expression levels or call peaks oneself, but I was wondering whether processed data is already available.

Thanks, Subho

rna-seq chip-seq • 2.3k views

WIG is probably expression level, and BED is probably peaks. Be sure to investigate what the EpigenomeRoadmap people have to say about those files.

Edit: this BED looks like raw read names and their alignments. You could derive RPKM by carefully counting the reads that fall within exons, or look for ChIP-seq peaks at promoter regions. The WIG looks like read depth, so you can ignore that.
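
A minimal sketch of the RPKM normalization mentioned above, assuming you already have a read count per gene, the summed exon length of that gene, and the total number of mapped reads in the sample. The function name and the example numbers are hypothetical and are not taken from the Roadmap data.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads Per Kilobase of feature per Million mapped reads.

    RPKM = read_count * 1e9 / (feature_length_bp * total_mapped_reads)
    """
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("feature length and total mapped reads must be positive")
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on 2,000 bp of exons, 10 million mapped reads.
example_value = rpkm(500, 2000, 10_000_000)
print(example_value)  # 25.0
```

The total in the denominator should be the number of mapped reads in the whole sample, for example the number of lines in the reads BED file if it holds one alignment per line.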

modified 5.2 years ago • written 5.2 years ago by karl.stamm

I am not sure that is the case. Looking at the first few lines of the files, I think the BED files give the coordinates of individual mapped reads, while the WIG file gives depth of coverage along the genome (in 20 bp steps, judging by its fixedStep header).

$ head -5 ~/Downloads/GSM543029_UCSF-UBC.Breast_Luminal_Epithelial_Cells.mRNA-Seq.RM035.bed
chr1    16159    16205    SOLEXA9_42:2:1:1529:310.L    -
chr1    118928    118978    SOLEXA9_42:2:3:447:886.R    -
chr1    122839    122889    SOLEXA9_42:2:2:1292:1782.L    +
chr1    134971    135018    SOLEXA9_42:2:2:474:588.R    -
chr1    135005    135055    SOLEXA9_42:2:1:303:1134.R    -

 

$ head -10 ~/Downloads/GSM543029_UCSF-UBC.Breast_Luminal_Epithelial_Cells.mRNA-Seq.RM035.wig
track type=wiggle_0 visibility=full color=20,150,20
fixedStep chrom=chr1 start=11001 step=20 span=20
0
0
1
1
1
0
0
0

 

 

written 5.2 years ago by de.subhajyoti
5.2 years ago by
United States
karl.stamm wrote:

Be sure to investigate what the EpigenomeRoadmap people have to say about those files.

It looks like your BED file reports reads from a Solexa machine. Reads reported as BED don't carry sequence information, so you'll have to either look up the reference sequence at those coordinates or annotate the reads against feature regions.

Check bedtools for intersectBed -c: for each interval in your genes BED, it counts the reads that overlap it.
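
To illustrate what that per-gene count means, here is a rough pure-Python sketch of the same idea: for each gene interval, count the reads that overlap it by at least 1 bp. It is not bedtools itself. The read coordinates are the five lines quoted earlier in this thread, while the gene intervals and the function names are hypothetical. BED coordinates are taken to be 0-based and half-open.

```python
from collections import defaultdict


def parse_bed(lines):
    """Yield (chrom, start, end) from BED lines (0-based, half-open)."""
    for line in lines:
        fields = line.split()
        if len(fields) < 3 or fields[0] in ("track", "browser") or fields[0].startswith("#"):
            continue  # skip blank lines, headers and comments
        yield fields[0], int(fields[1]), int(fields[2])


def count_overlaps(genes, reads):
    """For each gene interval, count the reads overlapping it by >= 1 bp."""
    reads_by_chrom = defaultdict(list)
    for chrom, start, end in reads:
        reads_by_chrom[chrom].append((start, end))
    counts = []
    for chrom, g_start, g_end in genes:
        n = sum(
            1
            for r_start, r_end in reads_by_chrom[chrom]
            if r_start < g_end and r_end > g_start  # half-open overlap test
        )
        counts.append((chrom, g_start, g_end, n))
    return counts


# Reads quoted earlier in the thread (name and strand columns are ignored here).
read_lines = [
    "chr1\t16159\t16205\tSOLEXA9_42:2:1:1529:310.L\t-",
    "chr1\t118928\t118978\tSOLEXA9_42:2:3:447:886.R\t-",
    "chr1\t122839\t122889\tSOLEXA9_42:2:2:1292:1782.L\t+",
    "chr1\t134971\t135018\tSOLEXA9_42:2:2:474:588.R\t-",
    "chr1\t135005\t135055\tSOLEXA9_42:2:1:303:1134.R\t-",
]
# Hypothetical gene intervals, made up for illustration only.
gene_lines = [
    "chr1\t16000\t17000",
    "chr1\t134000\t136000",
    "chr1\t200000\t201000",
]

counts = count_overlaps(list(parse_bed(gene_lines)), parse_bed(read_lines))
for row in counts:
    print(*row, sep="\t")
```

This scans every read on the chromosome for every gene, which is fine for a quick check but slow on a full dataset. For real files, bedtools or an interval-tree based approach is the better choice.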

The WIG just says there is some read coverage from position 11041 through 11100.
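
As a check on how those positions follow from the header, here is a small sketch that decodes the fixedStep lines quoted earlier in this thread. It assumes the WIG convention that start is 1-based and that each value covers span bases. The function names are hypothetical, and only fixedStep sections are handled.

```python
def fixedstep_intervals(lines):
    """Yield (chrom, start, end, value) per data line, 1-based inclusive."""
    chrom = pos = step = span = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue  # skip blank lines, track/browser lines and comments
        if line.startswith("variableStep"):
            raise ValueError("this sketch only handles fixedStep sections")
        if line.startswith("fixedStep"):
            params = dict(field.split("=", 1) for field in line.split()[1:])
            chrom = params["chrom"]
            pos = int(params["start"])
            step = int(params["step"])
            span = int(params.get("span", 1))  # span defaults to 1 when absent
            continue
        if chrom is None:
            raise ValueError("data line found before any fixedStep declaration")
        yield chrom, pos, pos + span - 1, float(line)
        pos += step


def covered_regions(intervals):
    """Merge directly adjacent non-zero bins into (chrom, start, end) regions."""
    regions = []
    for chrom, start, end, value in intervals:
        if value == 0:
            continue
        if regions and regions[-1][0] == chrom and regions[-1][2] + 1 == start:
            regions[-1] = (chrom, regions[-1][1], end)
        else:
            regions.append((chrom, start, end))
    return regions


# The WIG lines quoted earlier in the thread.
wig_lines = [
    "track type=wiggle_0 visibility=full color=20,150,20",
    "fixedStep chrom=chr1 start=11001 step=20 span=20",
    "0", "0", "1", "1", "1", "0", "0", "0",
]

regions = covered_regions(fixedstep_intervals(wig_lines))
print(regions)  # [('chr1', 11041, 11100)]
```

The first two zero values cover 11001-11040, and the three values of 1 cover 11041-11100, which is where the interval above comes from.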

You'll have to trust their mapping strategy, so find out which reference sequence they used or, better still, get the raw data.

 

5.2 years ago by
United States
de.subhajyoti wrote:

I have now heard back from members of the Epigenome Roadmap team about data availability.

  • Yes, right now you should only be able to access the .bed and .wig files which were submitted to the NCBI.
  • The peak calls and RPKM values for the Roadmap data are part of the analysis section of the consortium manuscript, which has been submitted and is currently under revision.
  • This data will be available after the manuscript has been published. The manuscript will contain web resource links that point to the dataset.

And that addresses my question. Karl, yes, I could analyze the data myself the way you suggested, but given the response from the Epigenome Roadmap team, I will just wait for their processed data.
