ENCODE integrated genomic annotation data access
9.6 years ago
xiaoheyiyh • 0


I am trying to use ENCODE annotation data in my own data analysis. I couldn't find an introductory tutorial on ENCODE data access and usage, so I explored a little on my own.

I found that the older data from the ENCODE production phase (2007-2012) can be viewed and downloaded from UCSC: http://genome.ucsc.edu/ENCODE/downloads.html

Newer data can be accessed at the new ENCODE portal: https://www.encodeproject.org/

However, the new ENCODE portal offers no simple, intuitive way to view and download data comparable to the UCSC portal. Its introduction to programmatic access is heavy going and not very informative. I can't even find where the processed and integrated data, like those available from UCSC, are located. Can anybody help with this or point me in the right direction? Thank you!

encode genome • 1.8k views
9.6 years ago

The ENCODE data are available as a UCSC track hub (http://genome.ucsc.edu/goldenpath/help/trackDb/trackDbHub.html).

This hub is defined at http://ngs.sanger.ac.uk/production/ensembl/regulation/hub.txt.

It provides data for the genomes defined in http://ngs.sanger.ac.uk/production/ensembl/regulation/genomes.txt.

For hg19, a set of bigBed and bigWig files is defined in http://ngs.sanger.ac.uk/production/ensembl/regulation/hg19/trackDb.txt
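To see which files a trackDb.txt points to, one option is to pull out its bigDataUrl settings. The following is only a minimal sketch: the function name list_track_urls is hypothetical, and it assumes that relative bigDataUrl values are resolved against the directory holding trackDb.txt.

```shell
# list_track_urls BASE_URL < trackDb.txt
# Prints one URL per bigDataUrl setting found on stdin.
# Assumption: relative paths are relative to BASE_URL, the directory
# that contains trackDb.txt; absolute URLs are printed unchanged.
list_track_urls() {
  awk -v base="$1" '
    $1 == "bigDataUrl" {
      url = $2
      # Prefix the base only when the value is not already an absolute URL.
      if (url !~ /^(https?|ftp):\/\//) url = base "/" url
      print url
    }'
}

# Example use (needs network access, so it is left commented out):
# curl -s "http://ngs.sanger.ac.uk/production/ensembl/regulation/hg19/trackDb.txt" \
#   | list_track_urls "http://ngs.sanger.ac.uk/production/ensembl/regulation/hg19"
```

Each printed URL can then be passed to the UCSC command-line tools.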

Those files can be handled directly with the UCSC standalone tools:

$ bigWigSummary -type=mean -udcDir=.  \
  "http://ngs.sanger.ac.uk/production/ensembl/regulation//hg19/segmentation_summaries/Segway_17/1.bw" \
  chr1 1  110301 1

1.23587

See also my blog: http://plindenbaum.blogspot.fr/2014/09/using-ensembl-regulatory-build-to.html


Thank you, Pierre. This is helpful. But the data you pointed to is just the Ensembl Regulatory Build, which only includes transcription factor binding sites and promoter-related annotation.

I would like to access all types of regulatory annotations, including DNase, FAIRE, histone, and TFBS peaks. Ideally, it should be the latest data from the encodeproject.org portal, not just the old data from the UCSC portal.

Any suggestions would be appreciated.
