How to select only one human genome build (hg19) from the ENCODE project's data
11.3 years ago
Eric Ho ▴ 10

While looking at the ENCODE project's data in the UCSC Genome Browser, I discovered that data from both hg18 and hg19 are shown even though I selected only the hg19 genome.

See http://genome.ucsc.edu/cgi-bin/hgFileUi?db=hg19&g=wgEncodeOpenChromDnase

Both hg18 and hg19 appear as origAssembly values under Additional Details.

I need to know which data files are in hg19 format.

Thanks!

encode dataset • 2.5k views
11.3 years ago

The original data may have been mapped to hg18, with UCSC then lifting the coordinates over to hg19. The files.txt file on the page you linked to contains all the information about the original data file submissions. You should be able to filter this file on the origAssembly value like so:

curl http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeOpenChromDnase/files.txt | grep hg19

A simple line count tells me that there are 173 files mapped to hg19 and 264 mapped to hg18. Alternatively, you could probably open this file with a spreadsheet editor as it is delimited by semicolons.
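
A plain grep matches "hg19" anywhere on a line, not only in the origAssembly field. If you want a stricter filter, something like the sketch below might help. It assumes each line of files.txt has the form "<filename><TAB>key=value; key=value; ...", which you should verify against the actual file first. The function name files_for_build is made up for illustration.

```sh
# Hypothetical helper: reads files.txt on stdin and prints the names of
# files whose origAssembly field equals the build given as the first argument.
# Assumed line layout: <filename><TAB>key=value; key=value; ...
files_for_build() {
    awk -F'\t' -v build="$1" '
    {
        meta = $2
        # Drop trailing whitespace, carriage returns, or a stray semicolon.
        sub(/[ \t\r;]+$/, "", meta)
        # Split the metadata column into individual key=value fields.
        n = split(meta, fields, /; */)
        for (i = 1; i <= n; i++) {
            if (fields[i] == ("origAssembly=" build)) {
                print $1
                break
            }
        }
    }'
}

# Usage (needs network access):
# curl -s http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeOpenChromDnase/files.txt | files_for_build hg19
```

Piping the result through wc -l gives a count per build. Because this matches only the origAssembly field, the numbers may differ from the plain grep counts.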
