gnomAD4.0 Hail Table Downloading
12 months ago
adarsh_munna ▴ 50

Hi,

I have downloaded the gnomAD v4.0 sites Hail Table from their website using gsutil:

gsutil -m cp -r gs://gcp-public-data--gnomad/release/4.0/ht/exomes/gnomad.exomes.v4.0.sites.ht/* .

I expected one of the directories (rows) to contain the Hail table. However, this directory contains only metadata.json.gz and a subdirectory, parts, which contains many binary files:

part-0132-d93b0eee-7fa0-42eb-ae6b-953c31ec8e3b
part-0168-f86f8639-0536-46b4-969e-93b0ce16af55
..
..

Is there any way to combine all of these into a single Hail table, or did I do something wrong?

Please let me know

Thank you

NGS gnomAD • 1.3k views

Do you have a Spark environment set up for hail usage or do you just need the data?


I need the data, so that it can be loaded and worked on using the hail package in Python.


I'd recommend going with the VCF downloads. I don't know what it takes to use the Hail tables. From my limited understanding, the ht is going to be massive with all the part files, and I've heard a colleague describe using distributed infrastructure such as Hadoop/Spark to leverage the full power of Hail. So I'm not sure you should use the Hail tables unless you have quite a bit of experience with Hail or someone to guide you through it.

8 months ago
DBScan ▴ 450

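You already downloaded the correct Hail table; it's called gnomad.exomes.v4.0.sites.ht. The part files don't need to be combined: the directory as a whole is the table. Just load it via hl.read_table() in Hail (hl.read_matrix_table() is for MatrixTables, not for a sites .ht like this one).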
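
A minimal sketch of that, under a few assumptions: Hail is installed (e.g. via pip, with a working Java runtime, so it runs in local Spark mode), and the path you pass is the directory that holds the copied contents. With the gsutil command in the question that is the current directory, because `/* .` copies the contents of the `.ht` folder rather than the folder itself. The helper names `looks_like_hail_table` and `load_sites_table` are made up for illustration, and the layout check is only a heuristic based on the files described in the question plus the top-level metadata.json.gz I'd expect a Hail table to have.

```python
from pathlib import Path


def looks_like_hail_table(path):
    """Heuristic check (an assumption, not an official Hail validation):
    a table directory should hold a top-level metadata.json.gz and a
    rows/ directory with its own metadata.json.gz and parts/part-* files."""
    root = Path(path)
    parts = root / "rows" / "parts"
    return (
        (root / "metadata.json.gz").is_file()
        and (root / "rows" / "metadata.json.gz").is_file()
        and parts.is_dir()
        and any(p.name.startswith("part-") for p in parts.iterdir())
    )


def load_sites_table(path):
    """Open the downloaded directory as a Hail Table.

    Hail is imported lazily so the layout check above works without it.
    read_table is lazy: no rows are read until you ask for them."""
    import hail as hl

    if not looks_like_hail_table(path):
        raise FileNotFoundError(f"{path} does not look like a Hail table directory")
    return hl.read_table(str(path))
```

Usage would be something like `ht = load_sites_table(".")` followed by `ht.describe()` to print the schema, or `ht.show(5)` to peek at a few rows. There is no step that merges the part files; Hail reads them in place.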
