Sources Of Publicly Available Human Whole Genome Sequence Data
12.0 years ago

We're looking for publicly available human whole genome sequence data for the purposes of filtering variants out of our local data. We're primarily analyzing the exome from WGS data, but are also developing a non-coding analysis pipeline.

I am aware of the data offered by Complete Genomics and, of course, 1000 Genomes.

Is anyone aware of other similar public sources of data? Thanks.

next-gen data • 9.6k views
12.0 years ago
Prateek ★ 1.0k

TCGA hosts cancer sequencing data: https://tcga-data.nci.nih.gov/tcga/

COSMIC is likewise a database of cancer mutations, but it also includes mutations identified as germline: http://www.sanger.ac.uk/genetics/CGP/cosmic/biomart/martview/


Thanks, Prateek. I'll check these out. Not being a cancer geneticist, I'm not aware of all the resources on that side of the street, but in general it appears cancer researchers have larger repositories of WGS data at this point.

12.0 years ago
JC 13k

If you just want a compilation of several sources, you can try: http://db.systemsbiology.net/kaviar/


Thanks, Juan. We've used Kaviar for selected batches of variants, but it looks like the entire dataset can be downloaded locally. Do you know how much overlap there is with NHLBI's exome variant server?


Alex, yes, you can install and run Kaviar locally; we used the Perl module in our analysis scripts. The latest version of Kaviar includes all of the NHLBI variants.
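
For anyone who wants to see the general idea of filtering local calls against a downloaded set of known variants, here is a minimal Python sketch. It does not use Kaviar's Perl module or its file format. It assumes both inputs are VCF-style tab-delimited text (CHROM, POS, ID, REF, ALT in the first five columns), and the function names `variant_keys` and `filter_novel` are made up for illustration. It also assumes both files use the same reference build and that indels are represented the same way in each.

```python
def normalize_chrom(chrom):
    """Strip a leading 'chr' so 'chr1' and '1' compare equal."""
    return chrom[3:] if chrom.startswith("chr") else chrom


def variant_keys(lines):
    """Yield one (chrom, pos, ref, alt) key per alternate allele.

    Assumes VCF-style columns: CHROM, POS, ID, REF, ALT.
    Header lines (starting with '#') and blank lines are skipped.
    """
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        for alt in alts.split(","):
            yield (normalize_chrom(chrom), pos, ref, alt)


def filter_novel(local_lines, known_lines):
    """Return local records that carry at least one allele absent from the known set."""
    known = set(variant_keys(known_lines))
    novel = []
    for line in local_lines:
        keys = list(variant_keys([line]))
        # A record is kept if any of its alternate alleles is not already known.
        if keys and any(key not in known for key in keys):
            novel.append(line.rstrip("\n"))
    return novel


if __name__ == "__main__":
    # Toy data standing in for a local call set and a public variant dump.
    local = [
        "#CHROM\tPOS\tID\tREF\tALT\n",
        "chr1\t100\t.\tA\tG\n",
        "chr1\t200\t.\tC\tT\n",
        "2\t300\t.\tG\tA,C\n",
    ]
    known = [
        "1\t100\trs1\tA\tG\n",
        "2\t300\trs2\tG\tA\n",
    ]
    for record in filter_novel(local, known):
        print(record)
```

For a whole-genome known set, holding every key in an in-memory set may not be practical; an indexed lookup (for example, a tabix-indexed file or a database) would be the more realistic choice, with the same matching logic.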
