Question

Large-scale open-source bioinformatics datasets

11

8.6 years ago

Tobias ▴ 150

Of course, as a bioinformatician, I am aware of many large-scale open-source bioinformatics datasets, such as

The ENCODE consortium (www.encodeproject.org, RNA-Seq, ChIP-Seq and so on),
The Roadmap Epigenomics consortium (www.roadmapepigenomics.org, RNA-Seq, ChIP-Seq, Bisulfite-Seq),
The IHEC consortium (www.ihec-epigenomes.org, RNA-Seq, ChIP-Seq, Bisulfite-Seq),
The TCGA/ICGC consortia (www.cancergenome.nih.gov, www.icgc.org, large-scale cancer data, DNA-Seq, RNA-Seq, etc.) and
The LINCS consortium (www.lincscloud.org/l1000, gene expression for more than a million different perturbation experiments).

I am wondering, however, what other wonderful datasets that are both large and open-source are currently available. That might include things like RNA-Seq, ChIP-Seq, Bisulfite-Seq, whole genome sequencing, WGAS, and many other assays (not necessarily NGS-related, though that is what I am mostly looking for).

Also, things like the (neural) connectome of certain species (large data in any event) could be of interest.

There are quite a few GEO datasets that at least partially fulfill these requirements, but most simply have too few samples to be interesting to me.

Your comments are greatly appreciated!

RNA-Seq next-gen ChIP-Seq • 8.2k views

ADD COMMENT • link updated 19 months ago by Ram 43k • written 8.6 years ago by Tobias ▴ 150

0

To all those who replied: Many thanks for your detailed posts!

ADD REPLY • link 8.5 years ago by Tobias ▴ 150

0

Do you know if there is any other resource providing DNase-seq and mRNA-seq data, other than ENCODE and Roadmap?

ADD REPLY • link 8.1 years ago by Bioinformatist Newbie ▴ 270

0

6.9 years ago

Samuel Lampa ★ 1.3k

Human Protein Atlas

ADD COMMENT • link 6.9 years ago by Samuel Lampa ★ 1.3k

Ram · Accepted Answer · 2015-09-01

2

8.6 years ago

GouthamAtla 12k

FANTOM
RegulomeDB (not large-scale, but a very useful functional database)
1000 Genomes Project
GoNL

ADD COMMENT • link updated 19 months ago by Ram 43k • written 8.6 years ago by GouthamAtla 12k

Ram · Accepted Answer · 2015-09-01

1

8.6 years ago

Katie D'Aco ★ 1.1k

BioGPS is a good one for expression data.

ADD COMMENT • link updated 19 months ago by Ram 43k • written 8.6 years ago by Katie D'Aco ★ 1.1k

Ram · Accepted Answer · 2015-09-01

1

8.6 years ago

Jean-Karim Heriche 27k

Loss of function phenotypes: GenomeRNAi.

ADD COMMENT • link updated 19 months ago by Ram 43k • written 8.6 years ago by Jean-Karim Heriche 27k

Ram · Accepted Answer · 2015-09-02

1

8.6 years ago

Prakki Rama ★ 2.7k

ADD COMMENT • link updated 19 months ago by Ram 43k • written 8.6 years ago by Prakki Rama ★ 2.7k

Ram · Accepted Answer · 2015-09-04

1

8.6 years ago

osullivanchristopher ▴ 210

SRA maybe? http://www.ncbi.nlm.nih.gov/sra

2.2 petabases open source, 1.8 petabases authorized access. (By the way, TCGA is authorized access, not open access.)

ADD COMMENT • link updated 19 months ago by Ram 43k • written 8.6 years ago by osullivanchristopher ▴ 210