Of course, as a bioinformatician, I am aware of many large-scale open-source bioinformatics datasets, such as
- the ENCODE consortium (www.encodeproject.org; RNA-Seq, ChIP-Seq and so on),
- the Roadmap Epigenomics consortium (www.roadmapepigenomics.org; RNA-Seq, ChIP-Seq, Bisulfite-Seq),
- the IHEC consortium (www.ihec-epigenomes.org; RNA-Seq, ChIP-Seq, Bisulfite-Seq),
- the TCGA/ICGC consortia (www.cancergenome.nih.gov, www.icgc.org; large-scale cancer data, DNA-Seq, RNA-Seq, etc.) and
- the LINCS consortium (www.lincscloud.org/l1000; gene expression for more than a million different perturbation experiments).
I am wondering, however, what other wonderful datasets that are both large and open-source are currently available. That might include things like RNA-Seq, ChIP-Seq, Bisulfite-Seq, whole genome sequencing, WGAS, and many other assays (not necessarily NGS-related, though that is what I am mostly looking for).
Things like the (neural) connectome of certain species (large data in any case) could also be of interest.
There are quite a few GEO datasets that at least partially fulfil these requirements, but most simply have too few samples to be interesting to me.
Your comments are greatly appreciated!