Question: What Are Your Most-Used Public Data Repositories?
9
Sean Davis (25k), National Institutes of Health, Bethesda, MD, wrote 5.8 years ago:

If you were to catalog public data repositories that house public "omics" and other high-throughput data, what would you include? What are some of the public data repositories to which you have contributed or that you use regularly? In particular, I'd be interested in hearing about repositories or databases of raw omics data that are off-the-beaten-path but that are critical to your research.

Clarification: I am mainly interested in databases that collect and host omics data. I see, for example, that FlyBase seems to host some modENCODE RNA-seq data.

database • 3.5k views

use: Sequence Ontology, Gene Ontology, NHLBI exome server, pox.org

written 5.8 years ago by Zev.Kronenberg (11k)
5
Dan D (6.8k), Tennessee, wrote 5.8 years ago:

Definitely the 1000 Genomes Project:

http://www.1000genomes.org/data#DataAccess

5
brentp (23k), Salt Lake City, UT, wrote 5.8 years ago:

By far we use the UCSC Genome Browser and its resources the most. I use the MySQL database quite a bit, and I use the browser to display our data overlaid on all the existing tracks.

http://genome.ucsc.edu/
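
A minimal sketch of the MySQL route mentioned above, assuming the `mysql` command-line client is installed and that UCSC's public server is reachable at `genome-mysql.soe.ucsc.edu` with the read-only `genome` user. The helper names (`gene_query`, `ucsc_query_command`) and the choice of the `refGene` table in `hg19` are illustrative assumptions, not something taken from the post.

```python
# Public, read-only UCSC MySQL server (no password required).
UCSC_HOST = "genome-mysql.soe.ucsc.edu"
UCSC_USER = "genome"


def gene_query(symbol):
    """Return SQL selecting transcript coordinates for one gene symbol.

    Hypothetical helper: assumes the refGene table, where the name2
    column holds the gene symbol.
    """
    # Reject anything that is not a plain symbol so the value can be
    # embedded in the SQL string safely.
    if not symbol.replace("-", "").replace(".", "").isalnum():
        raise ValueError(f"unexpected gene symbol: {symbol!r}")
    return (
        "SELECT name, chrom, strand, txStart, txEnd "
        f"FROM refGene WHERE name2 = '{symbol}'"
    )


def ucsc_query_command(database, sql):
    """Build the argument list for the mysql client (hypothetical helper)."""
    return [
        "mysql",
        f"--user={UCSC_USER}",
        f"--host={UCSC_HOST}",
        "-A",  # skip auto-rehash for a faster start
        "-N",  # omit the column-name header row
        "-e", sql,
        database,
    ]


cmd = ucsc_query_command("hg19", gene_query("TP53"))
```

Passing `cmd` to `subprocess.run` would execute the query; building the argument list separately keeps the sketch usable without network access.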

4
Charles Warden (6.6k), Duarte, CA, wrote 5.8 years ago:

At the risk of stating the obvious, I most often download data from SRA and ArrayExpress (which also has some NGS data).

GEO is also useful for searching for relevant projects because GEO provides links to the corresponding SRA data.

TCGA is also a commonly used resource, but you typically have to get special permission to access raw data.

1

+1 for TCGA. FWIW, TCGA's "special permission" generally just consists of letting them know what you're going to do with the data and filling out a form. They want the data to be easy to get and a community resource, but have to balance that against concerns about the release of clinical data.

modified 5.8 years ago • written 5.8 years ago by Chris Miller (20k)
4
Mary (11k), Boston MA area, wrote 5.8 years ago:

UCSC mainly for me too. But I also use the InterMines for the modENCODE data ( http://modencode.org/ ), and the BioMart interface to get to stuff I need that's not at UCSC. That connects me to a lot of sources.

My needs are pretty random--sometimes I'll need a big list of fly gene symbols. And then I'll need some cancer data. Another one I turn to is the International Cancer Genome Consortium: http://icgc.org/

For microbial data I often go to IMG to see what's available. http://img.jgi.doe.gov/

4
lwc628 (200), United States, wrote 5.8 years ago:

Ensembl (http://useast.ensembl.org/info/data/ftp/index.html). No?

I download all my references and annotations from here.

3
Stephen (2.7k), Charlottesville, Virginia, wrote 5.8 years ago:

I use GEO frequently. dbGaP when I have to - access is painful.

1
zx8754 (7.3k), London, wrote 5.8 years ago:

We use the 1000 Genomes Project, the UCSC Genome Browser tables, and TCGA, and we contribute to ICGC Prostate Cancer: http://icgc.org/icgc/cgp/70/508/71331
