What is the volume of data on GDC?
5.9 years ago
jonessara770 ▴ 240

Hi

How can I find out how much data is on the GDC portal across all cancer types and samples?

Or are there any statistics on the volume of NGS data generated for cancer so far?

Thanks, S

next-gen • 1.1k views
5.9 years ago

Writing a report or something?

The two main repositories of cancer data that come to mind are the GDC (Genomic Data Commons), which is mostly TCGA data, and the ICGC (International Cancer Genome Consortium) Data Portal, which is an amalgamation of multiple global repositories that also includes the GDC.

Even if you just go to these websites, you can see an estimate of the size of the controlled and open access data that each contains:

Genomic Data Commons (GDC) ( https://portal.gdc.cancer.gov/repository ):

[Image: GDC repository summary]

-------------------------------------------------

International Cancer Genome Consortium (ICGC) Data Portal ( https://dcc.icgc.org/repositories ):

[Image: ICGC repository summary]
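
If you would rather compute a figure than read it off the portal page, here is a rough sketch for the GDC side only. It pages through the public GDC files endpoint (https://api.gdc.cancer.gov/files) using the `fields`, `from` and `size` query parameters and adds up the `file_size` of every record. The helper names (`format_bytes`, `sum_hit_sizes`, `fetch_page`, `total_gdc_bytes`) and the page size of 10000 are my own assumptions, not anything from the GDC documentation, so check the current API guide for limits before running it. It will make many requests and take a while.

```python
import json
import urllib.parse
import urllib.request

GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def format_bytes(num_bytes):
    """Render a byte count using decimal (SI) units, e.g. '1.80 PB'."""
    units = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]
    value = float(num_bytes)
    for unit in units:
        if value < 1000 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1000


def sum_hit_sizes(hits):
    """Add up the file_size field over a list of GDC file records."""
    return sum(hit.get("file_size", 0) for hit in hits)


def fetch_page(offset, page_size):
    """Fetch one page of file records, requesting only file_size."""
    query = urllib.parse.urlencode(
        {"fields": "file_size", "from": offset, "size": page_size}
    )
    with urllib.request.urlopen(f"{GDC_FILES_ENDPOINT}?{query}") as response:
        return json.load(response)["data"]


def total_gdc_bytes(page_size=10000):
    """Page through all file records; return (file count, total bytes).

    page_size is an assumed value, not a documented limit.
    """
    offset, total_bytes = 0, 0
    while True:
        data = fetch_page(offset, page_size)
        hits = data["hits"]
        total_bytes += sum_hit_sizes(hits)
        offset += len(hits)
        # Stop on an empty page or once every record has been seen.
        if not hits or offset >= data["pagination"]["total"]:
            return offset, total_bytes


if __name__ == "__main__":
    n_files, n_bytes = total_gdc_bytes()
    print(f"{n_files} files, {format_bytes(n_bytes)}")
```

Note that this counts whatever file records the API exposes at the time you run it, so the total will drift with each data release and need not match the number shown on the portal.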


So, across just those two, you are looking at upward of 1.8 petabytes. I cannot confirm that this actually represents all files that have ever been produced; most likely it does not. Also, of course, this does not reflect the cancer data stored in all of the other online repositories, such as:

Kevin
