Question: Data set suitable for comparing WGS, exome, and RNA-seq data generated from the same samples
Asked 6.5 years ago by Malachi Griffith (Washington University School of Medicine, St. Louis, USA):

Is anyone aware of a publication or pre-publication data set that includes the following for a single human tumor/normal sample pair:

  • Illumina whole genome sequence (WGS) data to at least 30-50X depth for both tumor and normal (e.g. blood). Preferably this would be HiSeq paired-end data generated with v3 chemistry.
  • Illumina exome sequence data for the same tumor/normal pair but to considerably higher depth (say at least 150-200X)
  • Illumina RNA-seq data generated for the same tumor sample. Bonus points if there is also RNA-seq for a matched normal sample (same tissue as the tumor as opposed to blood DNA that would typically be used for a normal comparison at the DNA level).
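
For anyone judging whether a candidate data set meets the depth targets above, a rough back-of-the-envelope sketch follows. It applies the usual approximation depth ≈ (read pairs × 2 × read length) / target size. The function name, the 3.1 Gb genome size, the 64 Mb exome target size, and the on-target fraction are all illustrative assumptions rather than properties of any particular data set, and the estimate ignores duplicates and unmapped reads.

```python
import math


def read_pairs_for_depth(target_depth, target_bases, read_length=100,
                         on_target_fraction=1.0):
    """Estimate the number of read pairs needed to reach a mean depth.

    Hypothetical helper: assumes paired-end reads of equal length and that
    only `on_target_fraction` of sequenced bases land on the target.
    """
    if not 0 < on_target_fraction <= 1:
        raise ValueError("on_target_fraction must be in (0, 1]")
    # Total bases that must be sequenced to reach the requested mean depth.
    bases_needed = target_depth * target_bases / on_target_fraction
    # Each read pair contributes two reads of `read_length` bases.
    return math.ceil(bases_needed / (2 * read_length))


if __name__ == "__main__":
    # Assumed sizes, for illustration only.
    GENOME_BASES = 3_100_000_000   # approximate human genome size
    EXOME_BASES = 64_000_000       # assumed exome capture target size

    wgs_pairs = read_pairs_for_depth(30, GENOME_BASES)
    exome_pairs = read_pairs_for_depth(150, EXOME_BASES, on_target_fraction=0.5)
    print(f"WGS 30X: about {wgs_pairs:,} read pairs")
    print(f"Exome 150X at 50% on-target: about {exome_pairs:,} read pairs")
```

Dividing by the on-target fraction matters mostly for exome data, where a substantial share of reads can fall outside the capture regions.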

Since this data would primarily be used for methods development, the tumor/normal pair could be from a tumor cell line and matched lymphoblastoid 'normal' cell line derived from the same individual. Some data along these lines is being made available here:

TCGA Mutation/Variation Calling Benchmark 4 at CGHub

https://cghub.ucsc.edu/benchmark_download.html

However, this is whole genome data only; there is no exome or RNA-seq data yet. Plenty of RNA-seq data can be found in GEO and elsewhere, but I'm not aware of any projects that also provide corresponding WGS or exome data. Plenty of exome and WGS data are being generated for TCGA, but again I'm not aware of any publications describing all three data types for the same samples.

Large-scale cancer sequencing projects that might have performed such a comparison:

  • The Cancer Genome Atlas (TCGA)
  • Cancer Genome Project (CGP)
  • International Cancer Genome Consortium (ICGC)


I haven't explored it in detail, but I think this is what ICGC (http://icgc.org/) is trying to do: put all data for the same set of patients together in one place. Dataset summary: http://dcc.icgc.org/pages/summary/

Comment written 6.5 years ago by zx8754

Hi Malachi,

Did you find any processed RNA-seq data from tumor/normal pairs for any cancer type?

Comment written 4.9 years ago by Chirag Nepal

But can I download matched tumor/normal exome data from ICGC? I want to work with BAM files; is it possible to download them from ICGC? I tried TCGA, but the WES samples are not open access. I am looking for aligned tumor/normal exome data for serous ovarian cancer. TCGA does not make these open, but does any other portal?

Comment written 4.6 years ago by ivivek_ngs
Answered 3.7 years ago by Malachi Griffith:

We eventually created such a data set ourselves and made it publicly available via FTP here:

https://github.com/genome/gms/wiki/HCC1395-WGS-Exome-RNA-Seq-Data

The data are from a matched tumor/'normal' pair of cell lines, HCC1395 and HCC1395/BL, with whole genome (WGS), exome, and/or RNA-seq data for each. All data are 2x100 bp reads generated on an Illumina HiSeq 2000 instrument. The exome data were generated using the NimbleGen SeqCap EZ Human Exome Library v3.0 reagent.

If you find this data useful, please cite:

Griffith et al. Genome Modeling System: A Knowledge Management Platform for Genomics. PLoS Comput Biol. 2015 Jul 9;11(7):e1004274. doi: 10.1371/journal.pcbi.1004274. PubMed PMID: 26158448.

Because these data come from cell line material, we were able to make them available without using dbGaP.
