Datasets For Which There Is Microarray And RNA-Seq Data On The Same Samples
10.9 years ago
Paul ▴ 760


I'm wondering if anyone is aware of datasets for which there is publicly available microarray and RNA-seq data on the same samples?

So far I've come up with the HapMap CEU and YRI samples:

And this paper about estimating accuracy of both platforms using proteomics:

If anyone is aware of any more off the top of their head I'd be very grateful!



rna microarray • 4.6k views
10.9 years ago
Ryan Dale 5.0k

It's in flies, but there are array and RNA-seq data for D. pseudoobscura from:

Malone & Oliver (2011). Microarrays, deep sequencing and the true measure of the transcriptome. BMC Biology 2011, 9:34

GEO accessions:

  • GSE23309 (array)
  • GSE19989 (mRNA-seq)
10.9 years ago

The following manuscript describes some comparisons between RNA-seq, Affymetrix Exon Arrays, Custom NimbleGen Splicing Arrays, and qPCR/RT-PCR:

Nature Methods. 2010 Oct;7(10):843-847

Raw data files are publicly available.

These experiments involve two human cell lines each profiled as biological triplicates on all three platforms. Additional validation experiments are described in the manuscript.

qPCR validations are presented in Figure 2. The experiment involved qPCR validation of 192 differentially expressed exons identified in the RNA-seq data. An additional qualitative assay involving RT-PCR, gel electrophoresis and Sanger sequencing was used to validate 189 exon-skipping junctions (see Supplementary Figure 10a in the Supplementary Materials). The validation rates for the quantitative and qualitative assays were 88% and 85%, respectively. The selection of targets for validation was not biased towards highly expressed exons. If anything, it was biased towards lowly expressed exons that were nevertheless identified as expressed above background by RNA-seq. In many cases these likely represent minor isoforms (see Supplementary Figure 10b in the Supplementary Materials).

10.7 years ago

I think your best bet might be some of the TCGA datasets. I have personal experience with the breast set. They have AgilentG4502A_07_3 expression data for ~600 samples and RNA-seq for >800 samples, with a significant overlap between the two sets. This will be published very soon, but you can probably already get access through back channels. You can also visualize the data and do some simple analysis through the UCSC Cancer Genome Browser. If you browse through the datasets you will probably find some others that also have both array expression and RNA-seq data (e.g., TCGA Colon). Unfortunately I don't think they let you download complete datasets yet, but I believe that they are working on becoming an official TCGA data portal to allow this.

TCGA Breast array and RNA-seq data
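
For anyone wanting to work out which TCGA samples actually have both data types, here is a minimal Python sketch of the matching step. It assumes you already have two lists of TCGA barcodes (one per platform) and that the first 15 characters of a barcode (project, tissue source site, participant, sample type) are enough to identify the same sample across platforms. The function names and the example barcodes are hypothetical and only there for illustration; they are not taken from the TCGA files themselves.

```python
def sample_key(barcode: str) -> str:
    """Truncate a TCGA barcode to its sample-level prefix.

    Assumption: the first 15 characters (e.g. 'TCGA-A1-A0SB-01')
    identify the sample regardless of vial, portion, plate or center.
    """
    return barcode[:15]


def matched_samples(array_barcodes, rnaseq_barcodes):
    """Return {sample_key: (array_barcode, rnaseq_barcode)} for shared samples.

    If a sample appears more than once on a platform, the last barcode wins.
    """
    array_by_sample = {sample_key(b): b for b in array_barcodes}
    rnaseq_by_sample = {sample_key(b): b for b in rnaseq_barcodes}
    shared = sorted(array_by_sample.keys() & rnaseq_by_sample.keys())
    return {s: (array_by_sample[s], rnaseq_by_sample[s]) for s in shared}


# Hypothetical barcodes, for illustration only.
array_ids = [
    "TCGA-A1-A0SB-01A-11R-A10J-02",
    "TCGA-A2-A04P-01A-31R-A034-02",
]
rnaseq_ids = [
    "TCGA-A1-A0SB-01A-11R-A144-07",
    "TCGA-A1-A0SD-01A-11R-A115-07",
]

pairs = matched_samples(array_ids, rnaseq_ids)
for sample, (array_id, rnaseq_id) in pairs.items():
    print(sample, array_id, rnaseq_id)
```

Once you have the matched pairs, you can subset both expression matrices to the shared samples before comparing platforms.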

10.8 years ago
Pascal ▴ 160

These two studies use the same RNA from HEK293T and B cell lines for RNA-seq and Affymetrix exon arrays.

Data are available at GEO: GSE13474 and GSE11892

