Question: Do GEO Illumina HumanHT-12 V3.0 Series never include raw data?
4.7 years ago by
Tim D70
Belgium, Leuven
Tim D70 wrote:

While gathering data for a microarray meta-analysis, I came across something odd. I hope I'm just being stupid and that I'm missing something blindingly obvious.

The GSE I'm looking at (GSE29312) was generated on the Illumina HumanHT-12 V3.0 Expression BeadChip (GPL6947). Because I cannot say with certainty which preprocessing and normalization steps were applied to arrive at the deposited values, I was planning to re-process them from the raw data using the lumi Bioconductor package. However, the submitted RAW data file is a 6.2 MB file that contains only the GPL6947 BGX file: in other words, just the chip description.

Thinking I had missed something, I looked at other datasets on the same platform and found that none of them contain the raw data either. (I didn't perform an exhaustive search, but none of the 12 randomly sampled datasets contained raw data.) They all just have that same 6.2 MB description file.

So, what am I not seeing here? Are the raw data for Illumina BeadChips just never deposited? Because of size constraints, maybe? Did I just get very unlucky with the datasets that I sampled? Are they stored somewhere that I overlooked?


It does look like only processed data were deposited for the samples in GSE29312.

written 4.7 years ago by genomax

I looked at some datasets generated on Illumina BeadChips, and it seems that what you are looking for is the second file (GSE29312_non-normalized.txt.gz). If I am not mistaken, those are the intensities measured by the scanner (similar to Affymetrix CEL files).

written 4.7 years ago by minio.cz
9 months ago by
Kevin Blighe67k
Republic of Ireland
Kevin Blighe67k wrote:

A late answer, but:

For some Illumina studies, the raw IDAT files are available. These can be read into an EListRaw object with limma's read.idat(), which relies on the illuminaio package. For other studies, where only a file of the form *_non-normalized.txt.gz is available, the table can be read into R using standard functions and then coerced to an EListRaw object manually.
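
As a rough illustration of the manual route, the sketch below (Python, standard library only) splits a non-normalized table into an intensity matrix and a detection p-value matrix, the two pieces that would end up in the E and other$Detection slots of an EListRaw object. The column layout (ID_REF, then one intensity column and one "Detection Pval" column per sample), the sample names, and the probe IDs are assumptions made for the example, so check the header of the actual file first.

```python
import csv
import io

# Hypothetical excerpt of a *_non-normalized.txt file.
# A real download would be opened with gzip.open(path, "rt") instead.
EXAMPLE = (
    "ID_REF\tSAMPLE_1\tDetection Pval\tSAMPLE_2\tDetection Pval\n"
    "ILMN_0000001\t120.5\t0.012\t98.1\t0.210\n"
    "ILMN_0000002\t5430.2\t0.000\t6120.9\t0.000\n"
)


def split_non_normalized(handle):
    """Return (probe_ids, sample_names, intensities, detection_pvals)."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # Assumed layout: every column after ID_REF is either an intensity
    # column (named after the sample) or a detection p-value column.
    signal_cols = [i for i in range(1, len(header)) if "Detection" not in header[i]]
    pval_cols = [i for i in range(1, len(header)) if "Detection" in header[i]]
    samples = [header[i] for i in signal_cols]

    probes, intensities, pvals = [], [], []
    for row in reader:
        if not row:
            continue  # skip blank lines
        probes.append(row[0])
        intensities.append([float(row[i]) for i in signal_cols])
        pvals.append([float(row[i]) for i in pval_cols])
    return probes, samples, intensities, pvals


probes, samples, intensities, pvals = split_non_normalized(io.StringIO(EXAMPLE))
```

The intensities are left on the raw scale here; background correction and normalisation would still happen afterwards.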

The annotation BGX file is simply a compressed, tab-delimited file. It can be read in manually, too.
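
To show what reading it manually could look like, here is a minimal Python sketch that pulls the probe table out of a gzip-compressed BGX file. It assumes the file is divided into bracketed sections such as [Heading], [Probes] and [Controls], with a header row at the top of the [Probes] section; the column names and rows in the example are made up and much shorter than a real BGX file.

```python
import csv
import gzip
import io


def read_bgx_probes(handle):
    """Collect the rows of the [Probes] section as dicts keyed by column name."""
    rows, header, in_probes = [], None, False
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        if fields[0].startswith("["):
            # A bracketed line starts a new section.
            in_probes = fields[0] == "[Probes]"
            header = None
            continue
        if not in_probes:
            continue
        if header is None:
            header = fields  # first line of the section holds column names
        else:
            rows.append(dict(zip(header, fields)))
    return rows


# Tiny made-up stand-in for a BGX file, gzip-compressed in memory.
text = (
    "[Heading]\nNumber of Probes\t2\n"
    "[Probes]\nProbe_Id\tSymbol\n"
    "ILMN_0000001\tGENE_A\nILMN_0000002\tGENE_B\n"
    "[Controls]\nProbe_Id\tReporter_Group_Name\n"
    "ILMN_9999999\tnegative\n"
)
blob = gzip.compress(text.encode("utf-8"))

with gzip.open(io.BytesIO(blob), "rt", encoding="utf-8", newline="") as fh:
    annotation = read_bgx_probes(fh)
```

For a downloaded file, `gzip.open` would be given the file path instead of the in-memory buffer.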

With your data as an EListRaw object, proceed with the advice in the limma user's guide (see section 17.3: https://bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf), which essentially involves normalisation via neqc() followed by further probe filtering.
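
The probe filtering step is often a detection p-value filter: keep a probe only if it is called "detected" in some minimum number of arrays. The Python sketch below shows that logic on a made-up matrix; the 0.05 cut-off and the 3-array minimum are illustrative assumptions rather than values taken from this thread, and the function name is hypothetical.

```python
def detected_probes(pvals, alpha=0.05, min_arrays=3):
    """Return one boolean per probe: True if the detection p-value is
    below `alpha` in at least `min_arrays` arrays."""
    return [sum(p < alpha for p in row) >= min_arrays for row in pvals]


# Made-up detection p-values: 3 probes (rows) x 4 arrays (columns).
detection = [
    [0.00, 0.01, 0.02, 0.30],  # detected in 3 arrays
    [0.40, 0.60, 0.01, 0.90],  # detected in 1 array
    [0.04, 0.04, 0.04, 0.04],  # detected in all 4 arrays
]

keep = detected_probes(detection)
kept_rows = [row for row, flag in zip(detection, keep) if flag]
```

A sensible minimum is usually tied to the size of the smallest experimental group, so that a probe expressed in only one group is not discarded.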

Kevin
