GEO - Cannot find processed gene expression data?
8.4 years ago

Hi all,

I would like to download the processed gene expression data from the following publication:

Brady RA, Bruno VM, Burns DL. RNA-Seq Analysis of the Host Response to Staphylococcus aureus Skin and Soft Tissue Infection in a Mouse Model. Subbian S, ed. PLoS ONE. 2015;10(4):e0124877. doi:10.1371/journal.pone.0124877.

This publication says the processed gene expression data have been submitted to http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE56227.

In my experience, GEO stores processed gene expression data in the Series Matrix File(s), between the "!series_matrix_table_begin" and "!series_matrix_table_end" tags.

In this case, however, these tags enclose only the sample identifiers and no expression values. The content is copied below.

!series_matrix_table_begin
"ID_REF"    "GSM1435114"    "GSM1435115"    "GSM1435116"    "GSM1435117"    "GSM1435118"    "GSM1435119"    "GSM1435120"    "GSM1435121"    "GSM1435122"    "GSM1435123"    "GSM1435124"    "GSM1435125"    "GSM1435126"    "GSM1435127"    "GSM1435128"    "GSM1435129"    "GSM1435130"    "GSM1435131"    "GSM1435132"    "GSM1435133"    "GSM1435134"    "GSM1435135"    "GSM1435136"    "GSM1435137"    "GSM1435138"    "GSM1435139"    "GSM1435140"    "GSM1435141"    "GSM1435142"    "GSM1435143"
!series_matrix_table_end
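
For anyone who wants to check this programmatically, here is a minimal Python sketch that counts the data rows between the two tags. The function name `count_matrix_rows` and the two-sample example text are made up for illustration; the only things taken from the file are the two tag names and the tab-separated, quoted header line.

```python
import io

def count_matrix_rows(lines):
    """Return (sample_ids, n_data_rows) for the table between the GEO tags."""
    in_table = False
    header = None
    n_rows = 0
    for raw in lines:
        line = raw.strip()
        if line == "!series_matrix_table_begin":
            in_table = True
            continue
        if line == "!series_matrix_table_end":
            break
        if not in_table or not line:
            continue
        # Fields are tab-separated and wrapped in double quotes.
        fields = [f.strip('"') for f in line.split("\t")]
        if header is None:
            header = fields      # first line in the table: ID_REF + sample IDs
        else:
            n_rows += 1          # every later line would be one probe/gene row
    samples = header[1:] if header else []
    return samples, n_rows

# Shortened, hypothetical stand-in for the real series matrix file.
example = "\n".join([
    '!Series_title\t"example"',
    "!series_matrix_table_begin",
    '"ID_REF"\t"GSM1435114"\t"GSM1435115"',
    "!series_matrix_table_end",
])
samples, n_rows = count_matrix_rows(io.StringIO(example))
print(samples, n_rows)
```

A row count of zero means the file lists the samples but carries no expression values, which is the situation described above.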

Does anybody have any idea what is going on here?

Is the processed gene expression data perhaps somewhere else?

Or have these data simply not been submitted to GEO?

Thanks,
Erno Lindfors

GEO RNA-Seq • 1.9k views
8.4 years ago
5utr ▴ 370

You can find the gene RPKM values for the different samples in GSE56227_RAW.tar in the download section.

The processed expression values are sometimes in the Series Matrix File, but that is usually the case only for array data.
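
Once the archive is downloaded, a short Python sketch like the one below can unpack it and show the first lines of a per-sample file. This is only a sketch under assumptions: the member name `GSM_demo_sample.txt.gz`, the demo archive and its single line of content are invented so the code runs on its own, and the helper names are hypothetical. For the real data you would point `extract_members` at your local copy of GSE56227_RAW.tar.

```python
import gzip
import io
import itertools
import tarfile
import tempfile
from pathlib import Path

def extract_members(tar_path, out_dir):
    """Extract every regular file in the archive into out_dir (flat layout)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            # Keep only the base name so members cannot escape out_dir.
            target = out_dir / Path(member.name).name
            with tar.extractfile(member) as src:
                target.write_bytes(src.read())
            written.append(target)
    return written

def peek(path, n=3):
    """Return up to n lines from a plain or gzip-compressed text file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as fh:
        return [line.rstrip("\n") for line in itertools.islice(fh, n)]

# Build a tiny stand-in archive so the sketch is runnable as-is.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    payload = gzip.compress(b"chr1\t100\t200\tENSMUSG00000025909\n")
    tar_path = tmp / "demo_RAW.tar"
    with tarfile.open(tar_path, "w") as tar:
        info = tarfile.TarInfo("GSM_demo_sample.txt.gz")  # hypothetical name
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    extracted = extract_members(tar_path, tmp / "out")
    extracted_names = [p.name for p in extracted]
    first_lines = peek(extracted[0])

print(extracted_names)
print(first_lines)
```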


Thanks Gian for your response.

My next question: where can I find the column headers for the files in the GSE56227_RAW.tar package?

These files appear to contain ten columns, but I cannot find any explanation of their content.

Most importantly, I need to know which column contains the RPKM values and which column(s) contain the gene identifiers.

As for the gene identifiers, I would guess that the fourth column (e.g. ENSMUSG00000042429, ENSMUSG00000025909) contains Ensembl gene identifiers, right?

Are there possibly other gene identifiers in other columns as well?

Or is this information simply not available anywhere in GEO?
If so, perhaps the only way to get a definite answer is to contact the corresponding author.
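
Without documented headers, one way to narrow things down is to classify each column by what its values look like. The Python sketch below is only a heuristic: the pattern `ENSMUSG` followed by 11 digits matches the two identifiers quoted above, the demo rows are invented apart from those two identifiers, and the real files have ten columns rather than six. A column flagged as numeric is just a candidate for RPKM (it could equally be a coordinate, length or read count), so this does not replace confirmation from the authors.

```python
import re

# Mouse Ensembl gene IDs as they appear in the post: ENSMUSG + 11 digits.
ENSEMBL_MOUSE_GENE = re.compile(r"^ENSMUSG\d{11}$")

def is_number(value):
    """True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def describe_columns(rows):
    """Label each column as 'ensembl_gene_id', 'numeric' or 'text'."""
    n_cols = len(rows[0])
    report = []
    for i in range(n_cols):
        values = [row[i] for row in rows]
        if all(ENSEMBL_MOUSE_GENE.match(v) for v in values):
            report.append("ensembl_gene_id")
        elif all(is_number(v) for v in values):
            report.append("numeric")
        else:
            report.append("text")
    return report

# Hypothetical rows; only the two Ensembl IDs come from the post.
demo_rows = [
    ["chr1", "3000", "4000", "ENSMUSG00000042429", "+", "12.5"],
    ["chr1", "5000", "6000", "ENSMUSG00000025909", "-", "0.0"],
]
column_kinds = describe_columns(demo_rows)
print(column_kinds)
```

With real data, `rows` would come from splitting each line of an extracted file on tabs.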

I am a newbie to RNA-seq, so please forgive me if I am asking something trivial.
