Question: What Is The Best Way To Import GEO Files Into TM4/MeV (Extended)
9.0 years ago
Lyco wrote:

Well, the title says it all. I am aware of the fact that there are different platforms in GEO, which use different file formats. For simplicity, let's stick with the GSExxx files of the Affy platform(s), which are the most frequent.

At the moment, I download the series matrix files, import them into Excel, identify a row that would make a decent header (sample description), remove the tons of other header rows, save everything as a text file, and voila. I have the feeling that there should be a better way to accomplish that.

I have asked on the MeV forum but did not get a satisfactory answer. Maybe I'll get lucky here.

Edit: Maybe I should ask more precisely. I am well aware of the possibility to import generic tab-delimited files and to specify where the data begins. However, I don't consider this solution satisfactory, as it leaves me with 77 different header types, none of which is really suited for display at the top of the heatmap columns. (T)MeV does offer 4 loaders specifically dedicated to importing GEO data, but none of them appears practical:

For the typical Affy project, GEO offers the data in 3 formats: SOFT format, MINiML format and 'Series Matrix' format.

  • Two of MeV's GEO import filters require 'GPL family format', which is not provided by GEO (or is it?)

  • One filter imports 'Series Matrix' files, which are provided. However, when using this filter, the experiments have no meaningful names but are just called 'GSM12345'. The real sample description (which is written somewhere in the series matrix file) is ignored. Moreover, the probes are not identified by gene names or Affy IDs, but just have numbers 1, 2, 3... Not very helpful.

  • The last filter requires 'GDS format files' which are also not provided by GEO (or which I am not able to find).

There is a MeV documentation file (MeV47_0/documentation/manual/Loading-geo-data.html) which recommends 'After Expression File Loader dialog is launched, SOFT Affymetrix format file can be loaded by selecting the GEO SOFT Affymetrix file loader option from the list of available file formats to load', but again, I am not able to find the SOFT file loader in MeV 4.7 (or any of the previous versions).

9.0 years ago
Istvan Albert wrote:

TM4 supports a text file import format in which you can specify where the first data row is located. See the instructions at the bottom (red color):

written 9.0 years ago by Istvan Albert

Thank you. I noticed that I did not explain my question well enough (I have now edited it to make it clearer). I am aware of this possibility, but it does not work for GEO files. The first lines contain only data in the first two fields, which makes the importer assume that all rows have only two fields (this may depend on the GEO project, but I tried it with several Affy GSE projects).

written 9.0 years ago by Lyco

In that case I think your best option is to create a simple script that transforms a file to your desired format (you could also ask here by posting the two relevant bits of these formats source -> target).
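A minimal sketch of such a script, in Python, assuming a decompressed 'Series Matrix' file in which the sample descriptions sit on a `!Sample_title` line and the data table is enclosed by `!series_matrix_table_begin` and `!series_matrix_table_end`. The function names and the file names in the usage note are made up for illustration. It keeps the probe IDs from the `ID_REF` column, replaces the GSM accessions in the header with the sample titles, and drops all other header rows, so the result should be loadable as a plain tab-delimited file:

```python
import csv


def series_matrix_to_table(lines):
    """Return (header, rows) from the lines of a GEO Series Matrix file.

    The header uses the sample titles as column names; if the number of
    titles does not match the number of data columns, the GSM accessions
    from the table's own header row are kept instead.
    """
    titles = []
    header = None
    rows = []
    in_table = False
    # Fields are tab-separated and usually double-quoted; csv strips the quotes.
    for fields in csv.reader(lines, delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        tag = fields[0]
        if tag == "!Sample_title":
            titles = fields[1:]
        elif tag == "!series_matrix_table_begin":
            in_table = True
        elif tag == "!series_matrix_table_end":
            in_table = False
        elif in_table:
            if header is None:
                # First row of the table: ID_REF followed by GSM accessions.
                accessions = fields[1:]
                names = titles if len(titles) == len(accessions) else accessions
                header = [fields[0]] + names
            else:
                rows.append(fields)
    if header is None:
        raise ValueError("no series matrix table found in input")
    return header, rows


def convert(in_path, out_path):
    """Write a single-header tab-delimited copy of a Series Matrix file."""
    with open(in_path, newline="") as src:
        header, rows = series_matrix_to_table(src)
    with open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)
```

Usage would be something like `convert("GSExxx_series_matrix.txt", "GSExxx_for_mev.txt")`. If another header line (for example one of the `!Sample_characteristics_ch1` rows) describes the samples better, the `!Sample_title` tag in the sketch can be swapped for it.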

written 9.0 years ago by Istvan Albert
