Combining microarray expression data
2.1 years ago
SnehaS • 0

Hello Fellow Scientists,

I have 5 microarray datasets from different platforms. Each dataset has disease and healthy samples. A few datasets have only 4 disease and 3 healthy samples, while others have more. I wanted to run ML algorithms on them, and since ML requires a large number of samples, I was looking for a way to combine these datasets. Here is what I did, and I would like to know whether this approach is sound.

  1. I concatenated the expression matrices (gcrma / neqc normalized) of all datasets into one, keeping only the genes measured in all of them. This gave around 8000 genes as rows and 200 samples as columns.
  2. I used the scale() function in R to convert the expression values into z-scores.
  3. I then used this z-score matrix and a few gene signatures as input for GSVA.
  4. The GSVA output (gene signatures as rows, samples as columns, enrichment scores between -1 and 1) was used as input for ML.
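
To make steps 1 and 2 concrete, here is a minimal sketch in plain Python (standard library only) rather than the actual R code, which is not shown in this post. The toy datasets, gene names, and helper functions are all hypothetical. One detail it illustrates: R's scale() standardizes each column by default, so on a genes-by-samples matrix it z-scores each sample across genes. Z-scoring each gene across samples needs the transpose, e.g. t(scale(t(x))) in R. Which orientation is appropriate here is left open.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize a vector to mean 0 and sample SD 1 (n - 1 denominator,
    the same convention R's scale() uses). Assumes non-constant input."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def combine_common_genes(datasets):
    """Step 1: keep genes present in every dataset, concatenate samples.

    `datasets` is a list of dicts mapping gene name -> list of per-sample
    expression values (a hypothetical layout chosen for this sketch).
    Returns (gene_names, matrix) with genes as rows, samples as columns.
    """
    common = sorted(set.intersection(*(set(d) for d in datasets)))
    matrix = [[v for d in datasets for v in d[g]] for g in common]
    return common, matrix


def zscore_columns(matrix):
    """Z-score each column (sample): what scale(x) does by default."""
    cols = [zscore(list(col)) for col in zip(*matrix)]
    return [list(row) for row in zip(*cols)]


def zscore_rows(matrix):
    """Z-score each row (gene) across samples: like t(scale(t(x)))."""
    return [zscore(row) for row in matrix]


# Toy, made-up data: two "platforms" sharing genes G1 and G2.
platform_a = {"G1": [1.0, 2.0, 3.0], "G2": [2.0, 4.0, 6.0], "G9": [0.5, 0.1, 0.2]}
platform_b = {"G1": [5.0, 7.0], "G2": [1.0, 3.0], "G3": [9.0, 8.0]}

genes, combined = combine_common_genes([platform_a, platform_b])
per_sample = zscore_columns(combined)  # one z-score vector per sample
per_gene = zscore_rows(combined)       # one z-score vector per gene
```

The two orientations give different matrices on the same input, so it is worth checking which one a pipeline actually applies before passing the result to GSVA.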

Is this method correct? What are some other ways to run ML algorithms on gene expression data? The goal of running ML is to find genes / gene signatures that separate disease from healthy samples.

Microarray GSVA Z-score • 604 views
2.1 years ago

Hi SnehaS, I provide some generic guidance here: How to integrate multiple data sets from microarray platform prior meta-analysis?

Kevin

Thank you Kevin

You are welcome, SnehaS
