Question: Looking to get PATTERN CLUSTERING of gene expression data with multiple time points
Hushus wrote (2.6 years ago):

Hello all,

This is the third thread on this issue, and we have made steady progress. I sincerely appreciate the input of cpad0112 and h.mon.

Have:
* RNA expression data for 12 time points, 1 replicate
* List of genes of interest and their expression data: http://m.uploadedit.com/bbtc/1512924884525.txt [sample data, does not contain all 100 genes]

Want: Expression pattern clustering like this, where the x axis is time and the y axis is relative expression (or something like it). [image]

Have tried:
* timecourse (R package), which apparently does not accept data with only 1 replicate [C: Cannot get result from TIMECOURSE (R PROGRAM)]

Question: Do you know of ANOTHER program that can achieve a similar result?

Possible solutions:
1) Fit a polynomial trend line to each gene's expression data (one row per gene), then cluster the genes manually according to the fitted curves. Problem: I do not know how to get the polynomial equation for each row of data using R.
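For the polynomial idea above, a minimal sketch in base R follows. It fits one polynomial per gene with lm() and poly() and collects the coefficients in a matrix. The toy matrix `expr`, the names `time_pts` and `poly_coefs`, and the degree of 3 are all assumptions made for illustration; they are not taken from the linked file.

```r
# Hypothetical toy data: 5 genes (rows) x 12 time points (columns).
set.seed(1)
time_pts <- 1:12
expr <- t(sapply(1:5, function(i) sin(time_pts / 2 + i) + rnorm(12, sd = 0.1)))
rownames(expr) <- paste0("gene", 1:5)

degree <- 3  # assumed polynomial degree; choose one that suits your data

# Fit y = b0 + b1*t + b2*t^2 + b3*t^3 to each gene and keep the coefficients.
poly_coefs <- t(apply(expr, 1, function(y) {
  coef(lm(y ~ poly(time_pts, degree, raw = TRUE)))
}))
colnames(poly_coefs) <- paste0("b", 0:degree)

print(round(poly_coefs, 3))
```

Each row of `poly_coefs` is the fitted equation for one gene. The rows could then be grouped by eye or passed to a clustering function, although raw coefficients sit on very different scales, so clustering the expression profiles directly may be simpler.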

dariober wrote (2.6 years ago):

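A simple way to go about it may be k-means clustering, for example:

dat <- read.table('http://m.uploadedit.com/bbtc/1512924884525.txt')
set.seed(1) # k-means starts from random centers, so fix the seed for reproducible results
kmeans(dat, centers = 3) # 3 clusters is only a starting guess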

To get a reasonable guess for the number of clusters, you could look at the between_SS / total_SS statistic that kmeans() reports: start with a small number of clusters and increase it until between_SS / total_SS stops increasing sharply.
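The search for a reasonable number of clusters can be sketched as a simple loop. This is only an illustration on simulated data: the matrix `expr`, the three underlying shapes, and the range of k values tried are assumptions, not properties of the real data set.

```r
# Hypothetical toy data: three groups of 10 genes, 12 time points each.
set.seed(1)
time_pts <- 1:12
shapes <- rbind(sin(time_pts / 2), time_pts / 12, -time_pts / 12)
expr <- shapes[rep(1:3, each = 10), ] + matrix(rnorm(30 * 12, sd = 0.1), nrow = 30)

# Run k-means for several values of k and record between_SS / total_SS.
ks <- 2:8
ratio <- sapply(ks, function(k) {
  km <- kmeans(expr, centers = k, nstart = 20)  # nstart: several random starts per k
  km$betweenss / km$totss
})

print(data.frame(k = ks, between_over_total = round(ratio, 3)))
```

The value of k after which the ratio gains little is a candidate for the number of clusters. On real data the bend is often less clear than on simulated data, so treat it as a guide rather than a rule.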

Another useful method may be non-negative matrix factorization as implemented in e.g. the NMF package. The vignettes of NMF are quite useful. An example could be:

library(NMF)
dat<- read.table('http://m.uploadedit.com/bbtc/1512924884525.txt')
xnmf <- nmf(dat, rank = 4) # Again, you need to "guess" the rank, i.e. the number of "pseudogenes"
xnmf@fit@H # the H (coefficient) matrix; coef(xnmf) returns the same matrix

The H matrix essentially gives you a small set of "pseudogenes" that together describe the (much larger) full set of genes well.


Oooo thank you for your detailed input. I will definitely try this out.

How would you plot the results?

(reply written 2.6 years ago by Hushus)
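One possible way to plot k-means results in base R is one panel per cluster, with time on the x axis and expression on the y axis. The sketch below uses simulated data; `expr`, `time_pts`, and the choice of 3 clusters are assumptions made for illustration.

```r
# Hypothetical toy data: three groups of 10 genes, 12 time points each.
set.seed(1)
time_pts <- 1:12
shapes <- rbind(sin(time_pts / 2), time_pts / 12, -time_pts / 12)
expr <- shapes[rep(1:3, each = 10), ] + matrix(rnorm(30 * 12, sd = 0.1), nrow = 30)

km <- kmeans(expr, centers = 3, nstart = 20)

# One panel per cluster: grey lines are member genes, the red line is the cluster center.
op <- par(mfrow = c(1, 3))
for (k in 1:3) {
  members <- expr[km$cluster == k, , drop = FALSE]
  matplot(time_pts, t(members), type = "l", lty = 1, col = "grey60",
          ylim = range(expr), xlab = "time point", ylab = "expression",
          main = paste0("cluster ", k, " (n = ", nrow(members), ")"))
  lines(time_pts, km$centers[k, ], col = "red", lwd = 2)
}
par(op)
```

With real data, `expr` would be the numeric matrix read from the file and `time_pts` the actual sampling times.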
Hushus wrote (2.6 years ago):

Answer: download the MeV software and use k-means clustering with Pearson correlation as the distance metric.

You get the following: [image]

Good enough for me.
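For anyone who prefers to stay in R, something similar to k-means on Pearson correlation can be approximated by z-scoring each gene across time points and then running ordinary k-means. For z-scored rows, the squared Euclidean distance between two genes equals 2 * (n - 1) * (1 - r), where r is their Pearson correlation and n is the number of time points. This is a sketch on simulated data and is not claimed to reproduce MeV's implementation; `expr` and the choice of 3 clusters are assumptions.

```r
# Hypothetical toy data: three shapes, 10 genes each, with very different absolute levels.
set.seed(1)
time_pts <- 1:12
shapes <- rbind(sin(time_pts / 2), time_pts / 12, -time_pts / 12)
expr <- shapes[rep(1:3, each = 10), ] * runif(30, 1, 10) +
  matrix(rnorm(30 * 12, sd = 0.05), nrow = 30)

# z-score each gene (row): mean 0 and sd 1 across time points.
z <- t(scale(t(expr)))

# Check the identity on genes 1 and 2: ||z1 - z2||^2 == 2 * (n - 1) * (1 - r).
n <- ncol(expr)
d2 <- sum((z[1, ] - z[2, ])^2)
r <- cor(expr[1, ], expr[2, ])
stopifnot(isTRUE(all.equal(d2, 2 * (n - 1) * (1 - r))))

# Ordinary k-means on the z-scored profiles groups genes by shape, not by level.
km <- kmeans(z, centers = 3, nstart = 20)
print(table(km$cluster))
```

The z-scored values also give a "relative expression" y axis, so genes with the same pattern but different magnitudes overlay cleanly when plotted.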
