Question: Finding genes differently expressed in time series data
0
gravatar for F
3 months ago by
F3.0k
Iran
F3.0k wrote:

Hi,

I have Control cell line day1 to day 26 and Lead(pb) treatment day1 to day26. The experimental design aims to get the impact of Lead(pb) on developmental process .I must look if there are group of developmental genes that cluster differently in the lead(pb) experiment. For instance some genes are differentially expressed directly or others are coming in later time points. First I thought about WGCNA (figure) https://ibb.co/j9O5YS in which I think genes in significant modules in days 1-3 make sense for me. Then I thought about SOM clustering but clusters don't show any up or down regulation in day points

https://ibb.co/jEBW7n

Any help please?

rna-seq R gene • 298 views
ADD COMMENTlink written 3 months ago by F3.0k
1

It depends on which definition of cluster you're interested in. For time series, I would compute a distance matrix based on dynamic time warping (see for example the R package dtw, be sure to read the FAQ down the page) and use this matrix as input for a clustering algorithm.

ADD REPLYlink written 3 months ago by Jean-Karim Heriche15k

Thanks a lot, for example in picture A (cluster 28 with 103 genes), genes toward week 9 are being up regulated and in cluster 3 with 203 genes to the end of development down regulated.

https://ibb.co/h7N2yS

ADD REPLYlink modified 3 months ago • written 3 months ago by F3.0k
1

What I had in mind was for you to compute a distance matrix using dtw for each condition separately then use this for simultaneous clustering using for example a tensor factorization method as I wrote in a recent answer which you've seen.

ADD REPLYlink written 3 months ago by Jean-Karim Heriche15k

Thanks a lot for sharing your idea. I have 52 samples (without replicates); control_day1 vs lead_day1 to control_day26 vs lead_day26. Then by dtw I must calculate 26 distance matrices for each control_day vs lead_day?????

Thanks again to help people to figure out

ADD REPLYlink written 3 months ago by F3.0k

Sorry, this is my control and treatment matrices

> head(control[,1:4])
               MAST2     WWC2  PHYHIPL   R3HDM2
Control_D1  6.591024 5.695156 3.388652 5.756384
Control_D10 8.043454 5.365221 6.859768 6.936970
Control_D11 7.731590 4.868267 6.919972 6.931073
Control_D12 8.129948 5.105528 6.627016 7.090268
Control_D13 7.690863 4.729501 6.824746 6.904610
Control_D14 8.101723 5.334501 6.868990 7.115883
> 

> head(lead[,1:4])
              MAST2     WWC2  PHYHIPL   R3HDM2
Lead30_D1  6.418423 5.610699 3.734425 5.778046
Lead30_D10 7.918360 4.295191 6.559294 6.780952
Lead30_D11 7.807142 4.294722 6.599187 6.716040
Lead30_D12 7.856720 4.432136 6.572337 6.848483
Lead30_D13 7.827311 4.204738 6.607107 6.784094
Lead30_D14 7.848760 4.458451 6.581216 6.943003
>

library(dtw)

Control=dist(control,method="DTW");

Lead=dist(lead,method="DTW");

I googled too much but too confusing, is this distance matrices by dtw you suggested for tensor factorization ??

ADD REPLYlink written 11 weeks ago by F3.0k

Don't you get an error ? dist() is just the standard distance function in R, it doesn't know anything about dtw. Start by reading the dtw doc. What you need to do is apply the dtw() function to each pair of items for which you want the dtw distance and extract the distance from the resulting data structure e.g.

  for(i ...) {
    for(j ...) {
      alignment <- dtw(item[i],item[j])
      distance[i, j] <- alignment$distance
    }
}

You may need to adjust to your needs, maybe use a symmetric version or the normalized distance. Check the doc.

ADD REPLYlink modified 11 weeks ago • written 11 weeks ago by Jean-Karim Heriche15k

Thank you, since creating this post I am googling for dtw but seems to complex. I did not get error and gives me a class 'dist' atomic

you mean I should put control and lead in this code?

for(i in control) {
    for(j in lead) {
      alignment <- dtw(item[i],item[j])
      distance[i, j] <- alignment$distance
    }
}

That gave me

Error in dtw(item[i], item[j]) : object 'item' not found

and this

for (i in row.names(control)) { 
  for (j in row.names(lead)) { 
    result[i,j] <- dtw( dist(x[,,i],x[,,j]), distance.only=T )$normalizedDistance }}

Gave

Error in x[, , j] : incorrect number of dimensions
ADD REPLYlink modified 11 weeks ago • written 11 weeks ago by F3.0k

item was just an example variable. Replace by what's appropriate in your case (control and lead maybe).

ADD REPLYlink written 11 weeks ago by Jean-Karim Heriche15k

Sorry, Finally I was able to run your tutorial on my data.

I used two separate weighted adjacency matrices of control and treatment data (each 26 samples and 50 genes and I used genes as nodes).

my output matrices are not identical rather I can't understand how use the output to find which parts of the treatment-responses are specific to the treatment network?

https://ibb.co/f6DWen

 [[3]]
              [,1]       [,2]       [,3]       [,4]
    [1,] 0.5466689 -0.5357849 -0.4886675 -0.4788609
    [2,] 0.4533311 -0.4642151 -0.5113325 -0.5211391

How can I use this matrix to find responsive genes? Thank you for your time

ADD REPLYlink modified 11 weeks ago • written 11 weeks ago by F3.0k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 619 users visited in the last hour