Question

Correlation network analysis

3.4 years ago

Lepomis_8 ▴ 30

I have a CSV file of differentially expressed genes. The data is already normalized as TPM. I have two treatments, with 12 samples. All up-regulated and down-regulated genes are included because I also want to find inversely correlated genes. Currently, the data is formatted like this:

Name   Sample1  Sample2  Sample3  Sample4  Sample5  Sample6  Sample7  ...
gene1  5.6      12.0     0.0      0.5      0.8      0.6      0.0      0.0
gene2  1.4      0.0      0.0      0.0      0.0      0.0      0.3      0.0
gene3  52.5     58.9     1.5      3.5      1.9      2.4      2.1      1.5
gene4  11.1     0.0      0.0      0.1      0.0      0.0      0.3      0.1
gene5  6.1      39.8     3.5      6.5      4.7      0.8      0.8      0.8
gene6  36.9     40.0     2.9      2.0      1.6      9.1      5.2      2.0
gene7  107.5    321.3    1.0      0.4      1.7      0.8      0.6      0.3

However, I have 4,300 genes. Most are mRNA, but some are long non-coding RNA (lncRNA). I am trying to use the Cytoscape plugin ExpressionCorrelation to create a network and split the genes into clusters based on their correlation, but I'm not sure it's working correctly. Once I'm done, I plan to take the different clusters and do functional annotation, then take the nodes with high centrality and BLAST them against a transcription factor database.

After building the gene network (using the preview histogram), I chose -0.8 and 0.8 as cutoffs. Then, under Tools, I click Analyze Network and use subnetwork creation (analyze connected components). When I do this, the network splits into 8 subnetworks, but one of them contains 4,290 nodes! So it didn't do a good job of creating subnetworks.
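For reference, the thresholding step described above can be sketched outside Cytoscape. This is a minimal illustration in plain Python, not the ExpressionCorrelation plugin's own implementation; the function names (`pearson`, `correlation_components`) and the toy profiles are made up for the example. It links two genes when |r| >= 0.8 and then groups genes by connected components. Because connectivity is transitive, a single chain of strong correlations is enough to merge genes into one component, which is one plausible reason a list of differentially expressed genes ends up in one very large subnetwork.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [v - mx for v in x]
    dy = [v - my for v in y]
    sxx = sum(v * v for v in dx)
    syy = sum(v * v for v in dy)
    if sxx == 0 or syy == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(sxx * syy)

def correlation_components(profiles, cutoff=0.8):
    """Link genes with |r| >= cutoff, then return the connected components."""
    genes = list(profiles)
    parent = {g: g for g in genes}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            # abs() keeps both positive and inverse correlations as edges.
            if abs(pearson(profiles[a], profiles[b])) >= cutoff:
                parent[find(a)] = find(b)

    groups = {}
    for g in genes:
        groups.setdefault(find(g), set()).add(g)
    return sorted(groups.values(), key=len, reverse=True)

# Toy profiles (hypothetical values, not taken from the table above).
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # r = +1 with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # r = -1 with geneA
    "geneD": [1.0, 3.0, 3.0, 1.0],  # r = 0 with geneA
}
components = correlation_components(toy)
```

In this toy case geneA, geneB, and geneC fall into one component and geneD stays alone. A module-detection method that looks at how densely genes are connected, rather than whether any path exists between them, would typically split a large component further.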

Is there something I'm doing wrong? I haven't added any phenotypic data; all samples are analyzed together. Ideally, I'd like to add layers for visualization, such as different colors for lncRNA and mRNA, but my understanding is that this is not needed yet, since the network should be built in an unbiased way.

I can provide additional info if it would help, please let me know!

rna-seq • 859 views

ADD COMMENT • link 3.4 years ago by Lepomis_8 ▴ 30

What is your objective?

Are you trying to make clusters from expression data?

There are plugins such as MCODE, ClusterViz, and clusterMaker available in Cytoscape for finding clusters.

Otherwise, you can apply the K-means algorithm to create the desired number of clusters.
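As a rough illustration of the K-means suggestion, the sketch below z-scores each gene's profile and then runs a small plain-Python K-means. The function names (`zscore`, `kmeans`) and the toy values are hypothetical, and seeding the centroids with the first k points is an arbitrary simplification; in practice a library implementation with several random starts would be the usual choice.

```python
from math import sqrt

def zscore(profile):
    """Center and scale one gene's profile so clustering compares shape, not magnitude."""
    n = len(profile)
    mean = sum(profile) / n
    sd = sqrt(sum((v - mean) ** 2 for v in profile) / n)
    return [0.0] * n if sd == 0 else [(v - mean) / sd for v in profile]

def nearest(point, centroids):
    """Index of the centroid with the smallest squared Euclidean distance."""
    dists = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return dists.index(min(dists))

def kmeans(points, k, iterations=50):
    """Plain K-means; the first k points seed the centroids (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy data: three "untreated" samples followed by three "treated" samples.
toy = {
    "up1":   [1.0, 1.0, 1.0, 9.0, 9.0, 9.0],
    "down1": [9.0, 9.0, 9.0, 1.0, 1.0, 1.0],
    "up2":   [2.0, 2.0, 2.0, 20.0, 20.0, 20.0],
    "down2": [30.0, 30.0, 30.0, 5.0, 5.0, 5.0],
}
labels = kmeans([zscore(p) for p in toy.values()], k=2)
clusters = dict(zip(toy, labels))
```

Note that with z-scored profiles and Euclidean distance, inversely correlated genes land in different clusters, so they would have to be paired up afterwards by comparing cluster centroids.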

ADD REPLY • link 3.4 years ago by DareDevil ★ 4.3k

Thank you!

Yes, I’m trying to cluster the genes by expression to try to find coexpressed or “related” genes. I’m basically trying to filter them into manageable groups, because trying to do target prediction now is impossible with this many genes.

If I want to compare expression data and then investigate colocalized genes (lncRNA and mRNA that are on the same chromosome and 1 kb apart), what would be the best way to do that?
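One way to picture the colocalization step is a simple coordinate comparison. The sketch below is only an illustration under stated assumptions: it reads "1 kb apart" as "separated by at most 1,000 bp", it assumes each feature is available as a `(name, chromosome, start, end)` tuple, and the function names and coordinates are invented for the example. Strand is ignored.

```python
def gap(a_start, a_end, b_start, b_end):
    """Distance in bp between two intervals; 0 if they overlap."""
    return max(0, max(a_start, b_start) - min(a_end, b_end))

def colocalized_pairs(lncrnas, mrnas, max_gap=1000):
    """Return (lncRNA, mRNA, gap) for pairs on the same chromosome within max_gap bp."""
    # Index mRNAs by chromosome so each lncRNA is only compared to same-chromosome genes.
    by_chrom = {}
    for name, chrom, start, end in mrnas:
        by_chrom.setdefault(chrom, []).append((name, start, end))

    pairs = []
    for l_name, chrom, l_start, l_end in lncrnas:
        for m_name, m_start, m_end in by_chrom.get(chrom, []):
            d = gap(l_start, l_end, m_start, m_end)
            if d <= max_gap:
                pairs.append((l_name, m_name, d))
    return pairs

# Hypothetical coordinates for illustration only.
lncrnas = [("lnc1", "chr1", 1000, 2000), ("lnc2", "chr2", 5000, 6000)]
mrnas = [
    ("mrnaA", "chr1", 2500, 4000),    # 500 bp downstream of lnc1
    ("mrnaB", "chr1", 10000, 12000),  # 8,000 bp away from lnc1
    ("mrnaC", "chr2", 5500, 7000),    # overlaps lnc2
    ("mrnaD", "chr3", 1000, 2000),    # different chromosome
]
pairs = colocalized_pairs(lncrnas, mrnas)
```

The resulting pairs could then be checked against the expression correlations, for example by looking up the correlation of each lncRNA with its neighboring mRNA.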

ADD REPLY • link 3.4 years ago by Lepomis_8 ▴ 30

I would suggest using WGCNA to create modules based on co-expression. You can explore it here

Then, perform a Gene Ontology (GO) enrichment analysis on the clusters obtained from WGCNA.
Consider the GO terms that fit your disease or conditions of interest.
Perform the downstream analysis on the genes filtered by GO terms.
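The GO step above is commonly framed as an over-representation test. As a hedged sketch of that idea (not the method of any particular GO tool), the function below computes a one-sided hypergeometric p-value for a single term; the function name and the counts in the example are invented, and a real analysis would test many terms and correct for multiple testing.

```python
from math import comb

def enrichment_pvalue(n_universe, n_annotated, n_module, n_overlap):
    """P(X >= n_overlap) when n_module genes are drawn without replacement
    from n_universe genes, of which n_annotated carry the GO term."""
    total = comb(n_universe, n_module)
    upper = min(n_annotated, n_module)
    tail = sum(
        comb(n_annotated, k) * comb(n_universe - n_annotated, n_module - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: a background of 4,300 genes, 120 of them annotated with
# the term, and a module of 200 genes that contains 15 annotated genes.
p = enrichment_pvalue(4300, 120, 200, 15)
```

The choice of background matters: the example uses a hypothetical background of 4,300 genes only to keep the numbers concrete.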

Note: Upvoting the answer is recommended if it has helped you

ADD REPLY • link 3.4 years ago by DareDevil ★ 4.3k

I have tried WGCNA; however, I don't have any phenotypic data to associate with the samples other than treated vs. untreated.

ADD REPLY • link 3.3 years ago by Lepomis_8 ▴ 30