Question: Extracting important genes with similar expression levels from multiple samples
bharata1803 wrote:

Hello all,

After collecting several iPSC RNA-seq samples from independent NCBI GEO datasets, I think that not all iPSCs have a similar gene expression profile. I have also read in papers that iPSC gene expression may vary.

So I collected 4-5 iPSC samples from multiple NCBI GEO datasets. They all come from RNA-seq experiments on iPSC reprogramming and are independent of each other.

My goal is simple: extracting the genes that are similar across these different, independent samples. My hypothesis is that even with different iPSC states, there is an underlying common mechanism that shows up in gene expression levels. If we can obtain those similar genes, we can conclude that they are important for the pluripotency of iPSCs, while genes whose expression levels vary among samples are not important.

What kind of feature extraction would be useful to obtain these genes?

One method I can think of is to compute, for each gene, the distance between every possible pair of samples. For example, with 3 samples A, B, C and a gene G, I would compute the distance for gene G in A vs B, A vs C, and B vs C. If all the distances are small, gene G is selected.
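
A minimal sketch of this pairwise-distance idea, assuming the expression values are already normalized to a comparable scale across datasets (for example TPM). The gene names, the values, the log2(x + 1) transform, and the threshold of 0.5 are all illustrative assumptions, not values from any real dataset.

```python
from itertools import combinations
from math import log2


def max_pairwise_distance(values):
    """Largest absolute difference between any two samples for one gene."""
    return max(abs(a - b) for a, b in combinations(values, 2))


def select_stable_genes(expression, threshold):
    """Keep genes whose log2 expression differs by at most `threshold`
    in every pairwise sample comparison (A vs B, A vs C, B vs C, ...)."""
    selected = []
    for gene, values in expression.items():
        logged = [log2(v + 1) for v in values]  # pseudocount avoids log2(0)
        if max_pairwise_distance(logged) <= threshold:
            selected.append(gene)
    return selected


# Hypothetical expression values for samples A, B, C.
expression = {
    "GENE_1": [100.0, 110.0, 95.0],
    "GENE_2": [5.0, 80.0, 300.0],
    "GENE_3": [20.0, 22.0, 19.0],
}
stable = select_stable_genes(expression, threshold=0.5)
print(stable)
```

Requiring every pair to pass is the strictest reading of "the distance is small"; using the mean pairwise distance instead would tolerate a single outlier sample.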

modified 4 weeks ago by piyushjo • written 4 weeks ago by bharata1803

The wrote:

How about selecting the genes with the least variance across samples? Otherwise, some clustering will give you genes with similar expression.
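
As a rough sketch of the low-variance approach, the snippet below ranks genes by the sample variance of their log2 expression. The data, the gene names, and the `min_mean` filter are assumptions added for illustration; the filter is there because genes that are simply not expressed in any sample also have near-zero variance.

```python
from math import log2
from statistics import mean, variance


def rank_by_variance(expression, min_mean=1.0):
    """Return gene names sorted from least to most variable.

    Genes whose mean log2 expression is below `min_mean` are dropped,
    since unexpressed genes are trivially "stable".
    """
    rows = []
    for gene, values in expression.items():
        logged = [log2(v + 1) for v in values]
        if mean(logged) < min_mean:
            continue
        rows.append((variance(logged), gene))
    return [gene for _, gene in sorted(rows)]


# Hypothetical normalized expression values across four samples.
expression = {
    "GENE_A": [50.0, 52.0, 51.0, 49.0],
    "GENE_B": [10.0, 200.0, 40.0, 90.0],
    "GENE_C": [30.0, 45.0, 25.0, 38.0],
    "GENE_D": [0.0, 0.0, 1.0, 0.0],
}
ranked = rank_by_variance(expression)
print(ranked)
```

Taking the top of this ranking gives the candidate genes; where to cut the list is a judgment call that this sketch does not try to answer.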

modified 4 weeks ago • written 4 weeks ago by The

piyushjo wrote:

You can try WGCNA. It will help you select non-linearly correlated genes. I am just concerned about how small your sample cohort is.

https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/Rpackages/WGCNA/

For your second analysis you can use a mutual-information-based analysis. For that you can use the mrnet package, ARACNE, or the mutual information plugin in Cytoscape.

http://apps.cytoscape.org/apps/cynitoolbox

http://apps.cytoscape.org/apps/aracne
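
To make the mutual information idea concrete, here is a small standard-library sketch that estimates the mutual information between two genes from equal-width binned expression values. It is not the algorithm used by the tools linked above; the binning scheme, the bin count, and the example vectors are assumptions, and with only a handful of samples any such estimate is very noisy.

```python
from collections import Counter
from math import log2


def discretize(values, bins=3):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=3):
    """Plug-in estimate of mutual information (in bits) between x and y."""
    bx, by = discretize(x, bins), discretize(y, bins)
    n = len(bx)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )


# Hypothetical expression of two genes across six samples.
gene_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_y = [7.0, 7.0, 7.0, 7.0, 7.0, 7.0]  # constant, so it carries no information

print(mutual_information(gene_x, gene_x))
print(mutual_information(gene_x, gene_y))
```

Unlike a correlation coefficient, this score does not assume a linear relationship, which is the reason to consider it here.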

written 4 weeks ago by piyushjo

I am familiar with WGCNA; which function are you referring to? I only know WGCNA for building a coexpression network. My cohort is now about 5 datasets. Each dataset has more than 3 iPSC samples, so I think it is quite enough. One of them also comes from a different cell type: most are iPSCs derived from fibroblasts, and one is from cord blood. If my hypothesis is correct, then even if iPSCs come from different cell types, they will still show some similar gene expression related to pluripotency.

Reply written 27 days ago by bharata1803

Follow the example in the link below. It will help you get correlated gene cohorts and also relate them to traits. There are also tutorials at the first link (ucla.edu), but I find the tutorial below much easier to understand.

https://wikis.utexas.edu/display/bioiteam/Clustering+using+WGCNA

Reply written 27 days ago by piyushjo