Question: How to find the relationships between gene expression data of various groups?
ishackm wrote (4 months ago):

Hi all,

I have a dataset obtained with GEOquery, which has several groups (A, B, C, D, E). They all have gene expression data. How can I see the relationships between them in R?

Many Thanks,


Tags: expression, R, gene
Kevin Blighe wrote (4 months ago):

Here are some practical steps to help you on your journey:

  1. From the main GEO accession page, click the blue 'Analyze with GEO2R' button, then open the 'R script' tab. There you will find R code that downloads the normalised expression data as an ExpressionSet object
  2. Install and load the limma package
  3. Using the sample metadata, build a model matrix (design matrix) that defines the group to which each sample in the expression data belongs
  4. Fit the model to the data and moderate the statistics using the empirical Bayes method
  5. Conduct the pairwise comparisons of interest and generate results tables
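
As a rough illustration of steps 2 to 5, here is a minimal sketch. It assumes limma is already installed from Bioconductor, and it uses a simulated matrix in place of the expression values you would normally extract from the downloaded ExpressionSet. The group labels, the number of replicates and the two contrasts are made up for the example. Note that when contrasts are used, limma applies the empirical Bayes step after the contrasts have been set.

```r
library(limma)

set.seed(1)

# Simulated stand-in for the real expression matrix (assumption):
# 200 genes x 15 samples, 3 samples in each of groups A to E
groups <- factor(rep(c("A", "B", "C", "D", "E"), each = 3))
expr <- matrix(rnorm(200 * 15, mean = 8, sd = 1), nrow = 200,
               dimnames = list(paste0("gene", 1:200), paste0("s", 1:15)))
# Shift the first 10 genes upwards in group B so there is something to find
expr[1:10, groups == "B"] <- expr[1:10, groups == "B"] + 3

# Step 3: model matrix with one column per group (no intercept)
design <- model.matrix(~ 0 + groups)
colnames(design) <- levels(groups)

# Step 4: fit the linear model to every gene
fit <- lmFit(expr, design)

# Step 5: define pairwise comparisons, then moderate with empirical Bayes
contrast_matrix <- makeContrasts(BvsA = B - A, CvsA = C - A, levels = design)
fit2 <- eBayes(contrasts.fit(fit, contrast_matrix))

# Results table for one comparison, with Benjamini-Hochberg adjusted p-values
res_BvsA <- topTable(fit2, coef = "BvsA", number = Inf, adjust.method = "BH")
head(res_BvsA)
```

With real data, you would replace the simulated `expr` and `groups` with the expression matrix and the group annotation taken from your own dataset.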

You will find many examples of this process on this and other websites, such as the Bioconductor support forum.

Finally, all of this is covered in detail in the limma manual.



Thanks a lot, Kevin. Is it also possible to visualise the relationships in this data?

Reply written 4 months ago by ishackm

Sure, there are many plots to explore:

  • histograms of expression data (for normalised, log-transformed data, the distribution should roughly follow a 'bell curve')
  • hierarchical clustering and dendrogram of expression data (this allows you to look for 'unbiased' relationships between your samples / groups)
  • PCA bi-plots
  • MA plots (mean expression versus log [base 2] fold change)
  • Volcano plots (log [base 2] fold change versus negative log [base 10] p-value)
  • hierarchical clustering, dendrogram, and heatmap of your expression data filtered for statistically significant genes (this will help you to determine whether or not the selected genes genuinely segregate the samples / groups via clustering)
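
To give a feel for the unsupervised plots in the list above, here is a minimal base R sketch covering the histogram, the sample dendrogram and a PCA plot. The expression matrix is simulated, and the choice of correlation-based distance with average linkage is only one reasonable option among several. The PCA plot shown is a plain scores plot of the samples; base R's `biplot()` can be called on the same `prcomp` object if you want a true bi-plot.

```r
set.seed(1)

# Simulated stand-in for a normalised log2 expression matrix (assumption):
# 200 genes in rows, 15 samples in columns, 3 samples per group
groups <- factor(rep(c("A", "B", "C", "D", "E"), each = 3))
expr <- matrix(rnorm(200 * 15, mean = 8, sd = 1), nrow = 200,
               dimnames = list(paste0("gene", 1:200),
                               paste0(groups, "_", rep(1:3, times = 5))))

# Histogram of all expression values
hist(expr, breaks = 50, main = "Expression distribution",
     xlab = "log2 expression")

# Hierarchical clustering of samples, using 1 - Pearson correlation as distance
sample_dist <- as.dist(1 - cor(expr))
hc <- hclust(sample_dist, method = "average")
plot(hc, main = "Sample dendrogram")

# PCA of samples (prcomp expects samples in rows, hence the transpose)
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
plot(pca$x[, 1], pca$x[, 2], col = as.integer(groups), pch = 19,
     xlab = sprintf("PC1 (%.1f%%)", 100 * var_explained[1]),
     ylab = sprintf("PC2 (%.1f%%)", 100 * var_explained[2]),
     main = "PCA of samples")
legend("topright", legend = levels(groups),
       col = seq_along(levels(groups)), pch = 19)
```

If samples from the same group sit together in the dendrogram and in the PCA plot, that suggests the groups differ in their overall expression profiles.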

Here, I am merely mentioning the terms that are commonly used, so that you can then search for these and be sure that they are the standard things to use in this field.
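
For the MA and volcano plots, the axes can be computed directly, as in the sketch below. It uses simulated data for two groups and an ordinary per-gene t-test from base R as a simplified stand-in. In a real analysis, you would plot the log fold changes, average expression and p-values from the limma results table instead.

```r
set.seed(1)

# Simulated log2 expression (assumption): 200 genes, 3 samples in each of 2 groups
groups <- factor(rep(c("A", "B"), each = 3))
expr <- matrix(rnorm(200 * 6, mean = 8, sd = 1), nrow = 200,
               dimnames = list(paste0("gene", 1:200), NULL))
# Shift the first 10 genes upwards in group B
expr[1:10, groups == "B"] <- expr[1:10, groups == "B"] + 3

# On log2 data, the log2 fold change is the difference of the group means
log_fc   <- rowMeans(expr[, groups == "B"]) - rowMeans(expr[, groups == "A"])
ave_expr <- rowMeans(expr)

# Ordinary (unmoderated) t-test p-value for every gene
p_value <- apply(expr, 1, function(x) {
  t.test(x[groups == "B"], x[groups == "A"])$p.value
})

# MA plot: mean expression versus log2 fold change
plot(ave_expr, log_fc, pch = 20, main = "MA plot",
     xlab = "Mean log2 expression", ylab = "log2 fold change (B vs A)")
abline(h = 0, lty = 2)

# Volcano plot: log2 fold change versus -log10 p-value
plot(log_fc, -log10(p_value), pch = 20, main = "Volcano plot",
     xlab = "log2 fold change (B vs A)", ylab = "-log10 p-value")
abline(v = c(-1, 1), lty = 2)
```

The dashed lines at a log2 fold change of -1 and 1 are only an illustrative cut-off, not a recommended threshold.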

Here are some posts to get you started:

Reply written 4 months ago by Kevin Blighe

Thanks a lot once again, Kevin. It's much clearer now.

Reply written 4 months ago by ishackm