Question: How can I find the relationship between the gene expression data of various groups?
4 weeks ago, ishackm wrote:

Hi all,

I have a GEOquery dataset that has several groups (A, B, C, D, E), all with gene expression data. How can I examine the relationship between them in R?

Many thanks,


Tags: expression, R, gene
4 weeks ago, Kevin Blighe (Republic of Ireland) wrote:

Here are some practical steps to help you on your journey:

  1. From the main GEO accession page, click on the blue Analyze with GEO2R button, then click on the R script tab. There, you will find R code to download the normalised expression data as an ExpressionSet object.
  2. Install and load the limma package.
  3. Using the sample metadata, build a model matrix that defines the group to which each sample in the expression data belongs.
  4. Fit the model to the data and moderate the statistics using the empirical Bayes method.
  5. Conduct the pairwise comparisons of interest and generate a results table for each.
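
Steps 2 to 5 can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the expression matrix is simulated here so that the code runs on its own, and the names `expr`, `groups`, `BvsA` and `CvsA` are made up for the example. With real data, the matrix and the group labels would come from the ExpressionSet that the GEO2R script downloads (for example via `exprs()` and `pData()` from Biobase).

```r
# Minimal sketch of a limma workflow on SIMULATED data (not a real GEO series).
library(limma)

set.seed(1)

# Hypothetical stand-in for the normalised log2 expression matrix:
# 200 genes x 15 samples, 3 samples in each of the groups A-E.
groups <- factor(rep(c("A", "B", "C", "D", "E"), each = 3))
expr <- matrix(rnorm(200 * 15, mean = 8, sd = 0.5), nrow = 200,
               dimnames = list(paste0("gene", 1:200),
                               paste0("sample", 1:15)))
# Shift the first 10 genes upwards in group B so there is a signal to find.
expr[1:10, groups == "B"] <- expr[1:10, groups == "B"] + 3

# Step 3: model matrix with one column per group (no intercept).
design <- model.matrix(~ 0 + groups)
colnames(design) <- levels(groups)

# Step 4: fit a linear model to every gene.
fit <- lmFit(expr, design)

# Step 5: define the pairwise comparisons of interest, re-fit on those
# contrasts, and moderate the statistics by empirical Bayes.
contrast_matrix <- makeContrasts(BvsA = B - A, CvsA = C - A, levels = design)
fit2 <- eBayes(contrasts.fit(fit, contrast_matrix))

# Results table for one comparison, sorted by evidence of differential
# expression, with Benjamini-Hochberg adjusted p-values.
res_BvsA <- topTable(fit2, coef = "BvsA", number = Inf, adjust.method = "BH")
head(res_BvsA)
```

A no-intercept design (`~ 0 + groups`) is used here because it gives one coefficient per group, which makes each pairwise contrast easy to write down. Other parameterisations are possible.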

You will find many examples of this process on this and other websites, such as the Bioconductor support forum.

Finally, all of this information is in the limma manual.



Thanks a lot, Kevin. Is it also possible to visualise the relationship with this data?

Reply written 4 weeks ago by ishackm

Sure, there are many plots to explore:

  • histograms of expression data (for normalised, log-scale data, the distribution should roughly follow a 'bell curve')
  • hierarchical clustering and dendrogram of expression data (this allows you to look for 'unbiased' relationships between your samples / groups)
  • PCA bi-plots
  • MA plots (mean expression versus log [base 2] fold change)
  • Volcano plots (log [base 2] fold change versus negative log [base 10] p-value)
  • hierarchical clustering, dendrogram, and heatmap of your expression data filtered for statistically significant genes (this will help you to determine whether or not the selected genes genuinely segregate the samples / groups via clustering)
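
As a rough starting point, several of these plots can be produced with base R alone. The sketch below again uses simulated data so that it runs on its own. The names `expr` and `groups` are made up for the example, and the 20 'selected' genes are simply the ones that were shifted in the simulation, not the output of a real statistical test.

```r
# Minimal sketch of exploratory plots in base R on SIMULATED log2 data.
set.seed(1)

groups <- factor(rep(c("A", "B", "C", "D", "E"), each = 3))
expr <- matrix(rnorm(200 * 15, mean = 8, sd = 0.5), nrow = 200,
               dimnames = list(paste0("gene", 1:200),
                               paste0(groups, "_", rep(1:3, times = 5))))
# Shift 20 genes upwards in group B so the groups are not all identical.
expr[1:20, groups == "B"] <- expr[1:20, groups == "B"] + 3

# Histogram of all expression values.
hist(expr, breaks = 50, main = "Distribution of expression values",
     xlab = "log2 expression")

# Hierarchical clustering of samples (Euclidean distance, average linkage).
hc <- hclust(dist(t(expr)), method = "average")
plot(hc, main = "Sample dendrogram", xlab = "", sub = "")

# PCA of samples: samples must be in rows, hence the transpose.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
plot(pca$x[, 1], pca$x[, 2], col = as.integer(groups), pch = 19,
     xlab = sprintf("PC1 (%.1f%%)", 100 * var_explained[1]),
     ylab = sprintf("PC2 (%.1f%%)", 100 * var_explained[2]),
     main = "PCA of samples")
legend("topright", legend = levels(groups),
       col = seq_along(levels(groups)), pch = 19)

# Heatmap with row and column dendrograms, restricted to a gene subset.
# Here the subset is the 20 simulated genes; with real data it would be
# the statistically significant genes from the results table.
heatmap(expr[1:20, ], scale = "row", main = "Selected genes")
```

MA and volcano plots need per-gene statistics rather than the raw matrix. Given a limma results table (called `res` here purely for illustration), its `AveExpr`, `logFC` and `P.Value` columns supply the axes, for example `plot(res$AveExpr, res$logFC)` and `plot(res$logFC, -log10(res$P.Value))`.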

Here, I am merely mentioning the terms that are commonly used, so that you can then search for these and be sure that they are the standard things to use in this field.

Here are some posts to get you started:

Reply written 4 weeks ago by Kevin Blighe

Thanks a lot once again, Kevin. It's much clearer now.

Reply written 4 weeks ago by ishackm