Question

How can I find the relationship between gene expression data from various groups?

0

5.6 years ago

ishackm ▴ 110

Hi all,

I have a GEOquery dataset with several groups (A, B, C, D, E), all of which have gene expression data. How can I examine the relationship between them in R?

Many thanks,

Ish

R gene expression • 1.1k views

ADD COMMENT • link updated 5.6 years ago by Kevin Blighe 88k • written 5.6 years ago by ishackm ▴ 110

score 0 · Answer 1 · 2018-12-15

0

5.6 years ago

Kevin Blighe 88k

Here are some practical steps to help you on your journey:

From the main GEO accession page, click the blue 'Analyze with GEO2R' button, then click the 'R script' tab. There, you will find R code to download the normalised expression data as an ExpressionSet object.
Install and load the limma package.
Using the sample metadata, build a model matrix that defines the group to which each sample in the expression data belongs.
Fit the model to the data and adjust the statistics using the empirical Bayes method.
Conduct the pairwise comparisons of interest and generate results tables.
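
The limma steps above can be sketched roughly as follows. This is a minimal illustration on simulated data, not a real GEO series: the matrix `expr`, the factor `groups`, the spiked-in difference, and the two contrasts are all assumptions made for the example. With a real dataset, the expression matrix would typically come from the ExpressionSet produced by the GEO2R script, and the group labels from its sample metadata.

```r
# Sketch of the limma workflow on SIMULATED data (hypothetical, not a GEO series).
library(limma)

set.seed(1)

# Assumed layout: 5 groups (A-E), 3 samples each, 200 genes on a log2 scale.
groups <- factor(rep(c("A", "B", "C", "D", "E"), each = 3))
expr <- matrix(rnorm(200 * length(groups), mean = 8, sd = 1), nrow = 200)
rownames(expr) <- paste0("gene", seq_len(nrow(expr)))
colnames(expr) <- paste0(groups, "_", rep(1:3, times = 5))

# Spike a difference into the first 10 genes of group B so there is something to find.
expr[1:10, groups == "B"] <- expr[1:10, groups == "B"] + 3

# Model matrix with one column per group (no intercept).
design <- model.matrix(~ 0 + groups)
colnames(design) <- levels(groups)

# Fit a linear model per gene.
fit <- lmFit(expr, design)

# Define the pairwise comparisons of interest (two shown; add others as needed).
contrast_matrix <- makeContrasts(
  BvsA = B - A,
  CvsA = C - A,
  levels = design
)

# Re-fit for the contrasts, then moderate the statistics by empirical Bayes.
fit2 <- contrasts.fit(fit, contrast_matrix)
fit2 <- eBayes(fit2)

# Results table for one comparison, with Benjamini-Hochberg adjusted p-values.
results_BvsA <- topTable(fit2, coef = "BvsA", number = Inf, adjust.method = "BH")
head(results_BvsA)
```

Each further comparison (for example D versus E) is one more line in `makeContrasts()` and one more `topTable()` call with the matching `coef`.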

You will find many examples of this process on this and other websites, such as the Bioconductor support forum.

Finally, all of this information is in the limma manual.

Kevin

ADD COMMENT • link 5.6 years ago by Kevin Blighe 88k

0

Thanks a lot, Kevin. Is it also possible to visualise the relationships in this data?

ADD REPLY • link 5.6 years ago by ishackm ▴ 110

1

Sure, there are many plots to explore:

histograms of expression data (the distribution should follow a 'bell curve')
hierarchical clustering and dendrogram of expression data (this allows you to look for 'unbiased' relationships between your samples / groups)
PCA bi-plots
MA plots (mean expression versus log [base 2] fold change)
Volcano plots (log [base 2] fold change versus negative log [base 10] p-value)
hierarchical clustering, dendrogram, and heatmap of your expression data filtered for statistically significant genes (this will help you to determine whether or not the selected genes genuinely segregate the samples/groups via clustering)
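
As a rough sketch, the plots listed above could be produced along these lines. Everything here runs on simulated data: the two-group layout, the object names, the spiked-in genes, and the 0.05 cut-off are assumptions for illustration only. The PCA step draws a plain scores plot of the samples rather than a full bi-plot, and the clustering method is just one reasonable choice.

```r
# Sketch of common exploratory plots on SIMULATED data (hypothetical example).
library(limma)

set.seed(2)

# Assumed layout: 2 groups, 4 samples each, 500 genes on a log2 scale.
groups <- factor(rep(c("A", "B"), each = 4))
expr <- matrix(rnorm(500 * 8, mean = 8, sd = 1), nrow = 500,
               dimnames = list(paste0("gene", 1:500),
                               paste0(groups, "_", rep(1:4, times = 2))))
# Spike a difference into the first 25 genes of group B.
expr[1:25, groups == "B"] <- expr[1:25, groups == "B"] + 3

# 1. Histogram of all expression values.
hist(expr, breaks = 50, main = "Expression distribution", xlab = "log2 expression")

# 2. Hierarchical clustering of samples (Euclidean distance, average linkage).
sample_tree <- hclust(dist(t(expr)), method = "average")
plot(sample_tree, main = "Sample clustering")

# 3. PCA on samples; plot the first two components coloured by group.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)
plot(pca$x[, 1], pca$x[, 2], col = as.integer(groups), pch = 19,
     xlab = "PC1", ylab = "PC2", main = "PCA of samples")
legend("topright", legend = levels(groups),
       col = seq_along(levels(groups)), pch = 19)

# 4. Fit with limma; with an intercept, the second coefficient is B versus A.
design <- model.matrix(~ groups)
fit <- eBayes(lmFit(expr, design))

# MA plot: average log-expression versus log2 fold change.
plotMA(fit, coef = 2, main = "MA plot: B vs A")

# Volcano plot: log2 fold change versus -log10 p-value, top 10 genes labelled.
volcanoplot(fit, coef = 2, highlight = 10, names = rownames(expr),
            main = "Volcano plot: B vs A")

# 5. Heatmap restricted to genes passing an adjusted p-value cut-off.
top <- topTable(fit, coef = 2, number = Inf, adjust.method = "BH")
sig_genes <- rownames(top)[top$adj.P.Val < 0.05]
if (length(sig_genes) >= 2) {
  heatmap(expr[sig_genes, ], scale = "row", margins = c(6, 6))
}
```

With more than two groups, the clustering and PCA steps stay the same, while the MA plot, volcano plot, and filtered heatmap are drawn once per comparison of interest.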

Here, I am merely mentioning the terms that are commonly used, so that you can then search for these and be sure that they are the standard things to use in this field.

Here are some posts to get you started:

ADD REPLY • link 5.6 years ago by Kevin Blighe 88k

1

Thanks a lot once again, Kevin. It's much clearer now.

ADD REPLY • link 5.6 years ago by ishackm ▴ 110