Question: A good pipeline for GO term analysis on RNA-seq Clusters in R?
dtatarak10 wrote:

Hi all,

I'm a relative newcomer to RNA-seq analysis, and I am now at the point where I want to do GO term analysis on my dataset.

I have done hierarchical clustering of my dataset consisting of 800 differentially expressed genes from Zebrafish samples. I have identified clusters that are interesting based on their expression patterns, and I now want to look at the gene ontology within these clusters.

I have looked at several R packages for GO term analysis online including clusterProfiler and GOexpress. But the documentation leaves something to be desired for an R newbie like myself. Does anyone have a suggestion for a GO term analysis pipeline they have used in R? Thank you very much!

Best,
David Tatarakis

rna-seq go terms R • 2.4k views
modified 20 months ago by caggtaagtat930 • written 20 months ago by dtatarak10

Could you please share your RNA-seq pipeline, with commands (DESeq2 and hierarchical clustering), with me? I am also a newcomer and am trying to analyze RNA-seq data from zebrafish. Thank you in advance.

written 20 months ago by rminhas0
Kevin Blighe54k wrote:

Coincidence, but my recommendation for you, David, is to use DAVID. That's an acronym for Database for Annotation, Visualization and Integrated Discovery. It is quite possibly the easiest tool to use for someone just starting out with gene enrichment. To help, I've even shown how one can do enrichment in my tutorial here: Clustering of DAVID gene enrichment results from gene expression studies

There are many other tools out there, but their implementation can be tricky due to annotation issues. With DAVID, you can have your genes in various annotation formats, as you'll see, and it will even attempt to automatically identify the annotation format for you, if you wish.

Kevin

modified 3 days ago • written 20 months ago by Kevin Blighe54k

DAVID is definitely good, but is there a way to present the results graphically instead of the standard tables?

written 4 days ago by Arindam Ghosh200

I had a tutorial on Biostars previously, specifically for how to plot the results of DAVID as a heatmap; however, as new package versions were released, the tutorial fell into disrepair.

Essentially, you could create a gene X GO term [or KEGG pathway, etc] binary matrix, and shade cells in the heatmap white for 0, and green or any other colour for 1.
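As a rough sketch of that idea in base R: the `david` data frame below is a made-up stand-in for an exported DAVID chart, and the `Term` / `Genes` column names (with comma-separated gene IDs) are an assumption about how your export is laid out, so adjust them to match your own file.

```r
# Hypothetical stand-in for a DAVID export: one row per enriched term,
# with the member genes as a comma-separated string (assumed layout).
david <- data.frame(
  Term  = c("GO:0006955~immune response",
            "GO:0007049~cell cycle",
            "GO:0006915~apoptotic process"),
  Genes = c("geneA, geneB, geneC",
            "geneB, geneD",
            "geneA, geneD, geneE"),
  stringsAsFactors = FALSE
)

# Split each gene string into a character vector of gene IDs.
gene_lists <- strsplit(david$Genes, ",\\s*")
all_genes  <- sort(unique(unlist(gene_lists)))

# Build the gene x term binary matrix: 1 if the gene belongs to the term.
mat <- sapply(gene_lists, function(g) as.integer(all_genes %in% g))
dimnames(mat) <- list(all_genes, david$Term)

# White for 0, green for 1; no clustering or scaling so the cells stay binary.
heatmap(mat, Rowv = NA, Colv = NA, scale = "none",
        col = c("white", "forestgreen"), margins = c(14, 6))
```

Dropping `Rowv = NA, Colv = NA` lets `heatmap()` cluster genes and terms by shared membership, which can help when there are many of each.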

written 3 days ago by Kevin Blighe54k

I used PANTHER to identify enriched biological processes in ~4000 genes and obtained ~500 "GO biological process complete" terms at FDR < 0.05. I guess that many terms would not work well in a heatmap. REVIGO treemaps helped reduce the redundancy, though, and made a good figure for publication.

written 3 days ago by Arindam Ghosh200

You could just plot the top 20 terms as a barplot, ranked by -log10(FDR)?

Here, I use base R: A: DAVID functional Analysis and its visualization of GO terms using Bar plot

Using ggplot2 would be nicer, though
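For anyone who wants a starting point, here is a minimal base R sketch of the top-20 barplot. The `res` table and its `Term` / `FDR` columns are invented for illustration, not real enrichment output; swap in your own results table.

```r
# Hypothetical enrichment results: 30 made-up terms with made-up FDR values.
res <- data.frame(
  Term = paste("term", 1:30),
  FDR  = 10^(-seq(1, 15, length.out = 30)),
  stringsAsFactors = FALSE
)

# Larger -log10(FDR) means stronger enrichment.
res$score <- -log10(res$FDR)

# Keep the 20 most significant terms, best first.
top <- head(res[order(res$score, decreasing = TRUE), ], 20)

# Horizontal bars; reverse the order so the best term is drawn at the top.
op <- par(mar = c(4, 10, 2, 1))  # widen the left margin for term labels
barplot(rev(top$score), names.arg = rev(top$Term),
        horiz = TRUE, las = 1, xlab = "-log10(FDR)")
par(op)
```

Real GO term names are much longer than these placeholders, so the left margin in `par(mar = ...)` will probably need to be larger.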

modified 3 days ago • written 3 days ago by Kevin Blighe54k
Zhilong Jia1.5k wrote:
  1. The ToppCluster web server, although it has not been working recently.
  2. Co-expressed gene set enrichment analysis with cogena, although this needs zebrafish GO gene sets as a GMT file.
  3. clusterProfiler. Its GO analyses (groupGO(), enrichGO() and gseGO()) support any organism that has an OrgDb object available, so it supports zebrafish. ref: http://bioconductor.org/packages/release/bioc/vignettes/clusterProfiler/inst/doc/clusterProfiler.html

In summary, clusterProfiler is probably the easiest route if you program. Otherwise, use the DAVID web server as recommended by @Kevin, analysing one cluster at a time.

Another related post: C: Compare sets of GO enrichments
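A minimal sketch of the per-cluster approach, under some assumptions: the expression matrix is simulated, `k = 4` is an arbitrary number of clusters, and `run_go()` is a hypothetical helper name. The `enrichGO()` call assumes the Bioconductor packages clusterProfiler and org.Dr.eg.db are installed and that your genes are zebrafish gene symbols (change `keyType` for Ensembl or Entrez IDs).

```r
# Simulated stand-in for a matrix of differentially expressed genes
# (rows = genes, columns = samples); replace with your own data.
set.seed(1)
expr <- matrix(rnorm(40 * 6), nrow = 40,
               dimnames = list(paste0("gene", 1:40), paste0("sample", 1:6)))

# Hierarchical clustering of genes, then cut the tree into k clusters.
hc <- hclust(dist(expr), method = "average")
cl <- cutree(hc, k = 4)  # k = 4 is arbitrary here

# One character vector of gene IDs per cluster.
genes_by_cluster <- split(names(cl), cl)

# Hypothetical helper: GO biological process enrichment for one gene vector.
# Needs clusterProfiler and org.Dr.eg.db; they are only loaded when called.
run_go <- function(genes) {
  clusterProfiler::enrichGO(
    gene          = genes,
    OrgDb         = org.Dr.eg.db::org.Dr.eg.db,
    keyType       = "SYMBOL",
    ont           = "BP",
    pAdjustMethod = "BH",
    pvalueCutoff  = 0.05
  )
}

# With real zebrafish symbols in place of the placeholder IDs above:
# go_results <- lapply(genes_by_cluster, run_go)
```

The placeholder IDs (`gene1`, `gene2`, ...) will not map to any GO annotation, which is why the final call is left commented out. You may also want to pass the set of all expressed genes via the `universe` argument so the background reflects your experiment.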

written 20 months ago by Zhilong Jia1.5k
caggtaagtat930 wrote:

For gene set enrichment analysis (GSEA), I use the R package "EGSEA". It combines 12 prominent GSEA algorithms available for R and obtains a consensus ranking of biologically relevant results.

The results can then be passed to REVIGO, for example, to visualize changes in GO families.

written 20 months ago by caggtaagtat930