Question

K-means clustering of differentially expressed genes based on GO

9.4 years ago

Atefeh Mahdavi ▴ 10

Hi everybody!

I have transcript clusters (from hierarchical clustering) of differentially expressed genes generated by the Trinity pipeline. However, I don't have any information about the GO terms of the genes in each cluster, and I don't even know how many genes belong to each cluster.

Therefore, I wonder if anybody has an idea of how I can do k-means clustering of the differentially expressed genes based on their GO terms.

I really appreciate your help.

next-gen RNA-Seq • 4.4k views

Updated 2.1 years ago by Ram 43k • written 9.4 years ago by Atefeh Mahdavi ▴ 10

Do you have GO annotation for your transcriptome assembly at all? If not, you need to run an annotation pipeline first. For GO annotation, see "Annotating sequences after de-novo Trinity assembly and RSEM analysis...there must be an easier way!" or maybe "Transcriptome Analysis with only a fasta file".

You can't do k-means clustering of GO terms, because there is no Euclidean metric for GO terms. K-means has to average the members of a cluster to get a centroid, and a DAG is not a vector space: what is the centroid of "hydrolysis" and "DNA-repair"?

See instead: "Clustering Go Terms?" or "Clustering Genes Based On Gene Ontology", and ftp://ftp.geneontology.org/go/www/GO.tools.microarray.shtml
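To make the centroid problem concrete, here is a minimal sketch of one common workaround: cluster on a precomputed distance matrix and use medoids (actual genes) instead of centroids. Everything in it is an assumption for illustration. The gene names and their GO term sets are toy data, and Jaccard distance on GO term sets is only a crude stand-in for a proper GO semantic similarity measure. It is plain Python, not the API of any GO tool.

```python
from itertools import combinations

# Hypothetical toy annotation: gene -> set of GO term IDs (illustrative only).
annotations = {
    "geneA": {"GO:0006281", "GO:0006974"},
    "geneB": {"GO:0006281", "GO:0006974", "GO:0006310"},
    "geneC": {"GO:0016787", "GO:0008152"},
    "geneD": {"GO:0016787"},
}


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; a crude stand-in for a GO semantic distance."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(ann):
    """Return the sorted gene list and a nested dict of pairwise distances."""
    genes = sorted(ann)
    dist = {g: {h: 0.0 for h in genes} for g in genes}
    for g, h in combinations(genes, 2):
        dist[g][h] = dist[h][g] = jaccard_distance(ann[g], ann[h])
    return genes, dist


def k_medoids(genes, dist, k, max_iter=100):
    """Alternate between assigning genes to the nearest medoid and
    re-picking each cluster's medoid (the member with the smallest
    total distance to the other members)."""
    medoids = genes[:k]  # deterministic start; real use would try several starts
    clusters = {}
    for _ in range(max_iter):
        clusters = {m: [] for m in medoids}
        for g in genes:
            nearest = min(medoids, key=lambda m: dist[g][m])
            clusters[nearest].append(g)
        new_medoids = [
            min(members, key=lambda c: sum(dist[c][o] for o in members))
            if members else m  # keep the old medoid if its cluster is empty
            for m, members in clusters.items()
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return clusters


genes, dist = distance_matrix(annotations)
clusters = k_medoids(genes, dist, k=2)
for medoid, members in clusters.items():
    print(medoid, "->", sorted(members))
```

On this toy data the two DNA-repair genes end up in one cluster and the two hydrolase genes in the other. The point is only that the algorithm never needs an "average GO term": each cluster is represented by one of its own genes.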

Updated 2.1 years ago by Ram 43k • written 9.4 years ago by Michael 54k

Ram · Answer 1 · 2014-12-22

9.4 years ago

Michael 54k

The Bioconductor package GOsim contains methods for comparing genes and sets of genes based on their functional annotation, e.g. getGeneSim to compute the functional similarity between genes, or clusterEvaluation to compute cluster quality scores for existing clusters (e.g. from k-means).
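As a rough illustration of what scoring existing clusters against a functional similarity can look like, here is a small sketch that computes silhouette widths from a precomputed similarity matrix. It is a generic silhouette calculation in plain Python, not GOsim's API or its exact scoring. The similarity values and cluster labels are made up, and converting similarity to distance as `1 - similarity` is an assumption that only makes sense for similarities scaled to [0, 1].

```python
def silhouette_scores(dist, labels):
    """Per-item silhouette widths from a precomputed distance matrix.

    dist   : square list of lists, dist[i][j] = distance between items i and j
    labels : cluster label for each item (e.g. from an earlier clustering)
    """
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the item's own cluster
        a = sum(dist[i][j] for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Made-up pairwise functional similarities for four genes, scaled to [0, 1].
similarity = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
dist = [[1.0 - s for s in row] for row in similarity]

good = silhouette_scores(dist, labels=[0, 0, 1, 1])
bad = silhouette_scores(dist, labels=[0, 1, 0, 1])
print("mean silhouette, sensible grouping:", sum(good) / len(good))
print("mean silhouette, shuffled grouping:", sum(bad) / len(bad))
```

Values near 1 mean a gene is much closer to its own cluster than to any other; values near or below 0 suggest the grouping does not reflect the functional similarity.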

Updated 2.1 years ago by Ram 43k • written 9.4 years ago by Michael 54k

Thank you very much for your reply and help, Michael!

I have GO annotation results generated through Trinotate. However, I am also running Blast2GO now.

Is there an easier alternative to Bioconductor? I don't know anything about R programming. I found the DNASTAR software quite friendly for those who don't know programming, but it has to be purchased.

Merry Christmas and happy new year!

Updated 2.1 years ago by Ram 43k • written 9.4 years ago by Atefeh Mahdavi ▴ 10