Question: Effect of gene set size on Gene Ontology semantic similarity score
10 weeks ago
Hello everyone,

My name is Ravi and I am a doctoral student studying the biological processes in human ageing. We recently decided to add a bioinformatic analysis to this work. I am trying to understand how gene set size affects the GO semantic similarity score computed with the R package 'GOSemSim'.

I have a fixed data set containing about 2000 genes, labelled TraitA.

I compute the semantic similarity between TraitA and several other traits, each labelled Trait_Random. A Trait_Random set can contain anywhere from 10 to 2000 genes.

How does this difference in gene set size affect the score that I get?

Also, if the score is biased by set size, is there a statistical method I could use to correct for it?
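
One generic way to check for a size effect empirically is to compare each observed score against scores from random gene sets of the same size. The sketch below is only an illustration of that idea in Python, not part of GOSemSim: `size_matched_null`, `empirical_p`, `z_score` and `toy_score` are hypothetical names, and `toy_score` (a plain Jaccard overlap) is a stand-in for whatever function returns the real semantic similarity score.

```python
import random
from statistics import mean, pstdev

def size_matched_null(score_fn, trait_a, universe, set_size, n_perm=200, seed=0):
    """Score trait_a against n_perm random gene sets of the given size."""
    rng = random.Random(seed)
    pool = sorted(universe)  # fixed order so the seed gives reproducible draws
    return [score_fn(trait_a, rng.sample(pool, set_size)) for _ in range(n_perm)]

def empirical_p(observed, null_scores):
    """One-sided empirical p-value with the usual +1 correction."""
    hits = sum(s >= observed for s in null_scores)
    return (hits + 1) / (len(null_scores) + 1)

def z_score(observed, null_scores):
    """Observed score expressed in standard deviations of the size-matched null."""
    sd = pstdev(null_scores)
    return (observed - mean(null_scores)) / sd if sd > 0 else float("nan")

def toy_score(set_a, set_b):
    """Placeholder score (Jaccard overlap); replace with the real similarity call."""
    a, b = set(set_a), set(set_b)
    return len(a & b) / len(a | b)

# Toy data: a 500-gene universe and a 200-gene reference set (assumed values).
universe = [f"g{i}" for i in range(500)]
trait_a = universe[:200]

null_small = size_matched_null(toy_score, trait_a, universe, set_size=10)
null_large = size_matched_null(toy_score, trait_a, universe, set_size=200)

# If the two null means differ, raw scores are not comparable across set sizes
# for this score function, and a z-score or empirical p-value is safer to report.
print(mean(null_small), mean(null_large))
```

With the placeholder score the two null means differ clearly, which is exactly the kind of size dependence this check is meant to reveal. Whether the same happens with a GO semantic similarity score has to be tested with the real scoring function and a sensible background gene universe.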

Any thoughts or inputs on this would be very helpful. Thank you so much for your time.

Guangchuang Yu (1.4k)
China/Hong Kong/The University of Hong Kong
There should not be a bias from gene set size. Please refer to the vignette, which describes the calculation in detail.
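
For readers who want a feel for the calculation, here is a minimal Python sketch of a best-match average (BMA), which to my understanding is one of the strategies the vignette describes for combining pairwise similarities. The function name `bma` and the example matrix are made up for illustration, and this is not the package's own code.

```python
def bma(sim):
    """Best-match average of a pairwise similarity matrix given as a list of rows.

    Rows index the items of one set, columns the items of the other.
    """
    row_best = [max(row) for row in sim]        # best match for each row item
    col_best = [max(col) for col in zip(*sim)]  # best match for each column item
    return (sum(row_best) + sum(col_best)) / (len(row_best) + len(col_best))

# Hypothetical 2 x 3 similarity matrix (2 items in one set, 3 in the other).
sim = [
    [1.0, 0.2, 0.4],
    [0.3, 0.8, 0.1],
]
print(round(bma(sim), 3))  # 0.8
```

Because the sum of best matches is divided by the number of rows plus the number of columns, the result is an average rather than a total, so it does not grow just because one set has more entries.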

10 weeks ago by Guangchuang Yu
