Question: GSVA Analysis Ouput Array gives lesser gene sets than original gene set array
0
gravatar for amit15013
29 days ago by
amit150130
amit150130 wrote:

Hi

I am doing gsva analysis on pathways gene sets which consists of 186 gene set. When i run gsva analysis on my dataset it returns only 168 gene sets. How do i get to know which gene set has been returned as there is not naming in the final output array.

Thank you

gsva • 104 views
ADD COMMENTlink written 29 days ago by amit150130

What software/package are you using?

ADD REPLYlink written 29 days ago by Pietro50

Ciao Pietro. S/He is using GSVA: https://bioconductor.org/packages/release/bioc/html/GSVA.html

amit15013, please show all of the commands that you are using.

ADD REPLYlink modified 29 days ago • written 29 days ago by Kevin Blighe41k

Hi

This is the code i am using

exprs <- as.matrix(read.table("~/velocyto/velocity_matrix_output.csv", header=TRUE, sep=",",row.names=1,as.is=TRUE))
gsva_set<- ExpressionSet(assayData=exprs)
our_genes<-GSA.read.gmt("~/velocyto/c2.cp.kegg.v6.2.symbols.gmt")
gsva_analysis <- gsva(gsva_set@assayData$exprs,our_genes$genesets,min.sz=0, max.sz=500, verbose=TRUE)

This returns only 168 gene sets and not 186. Also it doesn't return the names of the gene sets which are processed.

Thank you

ADD REPLYlink modified 29 days ago by Kevin Blighe41k • written 29 days ago by amit150130

our_genes should be a named list, right? - are the names assigned?

Even though you have min.sz=0, I think that it is possible that GSVA will not return gene sets that do not have any match genes - not 100% sure, though.

ADD REPLYlink written 29 days ago by Kevin Blighe41k

Yes, our_genes prints the names of the pathways as well as the genes present in it.

True, even though it has min.sz=0 it is returnig only 168 gene set and not 186.

Thank you

ADD REPLYlink written 29 days ago by amit150130

Best thing, then, is to infer the ones that are not returned and then investigate why. Not sure about the names issued - I recently used my own curated gene set with GSVA and it returned names.

ADD REPLYlink written 29 days ago by Kevin Blighe41k

can you please show your code. It would be really helpful

ADD REPLYlink written 29 days ago by amit150130

Hey, I had to wait until I got home. I just checked and, indeed, the names are being carried through. Take a look:

names(genesets)
 [1] "C1_mean_UMI"  "C2_mean_UMI"  "C3_mean_UMI"  "C4_mean_UMI"  "C5_mean_UMI" 
 [6] "C6_mean_UMI"  "C7_mean_UMI"  "C8_mean_UMI"  "C9_mean_UMI"  "C10_mean_UMI"
 ...

topMatrixGSVA <- gsva(vstMatrix, genesets, method="gsva", min.sz=5, max.sz=999999, abs.ranking=FALSE, verbose=TRUE)
Estimating GSVA scores for 29 gene sets.
Computing observed enrichment scores
Estimating ECDFs with Gaussian kernels
Using parallel with 4 cores
  |=============================================================         |  88%

rownames(topMatrixGSVA)
 [1] "C1_mean_UMI"  "C2_mean_UMI"  "C3_mean_UMI"  "C4_mean_UMI"  "C5_mean_UMI" 
 [6] "C6_mean_UMI"  "C7_mean_UMI"  "C8_mean_UMI"  "C9_mean_UMI"  "C10_mean_UMI"
 ...

Are you sure that your gene-sets are in a named list? Which version of GSVA are you using? - I have GSVA_1.30.0

ADD REPLYlink written 28 days ago by Kevin Blighe41k

Maybe some gene sets are larger than 500 genes (max.sz=500)?

ADD REPLYlink written 29 days ago by h.mon24k

No i checked that. Max size is 93.

ADD REPLYlink written 28 days ago by amit150130
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 825 users visited in the last hour