I am trying to run GSEA on my RNA-seq dataset using the tool provided by the Broad Institute, which I have downloaded from their webpage. As input I am using my expression dataset (including all genes, not only DEGs), with Gene Symbols as identifiers followed by the normalized counts for each sample. In addition to the expression dataset, I have generated a phenotype label .cls file, as required by the tool.
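For reference, this is roughly how a categorical phenotype label file can be produced. It is only a minimal sketch, assuming the three-line categorical .cls layout (counts line, class-name line, one label per sample); the function name format_cls and the sample labels are hypothetical and not part of GSEA.

```python
def format_cls(labels):
    """Return the text of a categorical .cls file for per-sample class labels.

    Assumed layout: line 1 is "<n_samples> <n_classes> 1",
    line 2 is "# <class names>", line 3 is one label per sample.
    """
    # Unique class names, kept in order of first appearance
    classes = list(dict.fromkeys(labels))
    header = f"{len(labels)} {len(classes)} 1"
    lines = [header, "# " + " ".join(classes), " ".join(labels)]
    return "\n".join(lines) + "\n"


# Hypothetical labels, listed in the same order as the expression columns
print(format_cls(["control", "control", "treated", "treated"]))
```

The labels have to follow the same order as the sample columns in the expression dataset, otherwise samples end up assigned to the wrong phenotype.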
In the "gene set database" I have selected the collections from h to c6, including only the files with ".all" in their names, in order to avoid duplication. Also, for the "permutation type" I have selected "gene_set".
When I try to run the GSEA analysis, I am uncertain what to select in the "Collapse" option. If I select "No_Collapse", then I get the following error message:
After pruning, none of the gene sets passed size thresholds.
If I instead select "Collapse", it requires me to select a "Chip platform", and I am very confused about what to choose. Since my expression dataset uses Gene Symbols as identifiers, I tried selecting "Human_Symbol_with_Remapping_MSigDB.v7.1.chip", but I get the following error:
The collapsed dataset was empty when used with chip:ftp.broadinstitute.org://pub...
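Both errors look like they could come from the identifiers in my dataset not matching the gene symbols in the gene sets, so one check is to count the overlap directly. The following is a minimal sketch, assuming a GMT gene set file (tab-separated: set name, description, then gene symbols) and size thresholds of 15 and 500, which I understand to be the tool's defaults; parse_gmt and sets_passing_size are hypothetical helper names, and the toy data is made up.

```python
def parse_gmt(lines):
    """Map gene set name -> set of gene symbols from GMT-formatted lines."""
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # fields[0] = set name, fields[1] = description, rest = genes
        if len(fields) >= 3:
            gene_sets[fields[0]] = {g for g in fields[2:] if g}
    return gene_sets


def sets_passing_size(dataset_ids, gene_sets, min_size=15, max_size=500):
    """Return {set name: overlap count} for sets whose overlap with the
    dataset identifiers falls within [min_size, max_size]."""
    ids = set(dataset_ids)
    kept = {}
    for name, genes in gene_sets.items():
        overlap = len(genes & ids)
        if min_size <= overlap <= max_size:
            kept[name] = overlap
    return kept


# Toy data for illustration only (thresholds lowered to fit the tiny example)
toy_gmt = ["SET_A\tdesc\tTP53\tBRCA1\tMYC", "SET_B\tdesc\tEGFR"]
toy_sets = parse_gmt(toy_gmt)
print(sets_passing_size(["TP53", "MYC", "GAPDH"], toy_sets, min_size=2))
print(sets_passing_size(["tp53", "myc", "gapdh"], toy_sets, min_size=2))
```

If almost no identifiers overlap (for example because of case differences, extra whitespace, or a misread header column), every gene set would be pruned, which would be consistent with the first error message.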
Any help would be greatly appreciated!