Question

GSVA Error: "Less than two genes in the input expression data matrix"

4.8 years ago

ivykosater • 0

I have a matrix of gene expression data for multiple samples from TCGA which I converted into a matrix from a GCT file.

# Convert to R data type

expr <- read.csv("BRCA_dataset2.gct", sep = "\t", header=FALSE)

# Clean up the data
expr <- expr[-2,]
expr <- expr[-1,]
expr <- janitor::row_to_names(expr, row_number = 1)

Next, I read in the list of genes for my gene set as follows:

anti.apop <- read.delim(file.path("anti-apop singh.txt"))[, 1]

# Convert to list object
anti <- as.list(anti.apop)

Now, when I try to run gsva, I get the following error:

> gsva.out <- gsva(expr, anti)
Error in .local(expr, gset.idx.list, ...) : 
  Less than two genes in the input expression data matrix
In addition: There were 50 or more warnings (use warnings() to see the first 50)

Why does it say there are fewer than two genes in the expression data matrix? I searched the expression matrix for every gene in the gene list, and they are all there. Why isn't GSVA recognizing them?

GSVA RNA-Seq R GSEA • 3.0k views

Is there a reason you're using read.csv with tab-separated content instead of read.table? Also, please show us the first few rows of expr.

ADD REPLY • link 4.8 years ago by Ram 45k
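
For reference, here is a minimal sketch of reading a GCT-style file with read.table. It assumes the usual GCT layout (a version line, a dimensions line, then a header row starting with Name and Description). The temporary file, its values, and the names gct_path, gct and expr_mat are made up for illustration and are not taken from the original data.

```r
# Write a tiny GCT-style file so the sketch is self-contained (made-up values).
gct_path <- tempfile(fileext = ".gct")
writeLines(c(
  "#1.2",
  "3\t2",
  "Name\tDescription\tTCGA-C8-A1HM-01\tTCGA-D8-A1XQ-01",
  "RAB4B\tENSG00000167578.15\t4.22\t3.06",
  "TIGAR\tENSG00000078237.5\t3.45\t3.20",
  "RNF44\tENSG00000146083.10\t14.53\t6.74"
), gct_path)

# skip = 2 drops the version and dimensions lines, so the third line
# becomes the header; check.names = FALSE keeps the sample IDs unchanged.
gct <- read.table(gct_path, sep = "\t", header = TRUE, skip = 2,
                  check.names = FALSE, stringsAsFactors = FALSE,
                  quote = "", comment.char = "")

# Drop the two annotation columns and keep the samples as a numeric matrix,
# with gene symbols as row names.
expr_mat <- as.matrix(gct[, -(1:2)])
rownames(expr_mat) <- gct$Name

dim(expr_mat)
is.numeric(expr_mat)
```

Because the header is parsed separately from the data rows, the expression columns come in as numbers rather than strings. Real TCGA data may contain duplicated gene symbols, which this sketch does not handle.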

No particular reason. Here are the first few rows and columns of expr:

 expr
      Name                        Description            TCGA-C8-A1HM-01         TCGA-D8-A1XQ-01        
4     "LINC02082"                 "ENSG00000242268.2"    "0.0"                   "0.0"                  
5     "AC090241.2"                "ENSG00000270112.3"    "0.0"                   "0.0"                  
6     "RAB4B"                     "ENSG00000167578.15"   "4.22031614394"         "3.0645541802499996"   
7     "ENSG00000273842"           "ENSG00000273842.1"    "0.0"                   "0.0"                  
8     "TIGAR"                     "ENSG00000078237.5"    "3.45460984296"         "3.20140628788"        
9     "RNF44"                     "ENSG00000146083.10"   "14.5327811455"         "6.7434959695500005"   
10    "NUP210P2"                  "ENSG00000225275.4"    "0.0"                   "0.0"

ADD REPLY • link 4.8 years ago by ivykosater • 0

In that case, I'd recommend using read.table. Code is already difficult to remember and comprehend, and choices such as these will add to the confusion.

What does anti.apop look like? What is dim(expr) once you're done cleaning up expr?

ADD REPLY • link 4.8 years ago by Ram 45k
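
The checks asked for above can be sketched as follows. This is only an illustration on a toy stand-in for expr, under the assumption (based on the printed rows) that every column is character and the gene symbols sit in the Name column. As far as the GSVA documentation describes, gsva expects a numeric matrix with gene identifiers as row names, and gene sets as a list of character vectors. The gene list, the set name anti_apoptosis, and the values here are hypothetical.

```r
# Toy stand-in for the cleaned-up `expr`: all columns are character and
# gene symbols are in a column, not in the row names (made-up values).
expr <- data.frame(
  Name = c("RAB4B", "TIGAR", "RNF44"),
  Description = c("ENSG00000167578.15", "ENSG00000078237.5",
                  "ENSG00000146083.10"),
  `TCGA-C8-A1HM-01` = c("4.22", "3.45", "14.53"),
  `TCGA-D8-A1XQ-01` = c("3.06", "3.20", "6.74"),
  check.names = FALSE, stringsAsFactors = FALSE
)

dim(expr)            # rows x columns after clean-up
sapply(expr, class)  # every column is "character" in this toy example

# Rebuild as a numeric matrix: samples in columns, gene symbols as row names.
expr_mat <- as.matrix(expr[, -(1:2)])
storage.mode(expr_mat) <- "double"
rownames(expr_mat) <- expr$Name

# Hypothetical gene list standing in for anti.apop.
anti.apop <- c("RAB4B", "TIGAR")

# as.list() produces one single-gene element per gene ...
lengths(as.list(anti.apop))
# ... whereas a named list holding one character vector is one gene set.
gene_sets <- list(anti_apoptosis = anti.apop)

# How many genes of the set appear among the matrix row names?
sum(gene_sets$anti_apoptosis %in% rownames(expr_mat))
```

If the real expr looks like the toy one, the matrix passed to gsva has no usable gene identifiers or numeric values, which would be consistent with the error, though the output of dim(expr) and a look at anti.apop are still needed to confirm it.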