problem with RUVSeq and SVA
1
1
Entering edit mode
22 months ago
Will Yam ▴ 30

Hello,

I'm trying to remove unwanted variables using RUVSeq and SVA but i'm meeting an issue with both libraries. with RUVSeq I get this "list" error:

> for(k in 1:4) {
+   set_g <- RUVg(x = gg, cIdx = house_keeping_genes, k = k)
+   DESeq2::plotPCA(set_g, col=as.numeric(coldata$group))
+ }

Error in Ycenter[, cIdx] : invalid subscript type 'list'

while with SVA I get this:

> res <- svaseq(norm_by_count_per_million, mod, mod0)

Number of significant surrogate variables is: 46
Iteration (out of 5 ):Error in density.default(x, adjust = adj) : 'x' contains missing values

I've found a couple of references but haven't been able to fix these issues. Full reproducible example bellow, any help would be much appreciated!

library(SummarizedExperiment)
library(RUVSeq)
library(sva)

# http://duffel.rail.bio/recount/v2/TCGA/rse_gene_bladder.Rdata
load('data/rse_gene_bladder.Rdata')

unique(rse_gene$gdc_cases.diagnoses.tumor_stage)

colData(rse_gene)$group <- NA
rse_gene$gdc_cases.diagnoses.tumor_stage == "stage iv"
colData(rse_gene)$group[rse_gene$gdc_cases.diagnoses.tumor_stage == "stage i"] <- "early"
colData(rse_gene)$group[rse_gene$gdc_cases.diagnoses.tumor_stage == "stage ii"] <- "early"
colData(rse_gene)$group[rse_gene$gdc_cases.diagnoses.tumor_stage == "stage iii"] <- "late"
colData(rse_gene)$group[rse_gene$gdc_cases.diagnoses.tumor_stage == "stage iii"] <- "late"

keep <- !is.na(rse_gene$group)
rse_gene <- rse_gene[, keep]
table(rse_gene$group)

counts <- assay(rse_gene, "counts")
norm_by_count_per_million <- sweep(counts, 2, FUN="/", colSums(counts)) * 10^6

coldata <- colData(rse_gene)

HK_genes <- read.table(url("https://m.tau.ac.il/~elieis/HKG/HK_genes.txt"))
house_keeping_genes <- intersect(rowRanges(rse_gene)$symbol, HK_genes$V1)
gg <- ceiling(norm_by_count_per_million)
rownames(gg) <- as.vector(rowRanges(rse_gene)$symbol)

par(mfrow = c(2, 2))
for(k in 1:4) {
  set_g <- RUVg(x = gg, cIdx = house_keeping_genes, k = k)
  DESeq2::plotPCA(set_g, col=as.numeric(coldata$group) )
}

mod <- model.matrix(~group, data=coldata)
mod0 <- model.matrix(~1, data=coldata)

res <- svaseq(norm_by_count_per_million, mod, mod0)
rnaseq • 1.4k views
ADD COMMENT
2
Entering edit mode
22 months ago
ATpoint 82k

The ?RUVg help:

cIdx
A character, logical, or numeric vector indicating the subset of genes to be used as negative controls in the estimation of the factors of unwanted variation.

But class(house_keeping_genes) returns list, hence as.character(house_keeping_genes) should work.

I would recommend normalizing data with DESeq2 or edgeR, not with this plain per-million scaling. Here is a full DESeq2+RUVseq analysis from the DESeq2 author: https://github.com/mikelove/preNivolumabOnNivolumab/blob/main/preNivolumabOnNivolumab.knit.md

ADD COMMENT
0
Entering edit mode

Hey, thanks for the link, I'm going through it.
Regarding the list mistake, it's one step but still doesn't allow the code to run, I'll keep digging..

for(k in 1:4) {
   set_g <- RUVg(x = gg, cIdx = as.character(house_keeping_genes), k = k)
   DESeq2::plotPCA(set_g, col=as.numeric(coldata$group), cex = 0.9, adj = 0.5,
                   main = paste0('with RUVg, k = ',k),
                   ylim = c(-1, 1), xlim = c(-1, 1), )
 }  

Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘plotPCA’ for signature ‘"list"’

ADD REPLY
0
Entering edit mode

Read the help mage of the function. It needs a DESeqTransform object and not the direct output of ruvg.

ADD REPLY
1
Entering edit mode

Hey, ultimately i was able to follow the example analysis you provided and adapt it to my problem. thanks a lot! :)

ADD REPLY

Login before adding your answer.

Traffic: 2532 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6