Question: TCGA data analysis on R Studio -- result would be too long a vector
15 months ago by
freuv20
freuv20 wrote:

Hi,

I am running the 'Preprocessing of Gene Expression data (IlluminaHiSeq_RNASeqV2)' and 'TCGAanalyze_SurvivalKM: Correlating gene expression and Survival Analysis' R commands as-is from the Bioconductor page for TCGAbiolinks (http://bioconductor.org/packages/release/bioc/vignettes/TCGAbiolinks/inst/doc/analysis.html#tcgaanalyze_survivalkm:_correlating_gene_expression_and_survival_analysis).

However, I run into the following error when running this command (as-is, from the manual) in RStudio.

for( i in 1: round(nrow(dataBRCAcomplete)/100)){
    message( paste( i, "of ", round(nrow(dataBRCAcomplete)/100)))
    tokenStart <- tokenStop
    tokenStop <-100*i
    tabSurvKM<-TCGAanalyze_SurvivalKM(clinical_patient_Cancer,
                                      dataBRCAcomplete,
                                      Genelist = rownames(dataBRCAcomplete)[tokenStart:tokenStop],
                                      Survresult = F,
                                      ThreshTop=0.67,
                                      ThreshDown=0.33)

    tabSurvKMcomplete <- rbind(tabSurvKMcomplete,tabSurvKM)
}

Error: Error in 1:lastelementTOP : result would be too long a vector

Since I am using the example provided by Bioconductor, I am not sure what the problem is.

Any help would be much appreciated!

cancer rna-seq tcga R • 1.6k views
ADD COMMENTlink modified 12 months ago by Biostar ♦♦ 20 • written 15 months ago by freuv20

Have you additionally executed the following before the for loop:

clinical_patient_Cancer <- GDCquery_clinic("TCGA-BRCA","clinical")
dataBRCAcomplete <- log2(BRCA_rnaseqv2)

tokenStop<- 1

tabSurvKMcomplete <- NULL
ADD REPLYlink written 15 months ago by Kevin Blighe41k

Yes, I executed their example script as is.

ADD REPLYlink written 15 months ago by freuv20

Okay, how much free RAM have you got? Is it a 32- or 64-bit machine? Which R version, and which operating system and version?
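Most of these details can be read from within R itself. The following is a minimal base-R sketch, not part of the TCGAbiolinks example; the variable name `env` is just illustrative. Base R has no portable call for free RAM, so that still has to be checked with the operating system's own tools.

```r
# Collect the environment details asked about above (base R only).
env <- list(
  r_version    = R.version.string,            # full R version string
  platform     = R.version$platform,          # platform triplet R was built for
  pointer_bits = 8 * .Machine$sizeof.pointer, # 64 on a 64-bit build of R
  os           = Sys.info()[["sysname"]]      # operating system name
)
str(env)
```

Pasting the output of `sessionInfo()` as well also shows which package versions are loaded.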

ADD REPLYlink written 15 months ago by Kevin Blighe41k

Hi Kevin, sorry for the late response -- I did not see your message. I'm running RStudio on a Mac (Sierra), R version 3.4.3, 64-bit.

ADD REPLYlink written 15 months ago by freuv20

Maybe 2GB of free RAM?

ADD REPLYlink written 15 months ago by freuv20

That may not be enough. I have 16GB RAM on my personal laptop. Can you try to reduce the size of the data and at least see if the code runs to completion?

ADD REPLYlink written 15 months ago by Kevin Blighe41k

Is there a way to split a matrix by nrows and write to n new matrices?

ADD REPLYlink written 15 months ago by freuv20

You could just take the first 500 rows as a test, like this:

matTest <- MyMatrix[1:500, ]
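
To answer the splitting question more generally, one option is to group the row indices into blocks and subset the matrix once per block. This is a minimal sketch on a toy matrix; `mat`, `chunk_size` and `chunks` are hypothetical names, and the toy data only stands in for the real expression matrix.

```r
# Toy stand-in for the real expression matrix (hypothetical data).
mat <- matrix(seq_len(23 * 3), nrow = 23,
              dimnames = list(paste0("gene", 1:23), NULL))

chunk_size <- 10

# Assign each row to a block: rows 1-10 -> 1, rows 11-20 -> 2, rows 21-23 -> 3.
chunk_id <- ceiling(seq_len(nrow(mat)) / chunk_size)

# Split the row indices by block, then subset the matrix for each block.
# drop = FALSE keeps single-row blocks as matrices.
row_chunks <- split(seq_len(nrow(mat)), chunk_id)
chunks <- lapply(row_chunks, function(idx) mat[idx, , drop = FALSE])

print(sapply(chunks, nrow))
```

Each element of `chunks` is a separate matrix. The blocks do not overlap, and the last block picks up any leftover rows when the row count is not a multiple of `chunk_size`.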
ADD REPLYlink written 15 months ago by Kevin Blighe41k

This is the output on the test.

1 of  5
0.
2 of  5
97.96.95.94.93.92.91.90.89.88.87.86.85.84.83.82.81.80.79.78.77.76.75.74.73.72.71.70.69.68.67.66.65.64.63.62.61.60.59.58.57.56.55.54.53.52.51.50.49.48.47.46.45.44.43.42.41.40.39.38.37.36.35.34.33.32.31.30.29.28.27.26.25.24.23.22.21.20.19.18.17.16.15.14.13.12.11.10.9.8.7.6.5.4.3.2.1.0.
3 of  5
96.95.94.93.92.91.90.89.88.87.86.85.84.83.82.81.80.79.78.77.76.75.74.73.72.71.70.69.68.67.66.65.64.63.62.61.60.59.58.57.56.55.54.53.52.51.50.49.48.47.46.45.44.43.42.41.40.39.38.37.36.35.34.33.32.31.30.29.28.27.26.25.24.23.22.21.20.19.18.17.16.15.14.13.12.11.10.9.8.7.6.5.4.3.2.1.0.
4 of  5
100.99.98.97.96.95.94.93.92.91.90.89.88.87.86.85.84.83.82.81.80.79.78.77.76.75.74.73.72.71.70.69.68.67.66.65.64.63.62.61.60.59.58.57.56.55.54.53.52.51.50.49.48.47.46.45.44.43.42.41.40.39.38.37.36.35.34.33.32.31.30.29.28.27.26.25.24.23.22.21.20.19.18.17.16.15.14.13.12.11.10.9.8.7.6.5.4.3.2.1.0.
5 of  5
95.94.93.92.91.90.89.88.87.86.85.84.83.82.81.80.79.78.77.76.75.74.73.72.71.70.69.68.67.66.65.64.63.62.61.60.59.58.57.56.55.54.53.52.51.50.49.48.47.46.45.44.43.42.41.40.39.38.37.36.35.34.33.32.31.30.29.28.27.26.25.24.23.22.21.20.19.18.17.16.15.14.13.12.11.10.9.8.7.6.5.4.3.2.1.0.
>

This results in empty tabSurvKM and tabSurvKMcomplete tables.

ADD REPLYlink modified 15 months ago by genomax65k • written 15 months ago by freuv20

I would contact the developers of the package. In many situations, packages are not updated for new versions of R, and/or other dependency issues arise as new packages are released on Bioconductor without adequate testing. To further compound the problem, the TCGA consortium has been shifting its data around, and one finds that links on the government-hosted websites that host the data are broken.

I believe that the contact for TCGAbiolinks is Tiago Silva in São Paulo, Brazil, which I frequently pass through.
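
When reporting this, it may help to note where the message itself comes from. Base R raises "result would be too long a vector" from the `:` operator when the end point is infinite or far too large, so `lastelementTOP` was presumably not a sensible finite count at that point. Why that would happen inside TCGAanalyze_SurvivalKM is not established in this thread. A minimal base-R sketch, with `end_point` as a hypothetical stand-in for `lastelementTOP`:

```r
# Hypothetical stand-in for lastelementTOP; Inf is an assumed value used only
# to trigger the same base-R error, not a value observed in the package.
end_point <- Inf

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  {
    1:end_point
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If the message reproduces this way, checking the inputs for unexpected non-finite values before calling the function may be a useful next step.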

ADD REPLYlink written 15 months ago by Kevin Blighe41k

Oh, just one thing: please try it outside RStudio ('regular' R). I never use RStudio because it adds one extra layer to my analyses that could cause problems.

ADD REPLYlink written 15 months ago by Kevin Blighe41k

This is a good idea. Thanks for your input -- I will update progress here.

ADD REPLYlink written 15 months ago by freuv20

How did it go?

ADD REPLYlink written 14 months ago by Kevin Blighe41k