Question

Problem in using TCGAvisualize_meanMethylation function

0

Entering edit mode

6.1 years ago

mahmood.amiri225 • 0

Hi, I used ‘TCGAbiolinks’ version 2.6.12. I have just tried to plot using TCGAvisualize_meanMethylation function, using flowing codes, but the R script didn't work well and flowing error came out:

R version 3.4.3 (2017-11-30) -- "Kite-Eating Tree"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> #----------------------------
> # Obtaining DNA methylation
> #----------------------------
> setwd("F:/worlshop_EWAS/New/TCGA/Matched_ex_met")
> library(TCGAbiolinks)
> library(stringr)
> matched_met_exp <- function(project, n = NULL){
+ # get primary solid tumor samples: DNA methylation
+ message("Download DNA methylation information")
+ met450k <- GDCquery(project = project,
+                     data.category = "DNA methylation",
+                     platform = "Illumina Human Methylation 450",
+                     legacy = TRUE,
+                     sample.type = c("Primary solid Tumor"))
+ met450k.tp <- met450k$results[[1]]$cases
+ # get primary solid tumor samples: RNAseq
+ message("Download gene expression information")
+ exp <- GDCquery(project = project,
+                 data.category = "Gene expression",
+                 data.type = "Gene expression quantification",
+                 platform = "Illumina HiSeq",
+                 file.type = "results",
+                 sample.type = c("Primary solid Tumor"),
+ legacy = TRUE)
+ exp.tp <- exp$results[[1]]$cases
+ printexp.tp[1:10])
+ # Get patients with samples in both platforms
+ patients <- unique(substrexp.tp,1,15)[substrexp.tp,1,12) %in% substrmet450k.tp,1,12)])
+ if(!is.null(n)) patients <- patients[1:n] # get only n samples
+ return(patients)
+ }
> lgg.samples <- matched_met_exp("TCGA-LGG", n = 10)
Download DNA methylation information
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg19
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-LGG
--------------------
oo Filtering results
--------------------
ooo By platform
ooo By sample.type
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
Download gene expression information
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg19
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-LGG
--------------------
oo Filtering results
--------------------
ooo By platform
ooo By data.type
ooo By file.type
ooo By sample.type
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
 [1] "TCGA-HT-7480-01A-11R-2090-07" "TCGA-QH-A6XC-01A-12R-A32Q-07"
 [3] "TCGA-HT-A614-01A-11R-A29R-07" "TCGA-HT-8104-01A-11R-2404-07"
 [5] "TCGA-TQ-A7RG-01A-11R-A33Z-07" "TCGA-HW-7487-01A-11R-2027-07"
 [7] "TCGA-FG-8186-01A-11R-2256-07" "TCGA-P5-A5EX-01A-12R-A28M-07"
 [9] "TCGA-CS-6667-01A-12R-2027-07" "TCGA-DU-7304-01A-12R-2090-07"
> gbm.samples <- matched_met_exp("TCGA-GBM", n = 10)
Download DNA methylation information
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg19
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-GBM
--------------------
oo Filtering results
--------------------
ooo By platform
ooo By sample.type
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
Download gene expression information
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg19
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-GBM
--------------------
oo Filtering results
--------------------
ooo By platform
ooo By data.type
ooo By file.type
ooo By sample.type
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
 [1] "TCGA-06-0184-01A-01R-1849-01" "TCGA-06-0649-01B-01R-1849-01"
 [3] "TCGA-02-2485-01A-01R-1849-01" "TCGA-26-5136-01B-01R-1850-01"
 [5] "TCGA-76-4925-01A-01R-1850-01" "TCGA-32-2632-01A-01R-1850-01"
 [7] "TCGA-06-5858-01A-01R-1849-01" "TCGA-28-5213-01A-01R-1850-01"
 [9] "TCGA-06-0157-01A-01R-1849-01" "TCGA-16-0846-01A-01R-1850-01"
> samples <- c(lgg.samples,gbm.samples)
> query.lgg <- GDCquery(project = "TCGA-LGG",
+                       data.category = "DNA methylation",
+                       platform = "Illumina Human Methylation 450",
+                       legacy = TRUE, barcode = lgg.samples)
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg19
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-LGG
--------------------
oo Filtering results
--------------------
ooo By platform
ooo By barcode
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
> met.lgg <-GDCprepare(query.lgg, save = FALSE)
  |====================================================================================| 100%Downloading genome information (try:0) Using: Homo sapiens genes (GRCh37.p13)
Loading from disk
Starting to add information to samples
 => Add clinical information to samples
Add FFPE information. More information at: 
=> https://cancergenome.nih.gov/cancersselected/biospeccriteria 
=> http://gdac.broadinstitute.org/runs/sampleReports/latest/FPPP_FFPE_Cases.html
 => Adding subtype information to samples
lgg subtype information from:doi:10.1016/j.cell.2015.12.028
> query.gbm <- GDCquery(project = "TCGA-GBM",
+                       data.category = "DNA methylation",
+                       platform = "Illumina Human Methylation 450",
+                       legacy = TRUE, barcode = gbm.samples)
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg19
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-GBM
--------------------
oo Filtering results
--------------------
ooo By platform
ooo By barcode
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
> met.gbm <- GDCprepare(query.gbm, save = FALSE)
  |====================================================================================| 100%Downloading genome information (try:0) Using: Homo sapiens genes (GRCh37.p13)
Loading from disk
Starting to add information to samples
 => Add clinical information to samples
Add FFPE information. More information at: 
=> https://cancergenome.nih.gov/cancersselected/biospeccriteria 
=> http://gdac.broadinstitute.org/runs/sampleReports/latest/FPPP_FFPE_Cases.html
 => Adding subtype information to samples
gbm subtype information from:doi:10.1016/j.cell.2015.12.028
> met <- SummarizedExperiment::cbind(met.lgg, met.gbm)
> met <- subset(met,subset = rowSumsis.na(assay(met))) == 0))
Error in assay(met) : could not find function "assay"
> # remove probes in chromossomes X, Y and NA
> met <- subset(met,subset = !as.character(seqnames(met)) %in% c("chrNA","chrX","chrY"))
Error in seqnames(met) : could not find function "seqnames"
> TCGAvisualize_meanMethylation(met,
+                                  groupCol = "disease_type",
+                                  group.legend = "Groups",
+                                  filename = "mean_lgg_gbm.png",
+                                  print.pvalue = TRUE)
==================== DATA Summary ====================
Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 
  'x' must be atomic

if you know the solution, would you tell me , thanks in advance !

R • 2.5k views

ADD COMMENT • link updated 5.4 years ago by 1303823432 • 0 • written 6.1 years ago by mahmood.amiri225 • 0

0

Entering edit mode

Page27--page28;https://f1000researchdata.s3.amazonaws.com/manuscripts/11063/744c5428-15a5-4c3e-bc06-23727829c012_8923_-_houtan_noushmehr_v2.pdf?doi=10.12688/f1000research.8923.2&numberOfBrowsableCollections=14&numberOfBrowsableGateways=23

ADD REPLY • link 5.4 years ago by 1303823432 • 0

0

Entering edit mode

Thank you. Unfortunately, this is an error for the developers of the program. However, they do not seem to be doing any further work on TCGAbiolinks. Take a look at the GitHub issues page: https://github.com/BioinformaticsFMRP/TCGAbiolinks/issues

You could obtain the methylation data from GDC direct, and analyse it that way.

ADD REPLY • link 5.4 years ago by Kevin Blighe 87k

0

Entering edit mode

Ok, I need use other packages. Do you have other workflow files to analyze methylation? Thank you.

ADD REPLY • link 5.4 years ago by 1303823432 • 0

0

Entering edit mode

Hi, friend. I solve this problem. By changing the last line of code: TCGAvisualize_meanMethylation(met, groupCol = "name", group.legend = "Groups", filename = "mean_lgg_gbm.png", print.pvalue = TRUE)

The picture obtained seems to be the same. I modified the parameter：groupCol，because that I was inspired by the results of the code：

str(met)
.. .. .. ..$ name : chr [1:20] "Brain Lower Grade Glioma" "Brain Lower Grade Glioma" "Brain Lower Grade Glioma" "Brain Lower Grade Glioma" ...

Do you think that it is feasible to do so?

ADD REPLY • link updated 5.4 years ago by Kevin Blighe 87k • written 5.4 years ago by 1303823432 • 0

0

Entering edit mode

The only question is that I get a different p-value. I am confused.

ADD REPLY • link 5.4 years ago by 1303823432 • 0

0

Entering edit mode

I encourage you to contact the TCGAbiolinks developers. The functions that they provide are very specific and not many have used them.

ADD REPLY • link 5.4 years ago by Kevin Blighe 87k

0

Entering edit mode

Ok, thank you. I will try.

ADD REPLY • link 5.4 years ago by 1303823432 • 0

Kevin Blighe · Accepted Answer · 2018-03-12

3

Entering edit mode

6.1 years ago

Kevin Blighe 87k

Just add these to your code:

library(SummarizedExperiment)
library(GenomicRanges)

They are already installed on your R version but you do not explicitly load them.

ADD COMMENT • link 6.1 years ago by Kevin Blighe 87k

0

Entering edit mode

it still can't solve this problem

ADD REPLY • link 5.4 years ago by 1303823432 • 0

0

Entering edit mode

You are not the person who opened this question; thus, your problem is likely different. Which error message are you receiving?

ADD REPLY • link 5.4 years ago by Kevin Blighe 87k

0

Entering edit mode

Hi,I can't solve the same problem by your method.

[R.app GUI 1.70 (7543) x86_64-apple-darwin15.6.0]

[History restored from /Users/chaisongshan/.Rapp.history]

library(TCGAbiolinks)
library(stringr)
library(SummarizedExperiment)
载入需要的程辑包：GenomicRanges
载入需要的程辑包：stats4
载入需要的程辑包：BiocGenerics
载入需要的程辑包：parallel

载入程辑包：‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, basename, cbind, colMeans,
    ...

载入需要的程辑包：S4Vectors

载入程辑包：‘S4Vectors’

The following object is masked from ‘package:base’:

    expand.grid

载入需要的程辑包：IRanges
载入需要的程辑包：GenomeInfoDb
载入需要的程辑包：Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

载入需要的程辑包：DelayedArray
载入需要的程辑包：matrixStats

载入程辑包：‘matrixStats’

The following objects are masked from ‘package:Biobase’:

    anyMissing, rowMedians

载入需要的程辑包：BiocParallel

载入程辑包：‘DelayedArray’

The following objects are masked from ‘package:matrixStats’:

    colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges

The following objects are masked from ‘package:base’:

    aperm, apply

library(GenomicRanges)
# Samples
matched_met_exp <- function(project, n = NULL){
 message("Download DNA methylation information")
 met450k <- GDCquery(project = project,
 data.category = "DNA methylation",
 platform = "Illumina Human Methylation 450",
 legacy = TRUE,
 sample.type = c("Primary solid Tumor"))
 met450k.tp <- met450k$results[[1]]$cases

 # get primary solid tumor samples: RNAseq
 message("Download gene expression information")
 exp <- GDCquery(project = project,
 data.category = "Gene expression",
 data.type = "Gene expression quantification",
 platform = "Illumina HiSeq",
 file.type = "results",
 sample.type = c("Primary solid Tumor"),
 legacy = TRUE)
 exp.tp <- exp$results[[1]]$cases
 # Get patients with samples in both platforms
 patients <- uniquesubstrexp.tp,1,15)[substrexp.tp,1,12) %in% substrmet450k.tp,1,12)])
 if(!is.null(n)) patients <- patients[1:n] # get only n samples
 return(patients)
}

lgg.samples <- matched_met_exp("TCGA-LGG", n = 10)
Download DNA methylation information

gbm.samples <- matched_met_exp("TCGA-GBM", n = 10)
Download DNA methylation information

samples <- c(lgg.samples,gbm.samples)

query.lgg <- GDCquery(project = "TCGA-LGG",
+ data.category = "DNA methylation",
+ platform = "Illumina Human Methylation 450",
+ legacy = TRUE, barcode = lgg.samples)

GDCdownload(query.lgg)
Downloading data for project TCGA-LGG
Of the 10 files for download 10 already exist.
All samples have been already downloaded
met.lgg <-GDCprepare(query.lgg, save = FALSE)
  |=====================================================================| 100%Downloading genome information (try:0) Using: Homo sapiens genes (GRCh37.p13)

query.gbm <- GDCquery(project = "TCGA-GBM",
+ data.category = "DNA methylation",
+ platform = "Illumina Human Methylation 450",
+ legacy = TRUE, barcode = gbm.samples)

GDCdownload(query.gbm)
met.gbm <- GDCprepare(query.gbm, save = FALSE)
  |=====================================================================| 100%Downloading genome information (try:0) Using: Homo sapiens genes (GRCh37.p13)

met <- SummarizedExperiment::cbind(met.lgg, met.gbm)

met <- subset(met,subset = rowSumsis.na(SummarizedExperiment::assay(met))) == 0))

met <- subset(met,subset = !as.character(seqnames(met)) %in% c("chrNA","chrX","chrY"))

TCGAvisualize_meanMethylation(met, groupCol = "disease_type", group.legend = "Groups", filename = "mean_lgg_gbm.png", print.pvalue = TRUE)
==================== DATA Summary ====================
Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 
  'x' must be atomic

ADD REPLY • link updated 5.4 years ago by Kevin Blighe 87k • written 5.4 years ago by 1303823432 • 0

0

Entering edit mode

I tidied your code to improve the visualisation of it. From where did you obtain this code? There are many 'bugs' (errors) with TCGAbiolinks.