Question: Problem in using TCGAvisualize_meanMethylation function
0
gravatar for mahmood.amiri225
11 months ago by
mahmood.amiri2250 wrote:

Hi, I used ‘TCGAbiolinks’ version 2.6.12. I have just tried to plot using TCGAvisualize_meanMethylation function, using flowing codes, but the R script didn't work well and flowing error came out:

R version 3.4.3 (2017-11-30) -- "Kite-Eating Tree"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> #----------------------------
> # Obtaining DNA methylation
> #----------------------------
> setwd("F:/worlshop_EWAS/New/TCGA/Matched_ex_met")
> library(TCGAbiolinks)
> library(stringr)
> matched_met_exp <- function(project, n = NULL){
+ # get primary solid tumor samples: DNA methylation
+ message("Download DNA methylation information")
+ met450k <- GDCquery(project = project,
+                     data.category = "DNA methylation",
+                     platform = "Illumina Human Methylation 450",
+                     legacy = TRUE,
+                     sample.type = c("Primary solid Tumor"))
+ met450k.tp <- met450k$results[[1]]$cases
+ # get primary solid tumor samples: RNAseq
+ message("Download gene expression information")
+ exp <- GDCquery(project = project,
+                 data.category = "Gene expression",
+                 data.type = "Gene expression quantification",
+                 platform = "Illumina HiSeq",
+                 file.type = "results",
+                 sample.type = c("Primary solid Tumor"),
+ legacy = TRUE)
+ exp.tp <- exp$results[[1]]$cases
+ printexp.tp[1:10])
+ # Get patients with samples in both platforms
+ patients <- unique(substrexp.tp,1,15)[substrexp.tp,1,12) %in% substrmet450k.tp,1,12)])
+ if(!is.null(n)) patients <- patients[1:n] # get only n samples
+ return(patients)
+ }
> lgg.samples <- matched_met_exp("TCGA-LGG", n = 10)
Download DNA methylation information
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg19
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-LGG
--------------------
oo Filtering results
--------------------
ooo By platform
ooo By sample.type
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
Download gene expression information
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg19
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-LGG
--------------------
oo Filtering results
--------------------
ooo By platform
ooo By data.type
ooo By file.type
ooo By sample.type
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
 [1] "TCGA-HT-7480-01A-11R-2090-07" "TCGA-QH-A6XC-01A-12R-A32Q-07"
 [3] "TCGA-HT-A614-01A-11R-A29R-07" "TCGA-HT-8104-01A-11R-2404-07"
 [5] "TCGA-TQ-A7RG-01A-11R-A33Z-07" "TCGA-HW-7487-01A-11R-2027-07"
 [7] "TCGA-FG-8186-01A-11R-2256-07" "TCGA-P5-A5EX-01A-12R-A28M-07"
 [9] "TCGA-CS-6667-01A-12R-2027-07" "TCGA-DU-7304-01A-12R-2090-07"
> gbm.samples <- matched_met_exp("TCGA-GBM", n = 10)
Download DNA methylation information
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg19
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-GBM
--------------------
oo Filtering results
--------------------
ooo By platform
ooo By sample.type
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
Download gene expression information
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg19
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-GBM
--------------------
oo Filtering results
--------------------
ooo By platform
ooo By data.type
ooo By file.type
ooo By sample.type
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
 [1] "TCGA-06-0184-01A-01R-1849-01" "TCGA-06-0649-01B-01R-1849-01"
 [3] "TCGA-02-2485-01A-01R-1849-01" "TCGA-26-5136-01B-01R-1850-01"
 [5] "TCGA-76-4925-01A-01R-1850-01" "TCGA-32-2632-01A-01R-1850-01"
 [7] "TCGA-06-5858-01A-01R-1849-01" "TCGA-28-5213-01A-01R-1850-01"
 [9] "TCGA-06-0157-01A-01R-1849-01" "TCGA-16-0846-01A-01R-1850-01"
> samples <- c(lgg.samples,gbm.samples)
> query.lgg <- GDCquery(project = "TCGA-LGG",
+                       data.category = "DNA methylation",
+                       platform = "Illumina Human Methylation 450",
+                       legacy = TRUE, barcode = lgg.samples)
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg19
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-LGG
--------------------
oo Filtering results
--------------------
ooo By platform
ooo By barcode
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
> met.lgg <-GDCprepare(query.lgg, save = FALSE)
  |====================================================================================| 100%Downloading genome information (try:0) Using: Homo sapiens genes (GRCh37.p13)
Loading from disk
Starting to add information to samples
 => Add clinical information to samples
Add FFPE information. More information at: 
=> https://cancergenome.nih.gov/cancersselected/biospeccriteria 
=> http://gdac.broadinstitute.org/runs/sampleReports/latest/FPPP_FFPE_Cases.html
 => Adding subtype information to samples
lgg subtype information from:doi:10.1016/j.cell.2015.12.028
> query.gbm <- GDCquery(project = "TCGA-GBM",
+                       data.category = "DNA methylation",
+                       platform = "Illumina Human Methylation 450",
+                       legacy = TRUE, barcode = gbm.samples)
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg19
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-GBM
--------------------
oo Filtering results
--------------------
ooo By platform
ooo By barcode
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
> met.gbm <- GDCprepare(query.gbm, save = FALSE)
  |====================================================================================| 100%Downloading genome information (try:0) Using: Homo sapiens genes (GRCh37.p13)
Loading from disk
Starting to add information to samples
 => Add clinical information to samples
Add FFPE information. More information at: 
=> https://cancergenome.nih.gov/cancersselected/biospeccriteria 
=> http://gdac.broadinstitute.org/runs/sampleReports/latest/FPPP_FFPE_Cases.html
 => Adding subtype information to samples
gbm subtype information from:doi:10.1016/j.cell.2015.12.028
> met <- SummarizedExperiment::cbind(met.lgg, met.gbm)
> met <- subset(met,subset = rowSumsis.na(assay(met))) == 0))
Error in assay(met) : could not find function "assay"
> # remove probes in chromossomes X, Y and NA
> met <- subset(met,subset = !as.character(seqnames(met)) %in% c("chrNA","chrX","chrY"))
Error in seqnames(met) : could not find function "seqnames"
> TCGAvisualize_meanMethylation(met,
+                                  groupCol = "disease_type",
+                                  group.legend = "Groups",
+                                  filename = "mean_lgg_gbm.png",
+                                  print.pvalue = TRUE)
==================== DATA Summary ====================
Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 
  'x' must be atomic

if you know the solution, would you tell me , thanks in advance !

R • 615 views
ADD COMMENTlink modified 9 weeks ago by 13038234320 • written 11 months ago by mahmood.amiri2250

Page27--page28;https://f1000researchdata.s3.amazonaws.com/manuscripts/11063/744c5428-15a5-4c3e-bc06-23727829c012_8923_-_houtan_noushmehr_v2.pdf?doi=10.12688/f1000research.8923.2&numberOfBrowsableCollections=14&numberOfBrowsableGateways=23

ADD REPLYlink written 9 weeks ago by 13038234320

Thank you. Unfortunately, this is an error for the developers of the program. However, they do not seem to be doing any further work on TCGAbiolinks. Take a look at the GitHub issues page: https://github.com/BioinformaticsFMRP/TCGAbiolinks/issues

You could obtain the methylation data from GDC direct, and analyse it that way.

ADD REPLYlink written 9 weeks ago by Kevin Blighe37k

Ok, I need use other packages. Do you have other workflow files to analyze methylation? Thank you.

ADD REPLYlink written 9 weeks ago by 13038234320

Hi, friend. I solve this problem. By changing the last line of code: TCGAvisualize_meanMethylation(met, groupCol = "name", group.legend = "Groups", filename = "mean_lgg_gbm.png", print.pvalue = TRUE)

The picture obtained seems to be the same. I modified the parameter:groupCol,because that I was inspired by the results of the code:

str(met)
.. .. .. ..$ name : chr [1:20] "Brain Lower Grade Glioma" "Brain Lower Grade Glioma" "Brain Lower Grade Glioma" "Brain Lower Grade Glioma" ...

Do you think that it is feasible to do so?

ADD REPLYlink modified 9 weeks ago by Kevin Blighe37k • written 9 weeks ago by 13038234320

The only question is that I get a different p-value. I am confused.

ADD REPLYlink written 9 weeks ago by 13038234320

I encourage you to contact the TCGAbiolinks developers. The functions that they provide are very specific and not many have used them.

ADD REPLYlink written 9 weeks ago by Kevin Blighe37k

Ok, thank you. I will try.

ADD REPLYlink written 9 weeks ago by 13038234320
2
gravatar for Kevin Blighe
11 months ago by
Kevin Blighe37k
Republic of Ireland
Kevin Blighe37k wrote:

Just add these to your code:

library(SummarizedExperiment)
library(GenomicRanges)

They are already installed on your R version but you do not explicitly load them.

ADD COMMENTlink modified 11 months ago • written 11 months ago by Kevin Blighe37k

it still can't solve this problem

ADD REPLYlink written 9 weeks ago by 13038234320

You are not the person who opened this question; thus, your problem is likely different. Which error message are you receiving?

ADD REPLYlink written 9 weeks ago by Kevin Blighe37k

Hi,I can't solve the same problem by your method.

[R.app GUI 1.70 (7543) x86_64-apple-darwin15.6.0]

[History restored from /Users/chaisongshan/.Rapp.history]

library(TCGAbiolinks)
library(stringr)
library(SummarizedExperiment)
载入需要的程辑包:GenomicRanges
载入需要的程辑包:stats4
载入需要的程辑包:BiocGenerics
载入需要的程辑包:parallel

载入程辑包:‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, basename, cbind, colMeans,
    ...

载入需要的程辑包:S4Vectors

载入程辑包:‘S4Vectors’

The following object is masked from ‘package:base’:

    expand.grid

载入需要的程辑包:IRanges
载入需要的程辑包:GenomeInfoDb
载入需要的程辑包:Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

载入需要的程辑包:DelayedArray
载入需要的程辑包:matrixStats

载入程辑包:‘matrixStats’

The following objects are masked from ‘package:Biobase’:

    anyMissing, rowMedians

载入需要的程辑包:BiocParallel

载入程辑包:‘DelayedArray’

The following objects are masked from ‘package:matrixStats’:

    colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges

The following objects are masked from ‘package:base’:

    aperm, apply

library(GenomicRanges)
# Samples
matched_met_exp <- function(project, n = NULL){
 message("Download DNA methylation information")
 met450k <- GDCquery(project = project,
 data.category = "DNA methylation",
 platform = "Illumina Human Methylation 450",
 legacy = TRUE,
 sample.type = c("Primary solid Tumor"))
 met450k.tp <- met450k$results[[1]]$cases

 # get primary solid tumor samples: RNAseq
 message("Download gene expression information")
 exp <- GDCquery(project = project,
 data.category = "Gene expression",
 data.type = "Gene expression quantification",
 platform = "Illumina HiSeq",
 file.type = "results",
 sample.type = c("Primary solid Tumor"),
 legacy = TRUE)
 exp.tp <- exp$results[[1]]$cases
 # Get patients with samples in both platforms
 patients <- uniquesubstrexp.tp,1,15)[substrexp.tp,1,12) %in% substrmet450k.tp,1,12)])
 if(!is.null(n)) patients <- patients[1:n] # get only n samples
 return(patients)
}

lgg.samples <- matched_met_exp("TCGA-LGG", n = 10)
Download DNA methylation information

gbm.samples <- matched_met_exp("TCGA-GBM", n = 10)
Download DNA methylation information

samples <- c(lgg.samples,gbm.samples)

query.lgg <- GDCquery(project = "TCGA-LGG",
+ data.category = "DNA methylation",
+ platform = "Illumina Human Methylation 450",
+ legacy = TRUE, barcode = lgg.samples)

GDCdownload(query.lgg)
Downloading data for project TCGA-LGG
Of the 10 files for download 10 already exist.
All samples have been already downloaded
met.lgg <-GDCprepare(query.lgg, save = FALSE)
  |=====================================================================| 100%Downloading genome information (try:0) Using: Homo sapiens genes (GRCh37.p13)

query.gbm <- GDCquery(project = "TCGA-GBM",
+ data.category = "DNA methylation",
+ platform = "Illumina Human Methylation 450",
+ legacy = TRUE, barcode = gbm.samples)

GDCdownload(query.gbm)
met.gbm <- GDCprepare(query.gbm, save = FALSE)
  |=====================================================================| 100%Downloading genome information (try:0) Using: Homo sapiens genes (GRCh37.p13)

met <- SummarizedExperiment::cbind(met.lgg, met.gbm)

met <- subset(met,subset = rowSumsis.na(SummarizedExperiment::assay(met))) == 0))

met <- subset(met,subset = !as.character(seqnames(met)) %in% c("chrNA","chrX","chrY"))

TCGAvisualize_meanMethylation(met, groupCol = "disease_type", group.legend = "Groups", filename = "mean_lgg_gbm.png", print.pvalue = TRUE)
==================== DATA Summary ====================
Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 
  'x' must be atomic
ADD REPLYlink modified 9 weeks ago by Kevin Blighe37k • written 9 weeks ago by 13038234320

I tidied your code to improve the visualisation of it. From where did you obtain this code? There are many 'bugs' (errors) with TCGAbiolinks.

ADD REPLYlink written 9 weeks ago by Kevin Blighe37k

I have reproduced the error on my computer. From where did you obtain the code?

ADD REPLYlink written 9 weeks ago by Kevin Blighe37k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2093 users visited in the last hour