How to analyse TCGA CNV data with GAIA R package
22 months ago

Hello Biostars,

I am trying to analyse the TCGA-BRCA CNV data with GAIA, specifically for chromosome 8. I wrote a script based on this previous thread: https://www.biostars.org/p/311199/#311746 . I get an error when I start the GAIA analysis and would be grateful for any guidance.

...................................................

#load libraries
library(TCGAbiolinks) # provides GDCquery() and GDCprepare()
library(gaia)
library(dplyr)
library(tidyr)

project <- GDCquery("TCGA-BRCA", data.category = "Copy Number Variation",
data.type = "Masked Copy Number Segment", legacy = FALSE)
BRCA <- GDCprepare(project)
write.table(BRCA, file="BRCA.txt") #save file

## Create marker file
url <- "https://gdc.cancer.gov/files/public/file/snp6.na35.liftoverhg38.txt.zip"
temp <- tempfile()
download.file(url, temp) # fetch the archive before unzipping
unzip(temp)
unique(probes_metadata$Chromosome)
probes_metadata[which(probes_metadata$Chromosome=="X"),"Chromosome"] <- 23 # rename chrX and chrY
probes_metadata[which(probes_metadata$Chromosome=="Y"),"Chromosome"] <- 24
probes_metadata$Chromosome <- sapply(probes_metadata$Chromosome, as.integer)
markerID <- apply(probes_metadata, 1, function(x) paste0(x[2], ":", x[3]))
markersMatrix <- probes_metadata[-which(duplicated(markerID)),]
markers_obj <- load_markers(markersMatrix) # marker file

## convert segment means into 1/0
synthCNV_Matrix <- cbind(BRCA,Label=NA)
synthCNV_Matrix[synthCNV_Matrix[,"Segment_Mean"] < -0.2,"Label"] <- 0
synthCNV_Matrix[synthCNV_Matrix[,"Segment_Mean"] > 0.2,"Label"] <- 1
synthCNV_Matrix <- synthCNV_Matrix[!is.na(synthCNV_Matrix$Label),]

#rearrange columns
synthCNV_Matrix<- synthCNV_Matrix[,c(7,2,3,4,5,8)]
colnames(synthCNV_Matrix)<- c("Sample.Name", "Chromosome", "Start", "End", "Num.of.Markers", "Aberration")

#Replace x and y chromosome names
xidx <- which(synthCNV_Matrix$Chromosome=="X")
yidx <- which(synthCNV_Matrix$Chromosome=="Y")
synthCNV_Matrix[xidx,"Chromosome"] <- 23
synthCNV_Matrix[yidx,"Chromosome"] <- 24
synthCNV_Matrix$Chromosome <- sapply(synthCNV_Matrix$Chromosome,as.integer)

#number of unique samples
n <- length(unique(synthCNV_Matrix$Sample.Name))

#CNV analysis with gaia
results.er <- runGAIA(synthCNV_Matrix, markers_obj, output_file_name="Tumor.all.txt", aberrations=-1, chromosomes=-1, num_iterations=10, threshold=0.15)


....................................................

The following is the error message that I get

cnv_obj<- load_cnv(synthCNV_Matrix, markers_obj, n)
Error in start_index:end_index : argument of length 0


I would appreciate any help on this matter.

CNV TCGA GAIA R • 1.2k views

Hi, can you please post the error messages? Thank you in advance.


Hi, I have updated my post to include the error message. Thank you.


Hi again. I had originally posted an answer, but it still seems to be returning an error at a later point, still for load_cnv(). Let me debug it.

Note that your final line should be:

results.er <- runGAIA(cnv_obj,
  markers_obj,
  output_file_name = 'Tumor.all.txt',
  aberrations = -1,
  chromosomes = -1,
  num_iterations = 10,
  threshold = 0.15)
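
Since cnv_obj is built from the segment table, it may help to prepare and sanity-check that table on a small example first. The following is only a sketch in base R on made-up data: the input column names (Sample, Num_Probes, Segment_Mean) and the helper names seg, cnv_table and n_samples are assumptions for illustration, not necessarily what GDCprepare() returns. It selects columns by name rather than by position, so a change in column order would not silently scramble the table.

```r
# Toy segment table; the column names here are assumptions for illustration
seg <- data.frame(
  Sample       = c("S1", "S1", "S2", "S2", "S3"),
  Chromosome   = c("8", "X", "8", "1", "8"),
  Start        = c(100L, 500L, 150L, 900L, 100L),
  End          = c(400L, 800L, 450L, 1200L, 300L),
  Num_Probes   = c(10L, 5L, 12L, 7L, 8L),
  Segment_Mean = c(0.45, -0.31, 0.05, -0.6, 0.25),
  stringsAsFactors = FALSE
)

# Label losses (0) and gains (1); segments within [-0.2, 0.2] stay NA and are dropped
seg$Aberration <- NA_integer_
seg$Aberration[seg$Segment_Mean < -0.2] <- 0L
seg$Aberration[seg$Segment_Mean >  0.2] <- 1L
seg <- seg[!is.na(seg$Aberration), ]

# Recode the sex chromosomes, then convert the column to integer
seg$Chromosome[seg$Chromosome == "X"] <- "23"
seg$Chromosome[seg$Chromosome == "Y"] <- "24"
seg$Chromosome <- as.integer(seg$Chromosome)

# Select columns by name, then apply the six column names used in the question
cnv_table <- seg[, c("Sample", "Chromosome", "Start", "End", "Num_Probes", "Aberration")]
colnames(cnv_table) <- c("Sample.Name", "Chromosome", "Start", "End",
                         "Num.of.Markers", "Aberration")

# Number of unique samples, as passed to load_cnv() in the question
n_samples <- length(unique(cnv_table$Sample.Name))

# Basic checks before handing the table on
stopifnot(!anyNA(cnv_table$Chromosome),
          all(cnv_table$Aberration %in% c(0L, 1L)))
```

If the checks pass on the real table, the problem is more likely to lie in how the segments relate to the markers than in the table layout itself.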


The problem likely relates to the fact that your markers matrix is [apparently] based on the SNP 6.0 array, which will not necessarily cover the data that is being retrieved via TCGAbiolinks. It's difficult to diagnose what is happening.
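
One way to test this suspicion is to count how many segment boundaries actually appear among the marker positions. The sketch below uses made-up data and base R only. It assumes, based on the wording of the error message rather than on confirmed knowledge of the package internals, that load_cnv() looks up each segment's start and end among the marker positions on the same chromosome. The names markers, segments and unmatched are hypothetical.

```r
# Toy marker table (hypothetical positions), one row per probe
markers <- data.frame(
  Chromosome = c(8L, 8L, 8L, 8L, 1L, 1L),
  Start      = c(100L, 200L, 300L, 400L, 900L, 1200L)
)

# Toy segment table with a subset of the columns used in the question
segments <- data.frame(
  Sample.Name = c("S1", "S2", "S3"),
  Chromosome  = c(8L, 8L, 1L),
  Start       = c(100L, 150L, 900L),
  End         = c(400L, 300L, 1200L),
  stringsAsFactors = FALSE
)

# Build "chromosome:position" keys; coerce to integer on both sides so that
# the text representation of a position is identical in both tables
make_key <- function(chr, pos) paste(as.integer(chr), as.integer(pos), sep = ":")
marker_key <- make_key(markers$Chromosome, markers$Start)

# A segment counts as matched only if both boundaries are marker positions
start_ok <- make_key(segments$Chromosome, segments$Start) %in% marker_key
end_ok   <- make_key(segments$Chromosome, segments$End)   %in% marker_key
unmatched <- segments[!(start_ok & end_ok), ]

print(unmatched)
```

In this toy example only the S2 segment is flagged, because its start (150) is not a marker position. On real data, a large fraction of unmatched segments would be consistent with the markers file and the segment data coming from different probe sets or genome builds.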

Note that if you want the BRCA CNV data, this can be retrieved in the same way as per my other thread, and then the original workflow that I wrote can also be followed.


There is another potential solution mentioned here: https://support.bioconductor.org/p/111990/#9135686