How to analyse TCGA CNV data with GAIA R package
3.4 years ago

Hello Biostars,

I am trying to analyse TCGA-BRCA CNV data with GAIA, specifically for chromosome 8. I wrote a script based on this previous thread: https://www.biostars.org/p/311199/#311746. I get an error when I start the GAIA analysis and would appreciate any guidance on this matter.

...................................................

#load libraries 
library(TCGAbiolinks)
library(gaia)
library(dplyr)
library(tidyr)

# download tcga CNV data 
project <- GDCquery("TCGA-BRCA", data.category = "Copy Number Variation", 
                    data.type = "Masked Copy Number Segment", legacy = FALSE) 
GDCdownload(project)
BRCA <- GDCprepare(project)
write.table(BRCA, file="BRCA.txt") #save file

## Create marker file 
url<- "https://gdc.cancer.gov/files/public/file/snp6.na35.liftoverhg38.txt.zip" 
temp <- tempfile() 
download.file(url = url, temp) 
unzip(temp) 
probes_metadata<- read.table("snp6.na35.liftoverhg38.txt", sep = "\t",as.is = TRUE) 
colnames(probes_metadata) <- probes_metadata[1,] #rename columns 
probes_metadata <- probes_metadata[-1,] 
probes_metadata <- probes_metadata[probes_metadata[,"freqcnv"]==FALSE,] # keep only probes not flagged as lying in frequent-CNV regions
colnames(probes_metadata)[1:4] <- c("Probe_name","Chromosome", "Start", "Strand")
head(probes_metadata)
unique(probes_metadata$Chromosome) 
probes_metadata[which(probes_metadata$Chromosome=="X"),"Chromosome"] <- 23 # recode chromosomes X and Y as 23 and 24
probes_metadata[which(probes_metadata$Chromosome=="Y"),"Chromosome"] <- 24
probes_metadata$Chromosome <- sapply(probes_metadata$Chromosome, as.integer)
markerID <- apply(probes_metadata, 1, function(x) paste0(x[2], ":", x[3]))
markersMatrix <- probes_metadata[!duplicated(markerID),] # logical index; -which() would return zero rows if there were no duplicates
markers_obj <- load_markers(markersMatrix) # marker file 

## convert segment means into 1/0 
synthCNV_Matrix <- cbind(BRCA,Label=NA) 
synthCNV_Matrix[synthCNV_Matrix[,"Segment_Mean"] < -0.2,"Label"] <- 0 
synthCNV_Matrix[synthCNV_Matrix[,"Segment_Mean"] > 0.2,"Label"] <- 1 
synthCNV_Matrix <- synthCNV_Matrix[!is.na(synthCNV_Matrix$Label),]

#rearrange columns 
synthCNV_Matrix<- synthCNV_Matrix[,c(7,2,3,4,5,8)] 
colnames(synthCNV_Matrix)<- c("Sample.Name", "Chromosome", "Start", "End", "Num.of.Markers", "Aberration")

#Replace x and y chromosome names 
xidx <- which(synthCNV_Matrix$Chromosome=="X") 
yidx <- which(synthCNV_Matrix$Chromosome=="Y") 
synthCNV_Matrix[xidx,"Chromosome"] <- 23 
synthCNV_Matrix[yidx,"Chromosome"] <- 24 
synthCNV_Matrix$Chromosome <- sapply(synthCNV_Matrix$Chromosome,as.integer)

#number of unique samples
n <- length(unique(synthCNV_Matrix$Sample.Name))

#CNV analysis with gaia
cnv_obj<- load_cnv(synthCNV_Matrix, markers_obj, n)
results.er <- runGAIA(synthCNV_Matrix, markers_obj, output_file_name="Tumor.all.txt", aberrations=-1, chromosomes=-1, num_iterations=10, threshold=0.15)

....................................................

The following is the error message that I get:

cnv_obj<- load_cnv(synthCNV_Matrix, markers_obj, n)
Loading Copy Number Data
.Error in start_index:end_index : argument of length 0

I would appreciate any help on this matter.

CNV TCGA GAIA R • 2.3k views

Hi, can you please post the error messages? Thank you in advance.


Hi, I have updated my post to include the error message. Thank you.


Hi again. I had originally posted an answer, but the script still seems to return an error at a later point, again in load_cnv(). Let me debug it.

Note that your final line should be:

results.er <- runGAIA(cnv_obj,
  markers_obj,
  output_file_name = 'Tumor.all.txt',
  aberrations = -1,
  chromosomes = -1,
  num_iterations = 10,
  threshold = 0.15)
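
On the error itself: in plain R, a message of this form is what the `:` operator produces when one of its operands has length zero, for example when `which()` finds no match. The snippet below is only an illustration with made-up vectors (`marker_starts` and `segment_start` are hypothetical names); it is not gaia's code, and whether load_cnv() fails for exactly this reason is an assumption.

```r
# Toy marker positions and a segment start that matches none of them
# (hypothetical values, for illustration only).
marker_starts <- c(100L, 250L, 400L)
segment_start <- 300L

# which() returns integer(0) when nothing matches.
start_index <- which(marker_starts == segment_start)
end_index <- 3L
print(length(start_index))

# Using a zero-length value with `:` raises an error; capture its message.
err <- tryCatch(start_index:end_index,
                error = function(e) conditionMessage(e))
print(err)
```

If this is what happens inside load_cnv(), then at least one segment boundary in the CNV matrix has no counterpart in the marker object.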

The problem likely relates to the fact that your markers matrix is [apparently] based on the SNP 6.0 array, which will not necessarily cover the data that is being retrieved via TCGAbiolinks. It's difficult to diagnose what is happening.
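
One way to test this idea is to check whether every segment boundary coincides with a marker position on the same chromosome. The sketch below uses small made-up data frames with the same column names as the script above. `unmatched_segments()` is a hypothetical helper, not part of gaia or TCGAbiolinks, and it assumes that load_cnv() needs each segment Start and End to match a marker position exactly.

```r
# Hypothetical helper: return the segments whose Start or End does not
# coincide with any marker position on the same chromosome.
unmatched_segments <- function(segments, markers) {
  # Build "chromosome:position" keys so matching is done per chromosome.
  marker_keys <- paste(markers$Chromosome, markers$Start, sep = ":")
  start_ok <- paste(segments$Chromosome, segments$Start, sep = ":") %in% marker_keys
  end_ok <- paste(segments$Chromosome, segments$End, sep = ":") %in% marker_keys
  segments[!(start_ok & end_ok), , drop = FALSE]
}

# Toy data (made up for illustration).
markers <- data.frame(
  Probe_name = c("p1", "p2", "p3", "p4"),
  Chromosome = c(8L, 8L, 8L, 9L),
  Start = c(100L, 250L, 400L, 100L)
)
segments <- data.frame(
  Sample.Name = c("s1", "s1", "s2"),
  Chromosome = c(8L, 8L, 9L),
  Start = c(100L, 250L, 100L),
  End = c(250L, 390L, 100L)
)

# Segments with at least one boundary missing from the markers.
bad <- unmatched_segments(segments, markers)
print(bad)
```

With the objects from the script, the call would be along the lines of `unmatched_segments(synthCNV_Matrix, markersMatrix)`. Coerce the positions to integer first, so that large values are not pasted in scientific notation. An empty result would suggest that the cause lies elsewhere.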

Note that if you want the BRCA CNV data, this can be retrieved in the same way as per my other thread, and then the original workflow that I wrote can also be followed.


There is another potential solution mentioned here: https://support.bioconductor.org/p/111990/#9135686
