Question: About TCGA CNV data preprocessing
1
gravatar for Eric Wang
21 months ago by
Eric Wang30
Harbin
Eric Wang30 wrote:

Hello all, I am looking at the Level 3 CNV files on TCGA. I have a few questions:

I download copy number variance data from TCGA database and mapped genomic regions to gene symbols using this method(https://www.biostars.org/p/311199/#311746). Now i get a matrix that its rows are genes and its columns are samples. If I want to use this data to cluster samples, how do I pre process the data? (p.s. for probes mapped to the same gene, I averaged their Segment_mean values, right?)

Thanks for any help,

cnv snp • 2.1k views
ADD COMMENTlink modified 21 months ago by Kevin Blighe61k • written 21 months ago by Eric Wang30
4
gravatar for Kevin Blighe
21 months ago by
Kevin Blighe61k
University College London
Kevin Blighe61k wrote:

Hey,

You are referring to a post that I made. From where did you obtain the original data? - Broad Firebrowse (somatic copy number alterations) or just downloaded the original files from GDC?

If you followed the data processing exactly as follows:

  • Part I - download segmented sCNA data for any TCGA cohort from Broad Institute's FireBrowse server and identify recurrent sCNA regions in these with GAIA
  • Part II - plot recurrent sCNA gains and losses from GAIA
  • Part III - annotate the recurrent sCNA regions (this post, just below)
  • Part IV - generate heatmap of recurrent sCNA regions over your cohort

Then, the statistically significant recurrent somatic copy number alterations (sCNA) are held in the *.igv.gistic files. You can extract statistically significant regions from this file and then pull out the original copy number over these on a per sample basis using GenomicRanges - the copy number that you take is indeed the segment mean from the original copy number program that was used (in the case of TCGA data, likely DNAcopy (R)).

If you do that, then you can build a matrix of:

  • statistically significant recurrent sCNAs in a group of patients as rows
  • patients as columns
  • Segment Mean over each region as the values

With that, I generated this and identified clusters of patients based on recurrent sCNA via Partitioning Around Medoids (PAM)::

fgh

[from: https://www.ncbi.nlm.nih.gov/pubmed/29682207]

Of course, you don't have to use that data, exactly, but you really have to know to what your data relates.

Kevin

ADD COMMENTlink modified 3 months ago • written 21 months ago by Kevin Blighe61k

Hi, Thanks for your help! I just downloaded the CMV data from TCGA database and mapped genomic regions to gene symbols using the following code:

mart <- useMart(biomart="ensembl", dataset="hsapiens_gene_ensembl")
genes <- getBM(attributes=c("hgnc_symbol","chromosome_name","start_position","end_position"), mart=mart)
genes <- genes[genes[,1]!="" & genes[,2] %in% c(1:22,"X","Y"),]
colnames(genes) <- c("GeneSymbol","Chr","Start","End")
genes_GR <- makeGRangesFromDataFrame(genes,keep.extra.columns = TRUE)
df = read.csv("BRCA-CNV.csv") # CNV data downloaded from TCGA database.
cnv =  makeGRangesFromDataFrame(df, keep.extra.columns = TRUE)
hits <- findOverlaps(genes_GR, cnv, type="within")
df_ann <- cbind(df[subjectHits(hits),],genes[queryHits(hits),])
df_ann <- unique(df_ann)
write.table(df_ann, file="BRCA-CNV-withGeneSymbol.txt", quote = F, sep = "\t", row.names = F)

Then I constructed the CNV matrix using the following code:

cnv <- as.matrix(read.table("BRCA-CNV-withGeneSymbol.txt"))
cnv <- cnv[,c(7:9)]
samples.tumor <- samples[which(substr(samples, 14, 16) %in% c("01A", "01B", "06A"))]
tumor <- matrix(0, nrow = length(genes), ncol = length(samples.tumor))
colnames(tumor) <- samples.tumor
rownames(tumor) <- genes
for(i in 1:nrow(tumor)){
    for(j in 1:ncol(tumor)){
        gene.tmp <- rownames(tumor)[i]
        sample.tmp <- colnames(tumor)[j]
        cnv.tmp <- cnv[which(cnv[,3] %in% gene.tmp),]
        tumor[i,j] <- mean(as.numeric(cnv.tmp[which(cnv.tmp[,2] %in% sample.tmp),1]), rm.na = T)
    }
}

At last, I got the CNV matrix. Is there any error in this process? The final data is like this:

                                gene1      gene2       gene3       gene4      gene5
TCGA-LD-A9QF-01A-32D-A41E-01 -0.3258360  0.2291041 -0.29244881  0.4456383  0.2990137
TCGA-EW-A1OX-01A-11D-A141-01  5.4529917 -0.6737190 -0.17351193  0.2469372  0.4106143
TCGA-A2-A0YT-01A-11D-A107-01  0.5855575 -1.4167641 -0.09580651 -1.9142909 -0.7032367
TCGA-E2-A14N-01A-31D-A134-01 -0.5166007 -0.4208664  0.64834184  0.1254564  0.3653708
TCGA-BH-A18F-01A-11D-A12A-01 -1.4072964 -0.5098043  0.22809823  0.1885511 -0.1663476
ADD REPLYlink modified 21 months ago • written 21 months ago by Eric Wang30
1

You just took the tumour samples, merged them together, and then summarised copy number per gene? There's nothing wrong with that; however, if you try to publish it, a reviewer would ask if these copy number events are somatic or germline. Without subtracting the normal copy number profiles, you cannot know whether or not these are true somatic events. It depends on what exactly you want to do, of course.

  • Germline copy number event = copy number variation
  • Somatic copy number event = copy number alteration

The last time that I checked, the open access (level 3) TCGA copy number data is output per sample and is available separately for tumour and normal at the Genomic Data Commons (GDC). This is the benefit of just taking the Broad FireBrowse data, i.e., because they have re-processed this level 3 data to produce somatic copy number events, available as files with names like this: 'genome_wide_snp_6-segmented_scna_minus_germline_cnv_hg19' ( http://firebrowse.org/?cohort=BRCA&download_dialog=true )

ADD REPLYlink modified 3 months ago • written 21 months ago by Kevin Blighe61k

I have downloaded the masked CNV data filtered germline variation as input data and get the average of all probes per gene. If I just want to use CNV data for cluster analysis, is this right? Thank you for your sharing. Now I know the difference between variation and alteration. Thanks again!

ADD REPLYlink written 21 months ago by Eric Wang30

Hey, it was not I who up-voted your comment. I am reading your comment for the first time right now. If that is the data that you have, then it seems okay!

ADD REPLYlink written 21 months ago by Kevin Blighe61k

@Kevin Blighe Hello Kevin, if i want to extract the genes from both tumor and normal samples That will make more sense instead or no

Actually, my data have "01A", "10A and "11A" and i also want to built the CNV matrix, I tried @Django code but an error is shown?

ADD REPLYlink written 20 months ago by Chaimaa220

@Kevin Blighe Hello Kevin Could you plz answer my question?

ADD REPLYlink written 20 months ago by Chaimaa220
1

You have not shown the error message, so, how can I advise further?

ADD REPLYlink written 20 months ago by Kevin Blighe61k

@Kevin Blighe Hi, Kevin sorry for that here is the piece of code i tried and the following error

    cnv <- as.matrix(read.table("CNV-withGeneSymbol.txt"))
    cnv <- cnv[,c(7:9)]
    samples <- gsub("TCGA-[A-Z0-9]*-[A-Z0-9]*-", "", (gsub("-[0-9A-Z]*-[0-9A-Z]*-01$", "", cnv[,1])))
    samples.tumor <- samples[which(substr(samples, 14, 16) %in% c("01A", "01B", "06A"))]
    tumor <- matrix(0, nrow = length(genes), ncol = length(samples.tumor))
    colnames(tumor) <- samples.tumor
    rownames(tumor) <- genes
    for(i in 1:nrow(tumor)){
        for(j in 1:ncol(tumor)){
            gene.tmp <- rownames(tumor)[i]
            sample.tmp <- colnames(tumor)[j]
            cnv.tmp <- cnv[which(cnv[,3] %in% gene.tmp),]
            tumor[i,j] <- mean(as.numeric(cnv.tmp[which(cnv.tmp[,2] %in% sample.tmp),1]), rm.na = T)
    }
    }

Error shown here; Error in [<-(*tmp*, i, j, value = mean(as.numeric(cnv.tmp[which(cnv.tmp[, : subscript out of bounds

ADD REPLYlink written 20 months ago by Chaimaa220
1

Hey, thank you! Can you show the first few lines and columns of the object cnv? Just paste them here, highlight them, and then click the 101 010 button

ADD REPLYlink written 20 months ago by Kevin Blighe61k

@Kevin Blighe, I'm working on RGui(64-bit) and the objects are not shown here. I have posted the full code i applied, and i can share the data as well if y want ?

library(GenomicRanges)
> mart <- useMart(biomart="ensembl", dataset="hsapiens_gene_ensembl")
> mart <- useMart(biomart="ENSEMBL_MART_ENSEMBL", host="grch37.ensembl.org", path="/biomart/martservice", dataset="hsapiens_gene_ensembl")
> genes <- getBM(attributes=c("hgnc_symbol","chromosome_name","start_position","end_position"), mart=mart)
> genes <- genes[genes[,1]!="" & genes[,2] %in% c(1:22,"X","Y"),]
> xidx <- which(genes[,2]=="X")
> yidx <- which(genes[,2]=="Y")
> genes[xidx, 2] <- 23
> genes[yidx, 2] <- 24
> genes[,2] <- sapply(genes[,2],as.integer)
> genes <- genes[order(genes[,3]),]
> genes <- genes[order(genes[,2]),]
> colnames(genes) <- c("GeneSymbol","Chr","Start","End")
> df<- read.table("E:/results/Stage1_CNV_data_v2.txt", sep="\t", stringsAsFactors=FALSE, header=TRUE)
> colnames(df) <- c("Barcode", "chr", "start", "end", "extra1", "extra2")
> cnv <-  makeGRangesFromDataFrame(df, keep.extra.columns = TRUE)
> hits <- findOverlaps(genes_GR, cnv, type="within")
> df_ann <- cbind(df[subjectHits(hits),],genes[queryHits(hits),])
> df_ann <- unique(df_ann)
> write.table(df_ann, file="outputfile-GeneSymbol.txt", quote = F, sep = "\t", row.names = F)
> cnv <- as.matrix(read.table("outputfile-GeneSymbol.txt"))
> cnv <- cnv[,c(7:9)]
> samples <- gsub("TCGA-[A-Z0-9]*-[A-Z0-9]*-", "", (gsub("-[0-9A-Z]*-[0-9A-Z]*-01$", "", cnv[,1])))
> samples.tumor <- samples[which(substr(samples, 14, 16) %in% c("01A", "01B", "06A"))]
> tumor <- matrix(0, nrow = length(genes), ncol = length(samples.tumor))
> colnames(tumor) <- samples.tumor
> rownames(tumor) <- genes
> for(i in 1:nrow(tumor)){
+     for(j in 1:ncol(tumor)){
+         gene.tmp <- rownames(tumor)[i]
+         sample.tmp <- colnames(tumor)[j]
+         cnv.tmp <- cnv[which(cnv[,3] %in% gene.tmp),]
+         tumor[i,j] <- mean(as.numeric(cnv.tmp[which(cnv.tmp[,2] %in% sample.tmp),1]), rm.na = T)
+ }
+ }
Error in `[<-`(`*tmp*`, i, j, value = mean(as.numeric(cnv.tmp[which(cnv.tmp[,  : 
  subscript out of bounds
ADD REPLYlink written 20 months ago by Chaimaa220

Here is the data

ADD REPLYlink written 20 months ago by Chaimaa220
1

I am not exactly sure what you are aiming to do with this for loop. However, the way to debug them is, after they crash and produce the error, check the last value of i and j, and then take a look at those indices in tumor and cnv. Then you will be able to debug it.

ADD REPLYlink written 20 months ago by Kevin Blighe61k

@Kevin Blighe hi Kevin, So, i have to change the values of i and j and then run the for loop again or what? I didn't get exactly what do you mean by taking a look and then debug it again?

Actually, i want to construct the CNV matrix (rows represent the samples and columns represent the annotated genes) .

Exactly, i want to integrate this data with others multi-omics datasets such as gene expression and DNA methylation for further analysis. All these datasets have such format except CNV data which i get stuck to reformat it now. That's why i used that for loop to get the matrix and maybe after that i can merge all of these datasets together. Is this for loop right or no to make such matrix?

ADD REPLYlink written 20 months ago by Chaimaa220
1

Yes, but what is the value of i and j after the error is produced?

You may need to do the following:

  1. define a unique list of regions (chr:start:end) across all of your samples
  2. create an empty data-frame with just the unique list of regions as row-names - this data-frame will eventually hold your final output
  3. loop over your input data per sample (i), and output 'segment mean' where there is overlap with each segment (j). Overlap can be determined by GenomicRanges
ADD REPLYlink written 20 months ago by Kevin Blighe61k

@Kevin Blighe Hi Kevin, Thanks for yr quick reply! your helpful suggestions!

The last values of i and j are 1L.

I want my output like a matrix of samples rows and genes in the columns with their corresponding segment values.

I'm more familiar with MATLAB, i used the output file produced from your annotation code and i write a piece of code that can allow allocating this matrix but my code there doesn(t work too. I tried the code in R which i share with you yesterday and also no results!

ADD REPLYlink written 20 months ago by Chaimaa220

@Kevin Blighe Hello. I'm currently working with that same goal (to create a dataframe containing samples in rows and genes in columns). I somehow managed to reproduce the code released in the first comment, though I'm not sure about the variables format. My cnv matrix is made up of ("Samples", "Chromosome", "Start", "End", "Num_probes", "Segment_mean", "GeneSymbol", "Chr", "Start" and "End"). I think I understood how to do the first and second point, though I'm not sure about the third one.

    cnv <- as.matrix(read.table("CNV-withGeneSymbol.txt"))
cnv <- cnv[,c(7:9)]
samples <- gsub("TCGA-[A-Z0-9]*-[A-Z0-9]*-", "", (gsub("-[0-9A-Z]*-[0-9A-Z]*-01$", "", cnv[,1])))
samples.tumor <- samples[which(substr(samples, 14, 16) %in% c("01A", "01B", "06A"))]
tumor <- matrix(0, nrow = length(genes), ncol = length(samples.tumor))
colnames(tumor) <- samples.tumor
rownames(tumor) <- genes
for(i in 1:nrow(tumor)){
    for(j in 1:ncol(tumor)){
        gene.tmp <- rownames(tumor)[i]
        sample.tmp <- colnames(tumor)[j]
        cnv.tmp <- cnv[which(cnv[,3] %in% gene.tmp),]
        tumor[i,j] <- mean(as.numeric(cnv.tmp[which(cnv.tmp[,2] %in% sample.tmp),1]), rm.na = T)
}
}

In this code, could you please specify the third column of cnv (after Samples and Genes) ? I'm not sure on which value you calculate the mean to produce the tumor matrix.

Thank you very much for your help :)

ADD REPLYlink written 8 weeks ago by random_user_260

Hi Juliette, for this, I actually used GenomicRanges, which makes it a lot easier. I can try to 'dig out' the code if you need it.

The steps were:

  1. define a list of unique regions across all samples
  2. determine copy number across each unique region per sample
  3. perform clustering and create heatmap

Are you familiar with GenomicRanges?

ADD REPLYlink written 8 weeks ago by Kevin Blighe61k

@Kevin Blighe I discovered GenomicRanges with your code. I tried to recreate the code, I wrote this

mart <- useMart(biomart="ensembl", dataset="hsapiens_gene_ensembl")
genes <- getBM(attributes=c("hgnc_symbol","chromosome_name","start_position","end_position"), mart=mart)
genes <- genes[genes[,1]!="" & genes[,2] %in% c(1:22,"X","Y"),]
colnames(genes) <- c("GeneSymbol","Chr","Start","End")
genes_GR <- makeGRangesFromDataFrame(genes,keep.extra.columns = TRUE)

df = brca.nocnv # obtained via the function GDCquery and GDCprepare
df[which(df$Chromosome=="X"),"Chromosome"] <- 23
df[which(df$Chromosome=="Y"),"Chromosome"] <- 24
cnv =  makeGRangesFromDataFrame(df, keep.extra.columns = TRUE)

hits <- findOverlaps(genes_GR, cnv, type="within")
df_ann <- cbind(df[subjectHits(hits),],genes[queryHits(hits),])
df_ann <- unique(df_ann)
write.table(df_ann, file="BRCA-CNV-withGeneSymbol.txt", quote = F, sep = "\t", row.names = F)

Then, I was planning to do the for loop, that I copied in my previous comment.

Dropping your code would be very helpful, thank you !

ADD REPLYlink modified 8 weeks ago • written 8 weeks ago by random_user_260

Hey juliette, I took a look and here is what I did (below). So, I use the .igv.gistic file that is ultimately produced for the entire cohort being analysed, and overlap these regions, which are statistically significant, with the original copy number segment mean per sample. The matrix that's produced from this process, in the heat object, ultimately forms the heatmap.

Let me know if you can follow this and adapt it to your own code.

  require(GenomicRanges)
  recCNV.Template.GR <- makeGRangesFromDataFrame(
    read.table("Tumor.txt.igv.gistic", sep="\t",
      header=TRUE), keep.extra.columns=TRUE)

  # Form a new data frame that will eventually hold all regions from the
  # teplate cnvMatrix and the respective CN segment value per sample
  heat <- data.frame(
    pastedata.framerecCNV.Template.GR)$seqnames,
      data.framerecCNV.Template.GR)$start,
      data.framerecCNV.Template.GR)$end, sep = '.'),
    data.framerecCNV.Template.GR))
  colnames(heat)[1] <- 'Key'

  # Read in the individal / original sample CN segment values
  CN.tumor.recCNV <- CN.tumor
  colnames(CN.tumor.recCNV) <- c('Sample','Chr','Start','End',
    'Num_Probes','Segment_Mean')

  # Get a sample listing
    samples <- unique(CN.tumor.recCNV$Sample)

  # Each loop overlaps individual sample CN segment calls with the
  # template cnvMatrix regions. If present, the CN segment value is added.
  mylist <- list()
  for (i in 1:length(samples)) {
    GR <- makeGRangesFromDataFrame(
      CN.tumor.recCNV[CN.tumor.recCNV$Sample==samples[i],],
      keep.extra.columns = TRUE)
    hits <- findOverlapsrecCNV.Template.GR, GR, type = "any")

    mylist[[i]] <- data.frame(
      recCNV.Template.GR[queryHits(hits),],
      GR[subjectHits(hits),])

    df.tmp <- mylist[[i]]
    df.tmp$key <- paste(df.tmp$seqnames, df.tmp$start, df.tmp$end,
      sep = '.')
    df.tmp <- df.tmp[match(heat$Key, df.tmp$key),]
    heat<- cbind(heat, df.tmp$Segment_Mean)
    colnames(heat)[11+i] <- samples[i]
}
heat[is.na(heat)] <- 0
ADD REPLYlink written 8 weeks ago by Kevin Blighe61k

@Kevin Blighe

Thank you very much for your answer. I tried to adapt it to what I already had. I used your pipeline to get RecCNV. Here is the code :

recCNV.Template.GR <- makeGRangesFromDataFrame(RecCNV, keep.extra.columns = TRUE)

heat <- data.framepastedata.framerecCNV.Template.GR)$seqnames,
                         data.framerecCNV.Template.GR)$start,
                         data.framerecCNV.Template.GR)$end, sep = '.'),
                   data.framerecCNV.Template.GR))
colnames(heat)[1] <- 'Key'

CN.tumor.recCNV <- cnvMatrix
colnames(CN.tumor.recCNV) <- c('Sample','Chr','Start','End','Num_Probes','Segment_Mean')

samples <- unique(CN.tumor.recCNV$Sample)

mylist <- list()


for (i in 1:length(samples)) {
  GR <- makeGRangesFromDataFrame(CN.tumor.recCNV[CN.tumor.recCNV$Sample==samples[i],],
                                 keep.extra.columns = TRUE)
  hits <- findOverlapsrecCNV.Template.GR, GR, type = "any")

  mylist[[i]] <- data.framerecCNV.Template.GR[queryHits(hits),],
                            GR[subjectHits(hits),])
  df.tmp <- mylist[[i]]
  df.tmp$key <- paste(df.tmp$seqnames, 
                      df.tmp$start, 
                      df.tmp$end,
                      sep = '.')
  df.tmp <- df.tmp[match(heat$Key, df.tmp$key),]
  heat<- cbind(heat, df.tmp$Segment_Mean)
  colnames(heat)[10+i] <- samples[i]
}
heat[is.na(heat)] <- 0

Then, I obtain heat with the columns (Key, seqnames, start, end, width, strand, aberration, q.value, df.tmp$Segment_Mean).

However, I'm not sure what you want to do with the command colnames(heat)[11+i] <- samples[i] (it seems that in my case, it would be colnames(heat)[10+i] but I'm not sure).

Again, thank you so much for taking the time to help !

ADD REPLYlink modified 7 weeks ago • written 7 weeks ago by random_user_260

Great. That colnames() command is not critical, so, probably best to use your own judgement about how to use it for your own data.

ADD REPLYlink written 7 weeks ago by Kevin Blighe61k

@Kevin Blighe Hi, I just wanted to let you know that I managed to get the heat matrix after all. Concerning that colnames command, I had only tried the loop for specific values of 'i', and therefore did not understand why you wanted to name the columns with samples. Anyway, thank you very much for your code and explanation, you helped a lot !

ADD REPLYlink written 7 weeks ago by random_user_260

That is great that you got it complete!

ADD REPLYlink written 7 weeks ago by Kevin Blighe61k

Hi @Kevin Blighe, Dear Kevin , is there any mistake in this command ? it doesnt work, with me

 heat <- data.frame(
        pastedata.framerecCNV.Template.GR)$seqnames,
          data.framerecCNV.Template.GR)$start,
          data.framerecCNV.Template.GR)$end, sep = '.'),
        data.framerecCNV.Template.GR))
ADD REPLYlink written 4 weeks ago by Chaimaa220
1

Thank you, i fix it

     heat <- data.frame(paste
data.framerecCNV.Template.GR)$seqnames,
    data.framerecCNV.Template.GR)$start,
    data.framerecCNV.Template.GR)$end, sep = '.'),
    data.framerecCNV.Template.GR))
ADD REPLYlink modified 4 weeks ago • written 4 weeks ago by Chaimaa220

@Kevin Blighe Hi Kevin, i have checked as you said but i couldn't fix it This is the first few lines and columns of the object cnv

 1
    GeneSymbol
    Chr
    Start
    2
    ARHGEF16
    1
    3370990
    3
    ARHGEF16
    1
    3370990
    4
    ARHGEF16
    1
    3370990
    5
    ARHGEF16
    1
    3370990
    6
    ARHGEF16
    1
    3370990
    7
    ARHGEF16
    1
    3370990

I already explain to you why i used this for Loop, hope you can help me plz!

ADD REPLYlink written 20 months ago by Chaimaa220

@Django i try your code but an error is shown Error in [<-(*tmp*, i, j, value = mean(as.numeric(cnv.tmp[which(cnv.tmp[, : subscript out of bounds Can you help me to fix it ?

ADD REPLYlink written 20 months ago by Chaimaa220
1

I think the Error code means there is no element in "sample.tmp". And which() will return nothing.

ADD REPLYlink written 20 months ago by Eric Wang30

@Django Itried your code and i think it has some missing line before samples, where is the object samples? could y plz fix it

ADD REPLYlink modified 20 months ago • written 20 months ago by Chaimaa220

Hello @Kevin Blighe can you plz tell me how to open this file "Tumor.All.txt.igv.gistic" and how to plot the heatmap using which package( ComlpexHeatmap package in your paper) (Can you plz provide the code for this step too).

With my respect to you!

ADD REPLYlink written 21 months ago by Chaimaa220
1

Hey bro, you can read back in the file like this:

require(GenomicRanges)
recCNV.Template.GR <- makeGRangesFromDataFrame(read.table("Tumor.All.txt.igv.gistic", sep="\t", header=TRUE), keep.extra.columns=TRUE)

This reads it into a GenomicRanges object.

Do you also have an object called CN.tumor in your R session?

ADD REPLYlink written 21 months ago by Kevin Blighe61k

@Kevin Blighe yes it should be there bcz i just follow your whole process step by step and everything work smoothly but only with FFPE file.

ADD REPLYlink written 21 months ago by Chaimaa220

Hi,is the object CN.tumor just the df? if not, df="Tumor.All.txt.igv.gistic"? thanks!

ADD REPLYlink written 3 months ago by 295353195510

Hi! That object is created here: C: How to extract the list of genes from TCGA CNV data

ADD REPLYlink written 3 months ago by Kevin Blighe61k

@Kevin Blighe , Hi dear Kevin, how you generated these 2 heatmaps please? i was wondering if the code that you used for the genration is availaibe and if you can share with me? i wloud like to apply it on my data, if that possible?

ADD REPLYlink written 15 days ago by Chaimaa220
1

Hey, you should read through this thread: C: About TCGA CNV data preprocessing

ADD REPLYlink written 15 days ago by Kevin Blighe61k

@Kevin Blighe , Hi dear Kevin, thanks for your reply, that thread i already check it and it works well for me, but i mean the heatmap visualization how you make those beautiful clusters, As far i know you have used ComplexHeatmap package. For the Dendrograms you used Euclidean distance and Ward’s linkage. Could you please share these parts if that possibe, i would like to apply it for my data ?

ADD REPLYlink modified 15 days ago • written 15 days ago by Chaimaa220
1

Hey, I have actually recently posted a complete tutorial here: https://github.com/kevinblighe/E-MTAB-6141

ADD REPLYlink written 14 days ago by Kevin Blighe61k
1

@ Kevin Blighe, Hi dear Kevin, Wow! Great job, I'm really proud of you, i will read the tutorial carefully, Thank you very very much!

ADD REPLYlink written 14 days ago by Chaimaa220
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1938 users visited in the last hour