Understanding gene level copy number data from TCGAbiolinks
2.1 years ago
billyK • 0

Hi all. Thanks in advance for helping me out.

I'm trying to analyze copy number data from TCGA (using TCGAbiolinks) and want to identify genes that are either amplified or deleted.

To download the gene-level copy number data, I used the code below:

library(TCGAbiolinks)
query <- GDCquery(project = 'TCGA-BRCA', data.category = 'Copy Number Variation', data.type = 'Gene Level Copy Number', sample.type = 'Primary Tumor')

I have three questions related to the downloaded data.

First, which pipeline is used to calculate the gene-level copy numbers?

Secondly, I've noticed that some patients have gene-level copy numbers that are unexpectedly large. For example, 'TCGA-A8-A093-10A-01D-A012-01' had a copy number of 26 for the gene "ENSG00000085733.16". Is this usual?

Finally, what cutoff on the gene-level copy number should be used to call a gene amplified or deleted?

Thank you so much for your help.

CNV TCGA • 1.0k views

Were you able to get an answer to your questions?


This is a GDC question, not a TCGAbiolinks question. For TCGA, GDC uses ASCAT2 (SNP6 arrays) and ASCATNGS (WGS) to produce integer copy numbers. The gene-level copy number is then obtained by intersecting each gene's region with the segmentation file, with some handling of edge cases. I am pretty sure this is clearly described in the GDC documentation.
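To make the "intersect gene region with segmentation file" idea concrete, here is a minimal base R sketch. It is not GDC's actual code: the toy tables, the column names, the helper name gene_level_cn, and the rule of taking the copy number from the segment with the largest overlap are all assumptions made for illustration, and GDC's real handling of genes that span several segments may differ.

```r
# Toy segmentation table for one sample (hypothetical values).
segments <- data.frame(
  chrom = c("chr1", "chr1", "chr1", "chr2"),
  start = c(1, 1001, 5001, 1),
  end   = c(1000, 5000, 9000, 8000),
  copy_number = c(2, 4, 1, 2),
  stringsAsFactors = FALSE
)

# Toy gene coordinates (hypothetical gene names).
genes <- data.frame(
  gene  = c("geneA", "geneB", "geneC", "geneD"),
  chrom = c("chr1", "chr1", "chr2", "chr3"),
  start = c(200, 4500, 100, 10),
  end   = c(800, 5500, 900, 500),
  stringsAsFactors = FALSE
)

# Hypothetical helper: intersect each gene with the segments on its chromosome.
gene_level_cn <- function(genes, segments) {
  rows <- lapply(seq_len(nrow(genes)), function(i) {
    g <- genes[i, ]
    # Closed-interval overlap test on the same chromosome.
    hit <- segments$chrom == g$chrom &
      segments$start <= g$end & segments$end >= g$start
    seg <- segments[hit, ]
    if (nrow(seg) == 0) {
      # Edge case: no segment covers the gene, so report NA.
      return(data.frame(gene = g$gene, copy_number = NA_real_,
                        min_copy_number = NA_real_,
                        max_copy_number = NA_real_,
                        stringsAsFactors = FALSE))
    }
    # Number of gene bases covered by each overlapping segment.
    overlap <- pmin(seg$end, g$end) - pmax(seg$start, g$start) + 1
    data.frame(gene = g$gene,
               # Assumption: use the segment with the largest overlap.
               copy_number = seg$copy_number[which.max(overlap)],
               min_copy_number = min(seg$copy_number),
               max_copy_number = max(seg$copy_number),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

result <- gene_level_cn(genes, segments)
print(result)
```

In this toy input, geneB straddles a segment boundary, so its minimum and maximum copy numbers differ, while geneD has no covering segment and comes back as NA. Those are the kinds of edge cases mentioned above.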
