Question: How can I obtain SNP from TCGA?
5.9 years ago by
United States
purmod (10) wrote:

I have a list of SNPs associated with breast cancer (BRCA).

For each patient, I'm trying to count the number of risk alleles carried at each SNP.

What kinds of data do I have to download?

Here is the list of data types supplied by TCGA (please click on "Expand All" to see all of the data types):


snp brca risky allele • 6.7k views
modified 5.9 years ago by akojic (0) • written 5.9 years ago by purmod (10)

Do you want somatic variants, or germline?  If germline, you will need to put in a data access request through dbGaP since the germline data are restricted access.

written 5.9 years ago by Sean Davis (26k)

Sean, if not germline, then what type of data would answer the question? My understanding was that the SNP data are actually copy number data. Thanks!

written 5.9 years ago by juliacsd (0)

These are the possibilities, I think: 1) Germline SNVs derived from WGS/WXS, 2) somatic SNVs derived from WGS/WXS, 3) SNP genotypes from arrays, and 4) copy number from SNP arrays.  Choices 1 and 3 will require a dbGaP access request.

written 5.9 years ago by Sean Davis (26k)

Thanks, Sean! What I'm trying to do is as follows.

For example, rs11249433 is known to be associated with breast cancer, and G is known to be the risk allele for rs11249433. So what I have to do is find each patient's genotype at rs11249433 and count the number of Gs. In this case, I think I need 3) from the 4 possibilities above. Right? I've already been approved for dbGaP access. Could you explain how to get 3) in more detail, please? I'm digging through the CGHub website, but it's a bit challenging for me since I'm not familiar with this field.
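The counting step itself can be sketched in a few lines of Python. This is only a minimal illustration, not a TCGA-specific parser: it assumes the genotype calls have already been extracted into plain two-letter strings (such as "AG") reported on the same strand as the risk allele, and the patient IDs, genotypes, and function names below are made up for the example.

```python
from typing import Dict, Optional


def count_risk_alleles(genotype: str, risk_allele: str) -> Optional[int]:
    """Return how many copies (0, 1, or 2) of risk_allele a diploid call carries.

    Returns None for a no-call, i.e. anything that is not exactly two
    A/C/G/T bases. Assumes the call and the risk allele use the same strand.
    """
    calls = genotype.replace("/", "").replace("|", "").upper()
    if len(calls) != 2 or any(base not in "ACGT" for base in calls):
        return None
    return calls.count(risk_allele.upper())


def risk_allele_table(
    genotypes: Dict[str, Dict[str, str]],
    risk_alleles: Dict[str, str],
) -> Dict[str, Dict[str, Optional[int]]]:
    """Count risk alleles for every patient at every SNP in risk_alleles."""
    table: Dict[str, Dict[str, Optional[int]]] = {}
    for patient, calls in genotypes.items():
        table[patient] = {
            # A SNP missing for this patient is treated as a no-call.
            rsid: count_risk_alleles(calls.get(rsid, ""), allele)
            for rsid, allele in risk_alleles.items()
        }
    return table


# Hypothetical input: the risk allele comes from the post above,
# the patients and their genotypes are invented for illustration.
risk_alleles = {"rs11249433": "G"}
genotypes = {
    "patient_1": {"rs11249433": "AG"},
    "patient_2": {"rs11249433": "GG"},
    "patient_3": {"rs11249433": "--"},  # no-call
}

example_table = risk_allele_table(genotypes, risk_alleles)
for patient, counts in example_table.items():
    print(patient, counts)
```

No-calls are kept as None rather than 0 so that a missing genotype is not mistaken for a patient carrying no risk alleles.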

Thank you again!

written 5.9 years ago by purmod (10)
5.9 years ago by
akojic (0) wrote:

I think I have a similar problem.

I would also like to know how I can access the SNP genotyping data for germline variants, preferably non-coding variants. Which data type and data level are we talking about?

I applied for and have been granted access to the controlled-access datasets.


written 5.9 years ago by akojic (0)

I think the original poster included a link to the descriptions of data types.  That is where to look.  

As an aside, it is best to simply leave a comment (or simply ask another question directly) rather than ask another question in the same thread.

written 5.9 years ago by Sean Davis (26k)

I looked there many times before I came here to ask for help, which suggests that you are also not familiar with this particular dataset, just with the table of datasets and data levels. But thanks anyway.

written 5.9 years ago by akojic (0)

The data levels for SNP arrays are listed under "Copy Number", and germline sequencing is listed under "DNA Sequencing". Sequence data are available via CGHub, and SNP array data are available via the TCGA data portal. Both require that you have obtained controlled-access privileges.

written 5.9 years ago by Sean Davis (26k)