Question: TCGA genotype data
3.6 years ago, kelly.wang135 (Korea, Republic Of) wrote:

Hi all,

I am interested in analyzing TCGA data and have been approved through dbGaP for the data access.

My question: I want to use TCGA germline genotype data (many published papers used the Affymetrix SNP 6.0 array data), but array-based genotype data do not seem to be provided. Where can I find the TCGA genotype data?

I have seen several similar questions, but couldn't find the right answer for me.

Thanks for your help!

3.6 years ago, i.sudbery (Sheffield, UK) wrote:

All TCGA data are now accessed through the Genomic Data Commons (GDC) data portal.

If you select data and then select the "Files" tab, you can choose "Genotyping Array" as the data type. However, in the main data portal it appears that only copy number variation calls are available from the arrays.

If instead you select "legacy archive" from the portal front page, you can select "Files">"Genotyping Array">"Simple Nucleotide Variation".
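The same selection can also be expressed programmatically rather than by clicking through the portal. The following is only a minimal sketch: it assumes the public GDC files endpoint at `https://api.gdc.cancer.gov/files` and the field names `cases.project.program.name`, `experimental_strategy`, and `data_category`, none of which are stated in this thread, and the helper names are hypothetical. It only builds the query parameters and does not contact any server.

```python
import json

# Assumption: public GDC files endpoint (not given in the thread).
GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_genotyping_array_filter(program="TCGA", data_category=None):
    """Build a GDC-style filter selecting genotyping-array files.

    The field names are assumptions about the GDC data model.
    """
    clauses = [
        {"op": "in",
         "content": {"field": "cases.project.program.name", "value": [program]}},
        {"op": "in",
         "content": {"field": "experimental_strategy", "value": ["Genotyping Array"]}},
    ]
    if data_category is not None:
        # Optionally narrow to one data category, mirroring the portal facet.
        clauses.append(
            {"op": "in",
             "content": {"field": "data_category", "value": [data_category]}}
        )
    return {"op": "and", "content": clauses}


def build_query_params(filters, size=10):
    """Serialize the filter into query parameters for an HTTP GET request."""
    return {
        "filters": json.dumps(filters),
        "fields": "file_id,file_name,data_category,data_type",
        "format": "JSON",
        "size": str(size),
    }


params = build_query_params(
    build_genotyping_array_filter(data_category="Simple Nucleotide Variation")
)
```

The resulting `params` dictionary could be passed to any HTTP client as the query string of a GET request to the endpoint. Listing file metadata this way is separate from downloading controlled-access files, which additionally requires an authentication token tied to the dbGaP approval.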

Is there a reason why you want to use the array data? All the samples have been subjected to whole exome sequencing, which should give better-quality data.


Thanks, so they were moved to the archive. Actually, it would be better to use sequencing data from blood samples in VCF format. But when I chose "SNV" for Data Category, it returned only "somatic mutations" files, so I thought I should use the array-based genotype files!

But when I downloaded these "Annotated Somatic Mutation" files in VCF format, I saw that they contain genotype columns for both normal and tumor samples. So I guess I could use these files to get germline genotypes.
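Reading the normal-sample genotype column out of such a file might look roughly like the sketch below. It assumes an uncompressed, tab-separated VCF whose sample columns are named `NORMAL` and `TUMOR`; the actual column names in a given file may differ, the function name is hypothetical, and the example records are invented for illustration. It can only report sites that are actually listed in the file, so a file of somatic calls will not yield a complete set of germline genotypes.

```python
def normal_genotypes(vcf_lines, normal_sample="NORMAL"):
    """Yield (chrom, pos, ref, alt, genotype) for the normal sample column.

    Assumes tab-separated VCF text with a #CHROM header line that names
    the sample columns; `normal_sample` is an assumed column name.
    """
    sample_idx = None
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Locate the normal sample's column (ValueError if it is absent).
            sample_idx = fields.index(normal_sample)
            continue
        if sample_idx is None:
            raise ValueError("no #CHROM header line before records")
        # Column 9 (index 8) is FORMAT; pair its keys with the sample values.
        format_keys = fields[8].split(":")
        sample_values = fields[sample_idx].split(":")
        gt = dict(zip(format_keys, sample_values)).get("GT", ".")
        yield fields[0], int(fields[1]), fields[3], fields[4], gt


# Invented records, for illustration only.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNORMAL\tTUMOR",
    "1\t12345\t.\tA\tG\t.\tPASS\t.\tGT:DP\t0/0:30\t0/1:42",
    "2\t67890\t.\tC\tT\t.\tPASS\t.\tGT:DP\t0/1:25\t0/1:40",
]

for record in normal_genotypes(example):
    print(record)
```

For a real file, the lines could come from an open file handle instead of the `example` list.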

I do appreciate your answer again. It really helped!

written 3.6 years ago by kelly.wang135

Dear kelly.wang135, sorry for the late request for clarification: did you indeed find germline mutations in the VCF files from the harmonized portal? As far as I understand, all germline mutations have already been filtered out of those VCF files, no? If I have overlooked something, please let me know.

written 2.0 years ago by susibing20