Question: TCGA SNP arrays
3.3 years ago, joan.frigola wrote:

I am interested in using the TCGA SNP array data. What I would like to get is a list of the SNPs per sample, with their chromosome, position, and reference and alternate alleles. This would be something like:

sample_id - chromosome - position - reference - alternate

Looking at the TCGA data portal, I have found a series of files called genotype.dat (one per sample) that contain the following information:

Composite_element_ref    chromosome    Physical position    Genotype
SNP_A-8575115            2             533321               AB

*Fake data
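To make the layout concrete, here is a minimal sketch of how I would read one of these files into records. It assumes the columns are whitespace-delimited and that each data row has exactly four fields, which I have not verified against the real files; the function name `parse_genotype_file` and the sample label `sample_1` are made up for illustration.

```python
import io

# First field of the header row, as shown in the example above.
HEADER_FIRST_FIELD = "Composite_element_ref"


def parse_genotype_file(handle, sample_id):
    """Yield one dict per SNP call from a genotype.dat-style file.

    Assumption: fields are separated by whitespace and every data row
    has four fields (probe id, chromosome, position, genotype call).
    """
    for raw_line in handle:
        fields = raw_line.split()
        # Skip blank lines and the header row.
        if not fields or fields[0] == HEADER_FIRST_FIELD:
            continue
        if len(fields) != 4:
            raise ValueError(f"unexpected number of fields: {raw_line!r}")
        probe_id, chromosome, position, genotype = fields
        yield {
            "sample_id": sample_id,
            "probe_id": probe_id,
            "chromosome": chromosome,  # kept as text so X, Y, MT also work
            "position": int(position),
            "genotype": genotype,
        }


# The fake row from above, wrapped in a file-like object.
example = io.StringIO(
    "Composite_element_ref chromosome Physical position Genotype\n"
    "SNP_A-8575115 2 533321 AB\n"
)
records = list(parse_genotype_file(example, "sample_1"))
print(records)
```

This only gets me as far as the genotype call itself; it does not tell me which nucleotides are involved.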

I have assumed that the first column is some kind of ID, the second is the chromosome, and the third is the position. However, I am not sure about the meaning of the fourth column.

The values it can take are AA, AB, BB, or NC. Does this mean homozygous, heterozygous, and not computed? How could I map these SNPs to the actual nucleotides that are being changed (for example C -> T)?
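To show the kind of mapping I have in mind, here is a minimal sketch. It assumes that "A" and "B" stand for two alleles defined per probe and that some annotation table gives the nucleotide for each; I have not confirmed either assumption. The `annotation` dictionary and its C/T values are invented, and the sketch says nothing about which of the two alleles is the reference one.

```python
def call_to_nucleotides(genotype, allele_a, allele_b):
    """Translate an AA/AB/BB call into a pair of nucleotides.

    Returns None for NC. Assumes 'A' and 'B' in the call refer to
    allele_a and allele_b of the probe (an unverified assumption).
    """
    if genotype == "NC":
        return None
    lookup = {"A": allele_a, "B": allele_b}
    if len(genotype) != 2 or any(letter not in lookup for letter in genotype):
        raise ValueError(f"unexpected genotype call: {genotype!r}")
    return tuple(lookup[letter] for letter in genotype)


# Hypothetical annotation: probe id -> (allele A, allele B). Values are fake.
annotation = {"SNP_A-8575115": ("C", "T")}

allele_a, allele_b = annotation["SNP_A-8575115"]
for call in ("AA", "AB", "BB", "NC"):
    print(call, call_to_nucleotides(call, allele_a, allele_b))
```

What I am missing is where to find a real table like `annotation`, and how to tell which allele matches the reference genome.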

Thanks a lot in advance,

Joan