TCGA folder name
8.0 years ago
imagineyd ▴ 70

Hello everyone,

I have a question about how TCGA folder names are structured.

I've recently been working with genome-wide SNP datasets from TCGA, but I can't figure out the naming rule for the folders.

For example, there are two different archives for ovarian cancer (OV):

broad.mit.edu_OV.Genome_Wide_SNP_6.Level_2.21.2002.0.tar.gz    2012-08-09 14:00  4.1G


I'd like to know:

1. Why is the more recent version smaller than the older one? (Are they mutually exclusive?)
2. What do the numbers 21.2002 (or 22.1002) mean?
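
To make it clearer which parts of the name I'm asking about, here is a minimal sketch that splits the example name into pieces. The field labels (`center`, `disease`, `platform`, `level`, `num1` to `num3`) are my own guesses, not official TCGA terminology, and the pattern is only checked against this one file name.

```python
import re

# Hypothetical pattern: the group names are guesses made for illustration,
# not the official TCGA field names.
ARCHIVE_RE = re.compile(
    r"^(?P<center>.+?)_(?P<disease>[A-Z]+)\."      # e.g. broad.mit.edu_OV.
    r"(?P<platform>.+?)\."                         # e.g. Genome_Wide_SNP_6.
    r"(?P<level>Level_\d+)\."                      # e.g. Level_2.
    r"(?P<num1>\d+)\.(?P<num2>\d+)\.(?P<num3>\d+)"  # the three numbers in question
    r"\.tar\.gz$"
)


def split_archive_name(name):
    """Split an archive file name into labeled parts (all values are strings)."""
    match = ARCHIVE_RE.match(name)
    if match is None:
        raise ValueError("name does not fit the assumed pattern: " + name)
    return match.groupdict()


fields = split_archive_name(
    "broad.mit.edu_OV.Genome_Wide_SNP_6.Level_2.21.2002.0.tar.gz"
)
for label, value in fields.items():
    print(label, "=", value)
```

For the example above, `num1` is 21, `num2` is 2002, and `num3` is 0. Those three numbers are the part I can't interpret.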

Also, if I want to analyze this kind of dataset, should I download all of the different versions, only the most recent one, or the largest one?

Many thanks :)

