I have a question about how TCGA folder names are structured.
I've recently been working with genome-wide SNP datasets from TCGA, but I can't work out the rule behind the folder names.
For example, there are two different datasets for ovarian (OV):
broad.mit.edu_OV.Genome_Wide_SNP_6.Level_2.21.2002.0.tar.gz 2012-08-09 14:00 4.1G
broad.mit.edu_OV.Genome_Wide_SNP_6.Level_2.22.1002.0.tar.gz 2010-06-21 08:00 7.3G
I want to know:
- why the more recent version is smaller than the older one (are they mutually exclusive?)
- what 21.2002 (or 22.1002) means
And if I want to analyze these kinds of datasets, should I download all the different versions, only the most recent one, or the biggest one?
Many thanks :)