Last summer, TCGA data moved from the TCGA Data Portal to the Genomic Data Commons (GDC). However, some of the data appears to be missing. At first I thought it was a temporary glitch, but the situation hasn't changed since.
For example, GDAC Firebrowse lists 194 TCGA-LAML 450K methylation samples (2016-01-28 batch), but the GDC has only 140. The GDC Legacy Archive, which hosts the legacy data, also contains only 140 samples.
I've also seen similar discrepancies with RNA-seq data.
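To see which samples differ rather than just how many, the two barcode lists can be compared at the sample level. Below is a minimal Python sketch, assuming the barcodes have already been exported from each portal into plain lists. The function names `sample_id` and `compare_sample_sets` and the example barcodes are placeholders made up for illustration, not actual portal output.

```python
from typing import Dict, Iterable, List


def sample_id(barcode: str) -> str:
    """Reduce a TCGA barcode to its first four fields.

    Example: TCGA-AB-2802-03A-01D-0741-05 becomes TCGA-AB-2802-03A
    (project, tissue source site, participant, sample type plus vial).
    """
    return "-".join(barcode.strip().upper().split("-")[:4])


def compare_sample_sets(
    source_a: Iterable[str], source_b: Iterable[str]
) -> Dict[str, List[str]]:
    """Return the sample IDs unique to each source and those shared by both."""
    a = {sample_id(b) for b in source_a}
    b = {sample_id(b) for b in source_b}
    return {
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
        "shared": sorted(a & b),
    }


if __name__ == "__main__":
    # Placeholder barcodes for illustration only; replace them with the
    # lists exported from Firebrowse and from the GDC.
    firebrowse_barcodes = [
        "TCGA-AB-2802-03A-01D-0741-05",
        "TCGA-AB-2803-03A-01D-0741-05",
        "TCGA-AB-2804-03A-01D-0741-05",
    ]
    gdc_barcodes = [
        "TCGA-AB-2802-03A-01D-0741-05",
        "TCGA-AB-2804-03A-01D-0741-05",
    ]

    diff = compare_sample_sets(firebrowse_barcodes, gdc_barcodes)
    print("Only in Firebrowse:", diff["only_in_a"])
    print("Only in GDC:", diff["only_in_b"])
    print("Shared:", len(diff["shared"]))
```

Truncating to the sample level means aliquot-level differences (portion, analyte, plate, center) are ignored, so a sample counts as shared even if the two portals hold different aliquots of it.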
How can this be? This is a relatively old dataset, so it should be stable. If there had been a problem with these samples, I would expect them to have been filtered out years ago, and I wouldn't expect so many to be affected.