I have been using TCGA data (somatic mutation data in particular) via the TCGA data portal for a couple of months, and I would like to clear up some assumptions I have been making about the data. Specifically, I would like to know if there is any documentation from TCGA that explains the following:
1. What mutation calling techniques were used at each sequencing center?
2. What is the difference between "mutation calling", "automated mutation calling", and "curated mutation calling"?
3. Are there any standard practices for combining data from multiple sequencing centers? For example, is it recommended that only curated data sets be used because they will have the fewest false positives? Should data from "automated mutation calling" be avoided?