Hi. Ideally, I would like to get a single VCF file covering all the exome sequences that TCGA has, across all cancer types. Even more ideally, I would do this for only a certain region of the genome. Is there any way to do this? I have gdc-client downloaded and running on the command line at the moment, but can only seem to find UUIDs for individual cancer types.
TCGA VCF files are not available as open access (they are controlled-access data). Only MAF (Mutation Annotation Format) files are open access, and these can be downloaded from the GDC (Genomic Data Commons) Data Portal.
If a VCF is definitely what you need, you can search online for tools that convert MAF to VCF; one example is the maf2vcf.pl script distributed with vcf2maf.
If you keep everything as MAF, which is essentially a tab-delimited text format, then you can simply use shell commands to merge everything together (taking care to keep the column header line only once). If you instead convert the data to multiple VCFs, then you can use BCFtools (bcftools merge) to merge them.
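If you would rather script the MAF merge than do it with shell one-liners, here is a minimal Python sketch of the same idea. It assumes every input file has the same columns, including the standard MAF columns Chromosome, Start_Position and End_Position, and that comment lines start with "#". The function name merge_mafs and the region tuple are made up for illustration, so adapt them to your own files.

```python
import gzip
from pathlib import Path


def open_text(path):
    """Open a plain or gzip-compressed file in text mode."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "r")


def merge_mafs(maf_paths, out_path, region=None):
    """Concatenate MAF files, writing the column header only once.

    region is an optional (chrom, start, end) tuple, 1-based and inclusive;
    when given, only variants overlapping it are kept.
    Returns the number of variant rows written.
    """
    header = None
    n_written = 0
    with open(out_path, "w") as out:
        for maf in maf_paths:
            with open_text(maf) as handle:
                cols = None
                for line in handle:
                    # Skip comment lines (e.g. "#version ...") and blanks.
                    if line.startswith("#") or not line.strip():
                        continue
                    fields = line.rstrip("\r\n").split("\t")
                    if cols is None:
                        # First non-comment line of each file is its header.
                        cols = fields
                        if header is None:
                            header = cols
                            out.write("\t".join(header) + "\n")
                        elif cols != header:
                            raise ValueError(f"{maf}: columns differ from first file")
                        idx = {name: i for i, name in enumerate(cols)}
                        continue
                    if region is not None:
                        chrom, start, end = region
                        if fields[idx["Chromosome"]] != chrom:
                            continue
                        if (int(fields[idx["End_Position"]]) < start
                                or int(fields[idx["Start_Position"]]) > end):
                            continue
                    out.write("\t".join(fields) + "\n")
                    n_written += 1
    return n_written
```

For example, merge_mafs(["brca.maf.gz", "luad.maf.gz"], "merged.maf", region=("chr17", 7661779, 7687550)) would keep only rows overlapping that interval. The file names and coordinates here are placeholders, and the chromosome label has to match whatever naming your MAFs use (with or without the "chr" prefix).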