I'm trying to use the maf2vcf.pl script from the vcf2maf tool to convert TCGA MAF files into VCF format, but I have been stuck for days trying to obtain an appropriate reference FASTA file for the --ref flag.
I've tried pointing it to the VEP FASTA files, downloading directly from Ensembl, and getting the file from NCBI and indexing it myself with samtools. I've also tried unzipping, or rezipping with bzip, but I still get the same error, with many lines complaining that certain sequences could not be fetched:
[W::fai_get_val] Reference chr12:52568256-52568258 not found in file, returning empty sequence
[faidx] Failed to fetch sequence in chr12:52568256-52568258
ERROR: Make sure that ref-fasta is the same genome build as your MAF: genome-assemblies/homo_sapiens/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
The command that I'm running is:

perl ../tools/vcf2maf/maf2vcf.pl --input-maf ./03652df4-6090-4f5a-a2ff-ee28a37f9301/TCGA.COAD.mutect.03652df4-6090-4f5a-a2ff-ee28a37f9301.DR-10.0.somatic.maf --ref genome-assemblies/homo_sapiens/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz --output-dir TEST1

The input MAF is here. The reference genome on this page says GRCh38.p0. How can I get an appropriate FASTA reference for the vcf2maf tool?
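One thing that may be worth checking, since the warning says the name chr12 was "not found in file": whether the chromosome names in the MAF match the sequence names in the FASTA index. Ensembl FASTA files name chromosomes without the "chr" prefix (for example "12"), while the error above refers to "chr12". The sketch below is only a diagnostic under that assumption, not part of vcf2maf. The function names are hypothetical, and the tiny inline MAF and .fai contents are made-up stand-ins for the real files (the .fai is the tab-separated index that samtools faidx writes next to the FASTA).

```python
import csv

def fai_contigs(fai_text):
    """Return the sequence names listed in a samtools .fai index (first tab-separated column)."""
    return {line.split("\t")[0] for line in fai_text.splitlines() if line.strip()}

def maf_contigs(maf_text):
    """Return the distinct values of the MAF 'Chromosome' column, skipping '#' comment lines."""
    rows = [l for l in maf_text.splitlines() if l and not l.startswith("#")]
    reader = csv.DictReader(rows, delimiter="\t", quoting=csv.QUOTE_NONE)
    return {row["Chromosome"] for row in reader}

def missing_contigs(maf_text, fai_text):
    """Return MAF chromosome names that have no entry in the FASTA index."""
    return sorted(maf_contigs(maf_text) - fai_contigs(fai_text))

# Made-up example data (assumption): a two-column MAF excerpt and a
# one-line index in the style of an Ensembl FASTA. For real files, read
# their contents with open(path).read() instead.
example_maf = "#version 2.4\nHugo_Symbol\tChromosome\nGENE_A\tchr12\n"
example_fai = "12\t133275309\t1000\t60\t61\n"

print(missing_contigs(example_maf, example_fai))
```

If the real files produce a non-empty list of "chr"-prefixed names here, the naming convention rather than the genome build would be the first thing to rule out.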