I have a number of exome (and some WGS) sequences of tumors with no matched blood sequence data. To call somatic mutations, my approach is to compare each tumor sequence to the human reference genome, filter out variants at known SNP sites, and assume the rest are somatic mutations. Not perfect, but reasonable.
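To make the filtering step concrete, here is a minimal sketch of what I mean, in plain Python. It assumes the tumor calls and the known SNP sites are both available as VCF-style tab-separated lines, and it matches on chromosome, position, ref and alt. The names `Variant`, `parse_vcf_lines` and `candidate_somatic` and the example records are made up for illustration; a real pipeline would more likely use an existing VCF tool for this.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int   # 1-based position, as in VCF
    ref: str
    alt: str


def parse_vcf_lines(lines: Iterable[str]) -> Iterator[Variant]:
    """Yield one Variant per ALT allele from VCF-style lines (hypothetical helper)."""
    for line in lines:
        # Skip blank lines and header lines ("##..." and "#CHROM...").
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
        # Multi-allelic records list ALT alleles separated by commas.
        for alt in alts.split(","):
            yield Variant(chrom, int(pos), ref, alt)


def candidate_somatic(tumor_calls: Iterable[Variant],
                      known_sites: Iterable[Variant]) -> List[Variant]:
    """Keep tumor variants that do not exactly match a known SNP site."""
    known = set(known_sites)
    return [v for v in tumor_calls if v not in known]


# Made-up example records, not real data.
tumor_lines = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT",
    "chr1\t100\t.\tA\tG",
    "chr1\t200\t.\tC\tT",
    "chr2\t300\t.\tG\tA,C",
]
known_lines = [
    "chr1\t100\tknown_snp_1\tA\tG",
    "chr2\t300\tknown_snp_2\tG\tA",
]

remaining = candidate_somatic(parse_vcf_lines(tumor_lines),
                              parse_vcf_lines(known_lines))
for v in remaining:
    print(f"{v.chrom}:{v.pos} {v.ref}>{v.alt}")
```

Matching on the exact allele rather than on position alone is a choice in this sketch: a tumor variant at a known SNP position but with a different alt allele is kept as a candidate.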
The problem I'm having is finding a control .bam file to use with mutation callers like MuTect, SomaticSniper, etc. Basically, I need a .bam file corresponding to the human reference genome (mostly assembly hg19 for the tumor data) to compare to the tumor .bams, but I don't know how to create one from the reference FASTA, since it contains no reads or read coordinates. Is there a straightforward way to get an input .bam file that represents the consensus sequence of hg19?