I have obtained exome sequencing data from 10 normal-tumor samples from patients suffering from pancreatic cancer (PDAC). I am interested in finding the distribution of mutations (single base substitutions specifically) in different parts of the genome (both coding and non-coding). Is there a computational approach to do that?
This is called the topography of mutations. There are not many papers on this aspect of mutations. One way is to follow this paper. You will get more insight if you run a mutational signature analysis first and then look at how those signatures are distributed across the genome.
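As a rough illustration of the first step of a signature analysis, single base substitutions are usually collapsed onto the pyrimidine strand, giving six classes (C>A, C>G, C>T, T>A, T>C, T>G). The sketch below only does that collapsing and counting; the function names are made up for illustration, and the input is assumed to be a list of (ref, alt) pairs already extracted from your somatic calls. A full signature analysis would also use the flanking bases and a dedicated tool.

```python
from collections import Counter

# Watson-Crick complements, used to flip purine-reference calls
# onto the pyrimidine strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sbs_class(ref, alt):
    """Return the pyrimidine-based class of one substitution, e.g. 'C>T'.

    Hypothetical helper: ref and alt are single-base strings.
    """
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single base substitution: {ref}>{alt}")
    if ref in ("A", "G"):
        # Purine reference: report the change as seen on the other strand.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def sbs_spectrum(variants):
    """Count the six substitution classes over (ref, alt) pairs."""
    return Counter(sbs_class(ref, alt) for ref, alt in variants)
```

For example, `sbs_spectrum([("G", "A"), ("C", "T")])` counts both calls as C>T, because G>A on one strand is C>T on the other.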
You can download epigenomic data for a matching cell type from ENCODE and build genomic features yourself, such as open versus closed chromatin (or use the ENCODE chromHMM results). Then check whether mutations are skewed toward any of these genomic regions, or whether mutational processes have any effect on genome structure (or vice versa; the two are interdependent in cancer).
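A minimal sketch of the skew check, under several assumptions: the features are given as non-overlapping BED-style intervals (0-based, half-open) with a label such as "open" or "closed", mutation positions are 0-based (VCF POS minus 1), and the expected count is simply proportional to the number of bases each label covers. All names here are hypothetical. For exome data the intervals should be restricted to the captured regions, and a real analysis would also need to account for coverage and sequence context.

```python
from bisect import bisect_right
from collections import Counter


def build_index(intervals):
    """Group (chrom, start, end, label) intervals by chromosome, sorted.

    Assumes intervals on the same chromosome do not overlap.
    """
    index = {}
    for chrom, start, end, label in intervals:
        index.setdefault(chrom, []).append((start, end, label))
    for chrom in index:
        index[chrom].sort()
    return index


def label_for(index, chrom, pos):
    """Return the label of the interval containing pos, or None."""
    ivs = index.get(chrom, [])
    # Find the last interval whose start is <= pos.
    i = bisect_right(ivs, (pos, float("inf"), "")) - 1
    if i >= 0 and ivs[i][0] <= pos < ivs[i][1]:
        return ivs[i][2]
    return None


def enrichment(index, mutations):
    """Compare observed mutation counts per label with a length-based expectation.

    Returns {label: (observed, expected, observed / expected)}.
    Mutations outside every interval are ignored.
    """
    observed = Counter()
    for chrom, pos in mutations:
        label = label_for(index, chrom, pos)
        if label is not None:
            observed[label] += 1

    # Bases covered by each label across all chromosomes.
    covered = Counter()
    for ivs in index.values():
        for start, end, label in ivs:
            covered[label] += end - start

    n = sum(observed.values())
    total = sum(covered.values())
    result = {}
    for label, bases in covered.items():
        expected = n * bases / total
        ratio = observed[label] / expected if expected > 0 else float("nan")
        result[label] = (observed[label], expected, ratio)
    return result
```

A ratio well above 1 for a label suggests mutations are over-represented there relative to its size, but with only 10 exomes the counts per feature may be small, so treat the ratios as exploratory rather than as a test.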
PS: Since you say this is whole-exome data, I doubt there will be many non-coding mutations unless there is some off-target capture.