Hi,
I'm a data scientist trying to perform predictive modeling on clinical trial patients based on TMB. However, the data provided were raw FASTQ files. We have succeeded in generating BAM alignment files. How do I measure non-synonymous mutations per Mb from these files? Is there an R package I can use for this?
You will have to do variant calling first (search for variant callers, there are plenty of them), followed by variant annotation. Tools like VEP, ANNOVAR, and SnpEff come to mind. Then filter for non-synonymous variants and divide that count by the size of the sequenced region in Mb. No, there is no all-in-one tool in R that does this directly from BAM files.
Prior thread that may be of interest:
How to calculate Tumor Mutation Burden (TMB) for TCGA samples
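Once the variants are called and annotated, the last step (filter for non-synonymous, then normalize per Mb) is simple arithmetic. Here is a minimal Python sketch of that step only. It assumes you have already pulled one consequence string per variant out of your annotated file, using Sequence Ontology terms as VEP reports them, with multiple terms joined by `&`. The set of terms counted as non-synonymous, the function names, and the 30 Mb callable region size are all assumptions for illustration; adjust them to your own definition of TMB and your capture kit.

```python
# Consequence terms treated as non-synonymous here.
# This set is an assumption; TMB definitions differ between studies.
NONSYNONYMOUS_TERMS = {
    "missense_variant",
    "stop_gained",
    "stop_lost",
    "start_lost",
    "frameshift_variant",
    "inframe_insertion",
    "inframe_deletion",
}


def is_nonsynonymous(consequence):
    """Return True if any '&'-separated term is in NONSYNONYMOUS_TERMS."""
    return any(term in NONSYNONYMOUS_TERMS for term in consequence.split("&"))


def tumor_mutation_burden(consequences, callable_bases):
    """Non-synonymous variants per megabase of callable sequence."""
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    n_nonsyn = sum(1 for c in consequences if is_nonsynonymous(c))
    return n_nonsyn / (callable_bases / 1_000_000)


# Hypothetical input: one consequence string per variant.
example = [
    "missense_variant",
    "synonymous_variant",
    "stop_gained",
    "intron_variant",
    "missense_variant&splice_region_variant",
]

# 30 Mb is a placeholder; use the size of the region you could actually call.
tmb = tumor_mutation_burden(example, callable_bases=30_000_000)
print(f"TMB = {tmb:.2f} mutations/Mb")  # prints "TMB = 0.10 mutations/Mb"
```

The denominator matters as much as the count: use the number of bases that were actually covered well enough to call variants, not the whole genome size, otherwise the values will not be comparable between samples.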