I am currently working on an algorithm to distinguish driver from passenger mutations. I have a set of new genes that serve as a validation set. After I have run the algorithm on these genes, I will have a set of predicted driver and passenger mutations. My question is: exactly when can I call a gene a driver, based on the number of driver and passenger mutations I get? Intuition tells me I can call a gene a driver even if it has a single driver mutation. Am I correct? Or is there a cutoff?
You won't find any single answer to this. A driver gene, by definition, 'drives' / promotes tumour growth and / or progression. The term 'driver' is used loosely, though. A driver gene may have just a single point mutation, or may have no mutation at all and be regulated through epigenetic means (or by a mutation in an intergenic regulatory region). So, going by the number of mutations is not the way to approach driver gene identification.
Certain genes accumulate somatic mutations at higher frequencies than other genes, but these may have no role in tumour growth / progression.
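To make that point concrete, here is a minimal sketch of why a fixed mutation-count cutoff is misleading: the same count can be unremarkable in one gene and a large excess in another once gene length and background mutation rate are taken into account. Everything here is an assumption for illustration: the gene names, lengths, rates, the cohort size, and the choice of a simple Poisson model are all hypothetical, and real background models are considerably more involved. Even a significant excess only says the gene is mutated more often than expected, not that it has a functional role.

```python
import math

def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1)."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)  # clamp tiny negative values from rounding

def excess_mutation_pvalue(observed, gene_length_bp, n_samples, background_rate):
    """p-value for seeing at least `observed` mutations by chance alone.

    Assumes (hypothetically) a uniform per-base, per-sample background rate.
    """
    expected = gene_length_bp * n_samples * background_rate
    return poisson_sf(observed, expected)

# Hypothetical genes: identical mutation counts, very different lengths.
genes = {
    "GENE_A": {"observed": 12, "length": 100_000, "rate": 1e-6},
    "GENE_B": {"observed": 12, "length": 2_000, "rate": 1e-6},
}
n_samples = 100  # hypothetical cohort size

pvalues = {
    name: excess_mutation_pvalue(g["observed"], g["length"], n_samples, g["rate"])
    for name, g in genes.items()
}

for name, p in pvalues.items():
    print(f"{name}: p = {p:.3g}")
```

With these made-up numbers, the long gene's 12 mutations are close to what the background alone would produce, while the same 12 mutations in the short gene are a large excess. A raw count cutoff would treat both genes identically.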
For your algorithm, you will therefore have to include other features, such as epigenetic and other regulatory marks. Look up CADD (Combined Annotation Dependent Depletion) to get a few ideas of what you could do.