I need to compare the derived allele frequency spectrum of the mutations I am studying with that of synonymous SNPs. I have very few mutations, only 14, while there are up to 1 million genome-wide synonymous SNPs. These two sample sizes are therefore very different, and directly applying the Mann–Whitney U test to them of course gives no significant difference. How can I tell whether the non-significant result is due to the small sample size or because the two samples truly do not differ?
You could perform permutation-style (resampling) testing: select 14 SNPs at random from those 1 million, say 10,000 times, and build a histogram of the mean or median allele frequency of each random set. The fraction of those 10,000 random sets whose mean (or median) allele frequency is at least as large as that of your 14 SNPs is the one-sided empirical P-value.
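The resampling procedure described above can be sketched in R as follows. This is only a minimal sketch: `background` and `focal` are simulated stand-ins (the beta parameters and the scaled-down background of 100,000 SNPs are assumptions chosen for illustration, not real data), and the mean is used as the summary statistic. Replace the two vectors with your own allele frequencies.

```r
# Simulated stand-ins for the real data (assumptions, not the poster's data)
set.seed(42)
background <- rbeta(1e5, 0.5, 2)  # hypothetical frequencies of the synonymous SNPs
focal      <- rbeta(14, 0.5, 2)   # hypothetical frequencies of the 14 studied mutations

n_resamples <- 10000
obs_mean    <- mean(focal)

# Mean allele frequency of 14 SNPs drawn at random from the background,
# repeated n_resamples times
null_means <- replicate(n_resamples,
                        mean(sample(background, length(focal))))

# One-sided empirical p-value: fraction of random sets whose mean is at
# least as large as the observed one. The +1 terms avoid reporting p = 0.
p_upper <- (sum(null_means >= obs_mean) + 1) / (n_resamples + 1)
p_upper
```

If you expect your mutations to be at lower frequency than synonymous SNPs, count `null_means <= obs_mean` instead; `median()` can be swapped in for `mean()` in both places.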
You can also build the allele frequency distribution of all the synonymous SNPs and see where the allele frequencies of your 14 SNPs fall relative to it.
Visualizing the allele frequency distributions could give you some insight into what is happening in your data.
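One possible way to do this in R is to place each of the 14 SNPs on the empirical cumulative distribution of the background and to overlay them on a histogram. Again a sketch only: `background` and `focal` are simulated placeholders (assumed beta-distributed frequencies), not real data.

```r
# Simulated stand-ins for the real data (assumptions, not the poster's data)
set.seed(42)
background <- rbeta(1e5, 0.5, 2)  # hypothetical synonymous SNP frequencies
focal      <- rbeta(14, 0.5, 2)   # hypothetical frequencies of the 14 mutations

# Empirical CDF of the background: bg_cdf(x) is the fraction of background
# SNPs with frequency <= x
bg_cdf <- ecdf(background)
focal_percentile <- bg_cdf(focal)
round(sort(focal_percentile), 3)

# Histogram of the background with the 14 focal SNPs marked as red lines
hist(background, breaks = 50,
     main = "Derived allele frequency", xlab = "Frequency")
abline(v = focal, col = "red")
```

If the 14 percentiles cluster near 0 or near 1 rather than spreading evenly between them, that suggests a shift relative to the synonymous SNPs.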
> These two sample sizes are therefore very different, and directly applying the Mann–Whitney U test to them of course gives no significant difference.
I don't think the lack of a significant difference is due to the difference in sample sizes; why would that be the case? (In fact, I don't see the point of resampling SNPs from the large set.) Rather, the small dataset reduces power so much that the difference you see is not significant.
> How can I tell whether the non-significant result is due to the small sample size or because the two samples truly do not differ?
These are two sides of the same coin. The difference you observe is not significant because the sample size is not large enough to detect it. With huge sample sizes, even tiny differences would produce very small p-values; in that case the question would be "Is this difference biologically meaningful?"
This is to illustrate the point. Produce two sets that differ by a small amount. The difference is highly significant because the sample sizes are large. If you downsample one set, the difference is no longer significant:
    set.seed(1)
    set1 <- rbeta(n = 10000, 10, 10)
    set.seed(2)
    set2 <- rbeta(n = 10000, 10, 9.5)
The difference between set 1 and set 2 is significant even though it is small:
    mean(set1); mean(set2)
    # 0.5006102
    # 0.5112549
    wilcox.test(set1, set2)  # p-value = 1.023e-11

    # Now reduce one set to 14 obs:
    set.seed(3)
    wilcox.test(sample(set1, size = 14), set2)  # p-value = 0.4413
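To see how quickly power drops with sample size, the same example can be extended by repeating the downsampling many times at several sizes and recording how often the test comes out significant. This is a rough sketch: the helper `power_at_n`, the sizes tried, the 200 repetitions and the 0.05 threshold are all illustrative choices, not part of the original example.

```r
set.seed(1)
set1 <- rbeta(n = 10000, 10, 10)
set.seed(2)
set2 <- rbeta(n = 10000, 10, 9.5)

# Hypothetical helper: fraction of random subsamples of set1 (of size n)
# for which the Mann-Whitney test against set2 gives p < alpha
power_at_n <- function(n, n_sim = 200, alpha = 0.05) {
  mean(replicate(n_sim,
                 wilcox.test(sample(set1, n), set2)$p.value < alpha))
}

set.seed(3)
sizes <- c(14, 100, 1000, 5000)
power <- sapply(sizes, power_at_n)
data.frame(n = sizes, power = power)
```

With 14 observations the estimated power should stay close to the false positive rate, which is why a non-significant result at that size says little about whether a real difference exists.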