At the very least, you should do cross-validation (such as leave-one-out cross-validation) on your dataset. You can also apply the algorithm to other publicly available datasets (provided they have metadata for the characteristic you are trying to predict), which I think is a better test.
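A minimal leave-one-out cross-validation sketch, assuming a data frame `dat` with a binary factor column `outcome`; the classifier (`lda`) is just a placeholder, use whatever you are actually fitting:

```r
library(MASS)  # for lda(), used here only as an illustrative classifier

n <- nrow(dat)
loocv_pred <- numeric(n)
for (i in seq_len(n)) {
  fit <- lda(outcome ~ ., data = dat[-i, ])            # train on all samples except i
  loocv_pred[i] <- predict(fit, dat[i, , drop = FALSE])$posterior[, 2]
}
# loocv_pred now holds an out-of-sample score for every observation
```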
In both cases, you can use something like the ROCR package to create an ROC plot showing the trade-off between sensitivity and specificity. A table with statistics such as the positive predictive value and negative predictive value would also be nice. Note that these are all relevant for binary outcomes, so I'm not sure whether that is what you are trying to predict.
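For example, feeding the out-of-sample scores from the sketch above (hypothetical `loocv_pred` and labels in `dat$outcome`) into ROCR:

```r
library(ROCR)

pred <- prediction(loocv_pred, dat$outcome)
perf <- performance(pred, measure = "tpr", x.measure = "fpr")
plot(perf)                                              # ROC curve
performance(pred, measure = "auc")@y.values[[1]]        # area under the curve

# PPV and NPV across score cutoffs, if you want to tabulate them
performance(pred, measure = "ppv")
performance(pred, measure = "npv")
```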
I have stopped believing results from any classifier built on high-dimensional input data (such as gene expression datasets) unless they are shown to replicate on a completely independent dataset, ideally by another research group. Cross-validation is a minimum must-have, but even with it there is simply too much data massaging and overfitting going on.
So if you have access to an independent dataset, use it to assess the performance of your classifier before publishing, but be honest: don't cheat by tuning your classifier afterwards to improve the results. I know this sounds harsh, but the field has been plagued by unreproducible and non-replicable results for too long.
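A sketch of what that discipline looks like in practice, assuming hypothetical data frames `discovery` and `independent`; the classifier (random forest) is illustrative, not prescriptive:

```r
library(randomForest)  # any classifier; randomForest is just an example

# All feature selection and tuning happens on the discovery data only
fit <- randomForest(outcome ~ ., data = discovery)

# Apply the frozen model exactly once to the independent cohort
pred <- predict(fit, newdata = independent, type = "prob")[, 2]

# Report performance on `independent` as-is; do not go back and adjust `fit`
```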
Thanks for the suggestions. I will try them.