Svm-Predict Input File Format
14.0 years ago
Panos ★ 1.8k

I'm trying to classify reads using libSVM (tetramer frequencies).

I have a trained model, but I can't find what the input file format for svm-predict should be. Sequences of unknown origin shouldn't have a label at the beginning of the vector, but if I don't put one, svm-predict still prints out an "Accuracy=..." line as if it were testing the model. I think there should be a way to "tell" svm-predict that you're not testing but doing "actual" prediction...

I'm new to libSVM, so please tell me if I'm wrong at some point...

short classification metagenomics • 14k views
14.0 years ago

LIBSVM contains 3 programs for three specific applications:

  1. svm-train : Use this program to train a model on data with class labels.

  2. svm-predict : Once you have generated the model, run svm-predict with feature vectors as input (the true class labels are not required); svm-predict uses the model and the input feature vectors to predict the class.

  3. svm-scale : This is important to avoid feature bias. It scales data to a restricted range as preprocessing for SVM training.

I understand that you have already created your model and are having a problem with the input file format. The format should match that of the input files you used to generate the model. Usually the input file is a text file with the features derived from the sequences.
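As a rough illustration of what such a file can look like for tetramer frequencies: LIBSVM's text format puts one vector per line, a label followed by space-separated index:value pairs with indices starting at 1 and in ascending order. The sketch below is only one way to produce such lines. The function names, the alphabetical ordering of the 256 tetramers, and the choice to skip windows containing N are all assumptions made for the example; whatever ordering and normalization were used for the training file must be reused here.

```python
from itertools import product

# All 256 tetramers in a fixed (alphabetical) order. The position in this
# list plus one is used as the LIBSVM feature index (assumed convention).
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]
INDEX = {t: i + 1 for i, t in enumerate(TETRAMERS)}


def tetramer_frequencies(seq):
    """Return {feature_index: relative frequency} for one read."""
    seq = seq.upper()
    counts = {}
    total = 0
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])
        if idx is None:  # window contains N or another non-ACGT symbol
            continue
        counts[idx] = counts.get(idx, 0) + 1
        total += 1
    if total == 0:
        return {}
    return {idx: n / total for idx, n in counts.items()}


def libsvm_line(label, features):
    """Format one vector as '<label> <index>:<value> ...' with ascending indices."""
    pairs = " ".join(f"{idx}:{val:.6g}" for idx, val in sorted(features.items()))
    return f"{label} {pairs}".rstrip()
```

Zero-valued features are simply left out, which the sparse format allows. Calling `libsvm_line(1, tetramer_frequencies(read))` for each training read, one line per read, would give a file in the same layout as the one passed to svm-train.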

If you are looking for a tutorial on libsvm, the official tutorial and FAQ are the best.


Does svm-predict ALWAYS output the "Accuracy=xx%" line? In my case it does and it looks like it does testing (and not prediction of unknown data).

If I put no value at the beginning of the line, it parses the first integer of each line (in my case, the index of the first index:value pair) as the label and gives a nonsensical accuracy percentage...


Looks like there is some issue with your svm-predict. Which version are you using?

I'm using version 2.91...


I thought you were asking because of a problem with your model or an issue with the input file. Glad you got a proper response from the libSVM authors.

14.0 years ago
Panos ★ 1.8k

I emailed libSVM's author and I thought it would be good to share with you the answer to my question...

He told me that when you're doing the "actual" prediction, you just put arbitrary numbers as labels. svm-predict will still print out the "Accuracy=..." statement, which will, of course, be meaningless; the only thing that matters is svm-predict's output file containing the classification results.

See also the following Q&A from libsvm faq:

Q: I don't know class labels of test data. What should I put in the first column of the test file?

A: Any value is ok. In this situation, what you will use is the output file of svm-predict, which gives predicted class labels.
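A minimal sketch of that workflow in Python, assuming dense feature vectors and svm-predict run without the `-b` (probability) option, so that the output file holds one predicted label per line. The function names, file names, and the placeholder label 0 are illustrative choices, not part of LIBSVM.

```python
def write_unlabeled(path, vectors, placeholder=0):
    """Write vectors in LIBSVM format with a dummy label in the first column."""
    with open(path, "w") as fh:
        for vec in vectors:
            # Feature indices start at 1 in the LIBSVM text format.
            pairs = " ".join(f"{i}:{v:g}" for i, v in enumerate(vec, start=1))
            fh.write(f"{placeholder} {pairs}\n")


def read_predictions(path):
    """Read predicted labels (as strings), one per non-empty output line."""
    with open(path) as fh:
        return [line.split()[0] for line in fh if line.strip()]


# Typical use (file names are hypothetical):
#   write_unlabeled("unknown.txt", vectors)
#   then on the command line:  svm-predict unknown.txt model_file predictions.txt
#   labels = read_predictions("predictions.txt")
# The "Accuracy = ..." line printed by svm-predict compares its predictions
# against the placeholder labels and can be ignored.
```

The predicted labels come back in the same order as the input lines, so they can be matched to the reads by position.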


Thanks for posting the answer back to BioStar. This will be useful for future reference.
