Hi, I am new to GEO datasets. I downloaded a cancer dataset from GEO and I want to run an SVM on it. I have two questions:
1. Are the downloaded datasets already normalized, or do I have to normalize them myself?
2. Do I have to perform PCA before running the SVM?
SVMs are sensitive to the scale of the features: a feature with a large numeric range dominates the kernel computation, so SVMs don't work well with features that contain large numbers. It is customary to scale all features to the [-1, +1] or [0, 1] range, whether or not they are already normalized.
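To make the scaling step concrete, here is a minimal sketch using scikit-learn. The data is a synthetic stand-in (random numbers, not a real GEO series), and the kernel, `C`, and `gamma` values are just the library defaults, not a recommendation. The scaler sits inside a pipeline so that it is fitted on the training split only.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for an expression matrix (assumption, not real GEO data):
# 120 samples x 50 features with large, skewed values.
rng = np.random.default_rng(0)
X = rng.lognormal(mean=5.0, sigma=1.0, size=(120, 50))
y = rng.integers(0, 2, size=120)
X[y == 1, :5] *= 3.0  # make the first five features informative

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale every feature to [-1, +1], then fit an RBF-kernel SVM.
scaled_svm = make_pipeline(
    MinMaxScaler(feature_range=(-1, 1)),
    SVC(kernel="rbf", C=1.0, gamma="scale"),
)
scaled_svm.fit(X_train, y_train)

accuracy = scaled_svm.score(X_test, y_test)
X_train_scaled = scaled_svm.named_steps["minmaxscaler"].transform(X_train)
print(f"test accuracy: {accuracy:.3f}")
print(f"scaled training range: [{X_train_scaled.min():.1f}, {X_train_scaled.max():.1f}]")
```

Test samples can fall slightly outside [-1, +1] because the scaler only sees the training split. That is expected and usually harmless.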
As a general rule, you do not have to perform PCA before SVM. It may help if you have lots of features (or data points), because most SVM implementations are single-threaded and can therefore be very slow on large datasets. In fact, I would suggest that you try random forests or boosted decision trees instead, as they will be much faster and their hyperparameters are easier to tune. Even though an SVM has only 2 hyperparameters to tune versus half a dozen or so for boosted trees, the trees will still be faster and probably more accurate as well.
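If you want to check this on your own data, a rough sketch like the following compares an SVM, a PCA + SVM pipeline, and a random forest with cross-validation. It assumes scikit-learn. The matrix is again synthetic, and `n_components=20` and `n_estimators=200` are arbitrary illustrative values that you would tune for a real dataset.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in (assumption): 100 samples x 500 features,
# with the first ten features shifted for one class.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 500))
y = rng.integers(0, 2, size=100)
X[y == 1, :10] += 1.5

models = {
    # SVM on scaled features, no PCA.
    "svm": make_pipeline(MinMaxScaler(), SVC()),
    # Same SVM, but on 20 principal components (arbitrary choice).
    "pca_svm": make_pipeline(MinMaxScaler(), PCA(n_components=20), SVC()),
    # Random forest; needs no scaling and can use all CPU cores.
    "random_forest": RandomForestClassifier(
        n_estimators=200, n_jobs=-1, random_state=0
    ),
}

# Mean 5-fold cross-validated accuracy per model.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Putting the scaler and PCA inside the pipeline means they are refitted on each training fold, so nothing leaks from the held-out samples into the preprocessing.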
What is SVM? Which dataset?
SVM stands for support vector machine. The dataset is an RNA expression dataset. PCA stands for principal component analysis.
Please put in a bit more effort. How should we advise on datasets that you don't even link? Thanks for clarifying the abbreviations. It is generally good practice to introduce abbreviations unless they are 100% common knowledge, like DNA.