Support Vector Machine (SVM) is a powerful supervised learning algorithm used for classification and regression.
- Finds optimal hyperplane to separate classes
- Maximizes margin between classes
- Uses kernel trick for non-linear data
- Effective in high-dimensional spaces
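The points above can be sketched in a few lines of scikit-learn (assumed here; the demo's own backend isn't shown). The dataset and parameter values are purely illustrative:

```python
# Illustrative sketch (scikit-learn assumed): fit an SVM classifier
# on a synthetic two-feature dataset and score it on held-out data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=2, n_redundant=0,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

clf = SVC(kernel="rbf", C=1.0)  # RBF kernel, default regularization
clf.fit(X_train, y_train)
print(round(clf.score(X_test, y_test), 2))  # held-out accuracy
```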
1 Hyperplane-Based Classification
SVM finds the optimal hyperplane that best separates different classes in the feature space. The hyperplane is chosen such that the distance from the nearest data points (support vectors) is maximized, leading to better generalization on unseen data.
2 Maximum Margin Principle
It chooses the hyperplane that maximizes the distance (margin) between classes, boosting generalization. The margin is defined as the distance between the hyperplane and the nearest data point from either class.
3 Support Vectors
Only the critical data points (support vectors) affect the decision boundary—super efficient! These are the points that lie closest to the decision surface and directly influence its position and orientation.
4 Works in High Dimensions
SVM shines when features > samples. Performs great in high-dimensional spaces due to its reliance on support vectors rather than all data points, making it effective for text classification and other high-dimensional problems.
5 Kernel Trick
Can model non-linear boundaries using kernel functions like Linear, Polynomial, RBF, and Sigmoid. The kernel trick allows SVMs to operate in a transformed feature space without explicitly computing the transformation.
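A quick way to see the kernel trick in action is to compare kernels on data that is not linearly separable. This sketch (scikit-learn assumed, toy data) fits each kernel on the classic two-moons dataset, where the RBF kernel typically outperforms the linear one:

```python
# Compare kernels on non-linearly-separable data (toy example).
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
for kernel in ["linear", "poly", "rbf", "sigmoid"]:
    clf = SVC(kernel=kernel).fit(X, y)
    print(kernel, round(clf.score(X, y), 2))  # training accuracy per kernel
```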
6 Regularization (C parameter)
Control overfitting vs. underfitting using the C hyperparameter. Low C → Smoother boundary (more regularization), High C → Fewer margin violations (less regularization).
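One observable effect of C is the number of support vectors: a small C widens the margin, so more points fall inside it and become support vectors. A hedged sketch (scikit-learn assumed, synthetic data):

```python
# Count support vectors as C varies (illustrative, synthetic blobs).
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, cluster_std=1.5, random_state=0)
for C in [0.01, 1.0, 100.0]:
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # n_support_ holds the support-vector count per class
    print(C, clf.n_support_.sum())
```

Expect the count to shrink as C grows, since a large C penalizes margin violations heavily.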
7 Binary Classifier (Base Case)
SVM is inherently binary but can be extended to multi-class problems using One-vs-One (OvO) or One-vs-Rest (OvR) strategies.
- One-vs-One (OvO): N*(N-1)/2 classifiers
- One-vs-Rest (OvR): N classifiers
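The classifier counts for each strategy can be checked directly with scikit-learn's multiclass wrappers (assumed here; the toy 4-class dataset is illustrative):

```python
# OvO vs. OvR classifier counts on a 4-class toy dataset.
from sklearn.datasets import make_blobs
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = make_blobs(n_samples=200, centers=4, random_state=0)  # N = 4 classes
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)
ovr = OneVsRestClassifier(SVC(kernel="linear")).fit(X, y)
print(len(ovo.estimators_))  # 4*3/2 = 6 pairwise classifiers
print(len(ovr.estimators_))  # 4 one-vs-rest classifiers
```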
8 Robust to Outliers
Using soft margin and C parameter, SVM can handle noise while still maintaining a good boundary. The soft margin allows some misclassifications to prevent overfitting.
9 Computationally Heavy
Training is slow for large datasets or when using complex kernels. Optimization helps (like using LinearSVC for speed). Time complexity is typically O(n²) to O(n³).
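For linear problems on larger datasets, `LinearSVC` is the usual speed optimization mentioned above: it uses the liblinear solver, which scales far better with sample count than the kernelized `SVC` solver. A minimal sketch (scikit-learn assumed, synthetic data):

```python
# LinearSVC: a faster solver for linear SVMs on larger datasets.
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
# dual=False is preferred when n_samples > n_features
clf = LinearSVC(C=1.0, dual=False).fit(X, y)
print(round(clf.score(X, y), 2))  # training accuracy
```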
10 Text & Image Applications
Excellent choice when you need high precision in tasks like spam detection, face recognition, or sentiment analysis. Particularly effective with TF-IDF features in text.
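A minimal TF-IDF + SVM text-classification pipeline might look like this (scikit-learn assumed; the four documents and labels are toy data invented for illustration):

```python
# Toy spam detector: TF-IDF features feeding a linear SVM.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = ["win a free prize now", "claim your free money",
         "meeting at noon tomorrow", "see you at the office"]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = ham (toy labels)

model = make_pipeline(TfidfVectorizer(), LinearSVC(dual=False))
model.fit(texts, labels)
print(model.predict(["free prize money"]))  # expected: spam (1) on this toy data
```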
Try Our Interactive SVM Demo
Experience the power of SVM with our live demo. Adjust parameters and see the decision boundary change in real-time.
- Select a dataset from the dropdown
- Choose a kernel type
- Adjust the C and gamma parameters
- Click "Train Model" to see results
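The same workflow as the demo steps above can be mirrored in code (scikit-learn assumed; dataset, kernel, and parameter choices are illustrative stand-ins for the demo's dropdowns and sliders):

```python
# 1. pick a dataset, 2. choose a kernel, 3. set C and gamma, 4. train and report.
from sklearn.datasets import make_moons
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.25, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1)

clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```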
A Support Vector Machine (SVM) is a supervised machine learning algorithm that can be used for both classification and regression tasks. In classification, SVMs work by finding the optimal hyperplane that best separates different classes in the feature space with the maximum margin. The algorithm is particularly effective in high-dimensional spaces and is versatile through the use of different kernel functions.
SVMs are based on the idea of finding a decision boundary that maximizes the margin between the classes. The data points that are closest to the decision boundary are called support vectors, and they are the critical elements that define the position and orientation of the hyperplane.
SVM is particularly useful in the following scenarios:
- High-dimensional spaces: When the number of features is greater than the number of samples, SVMs perform exceptionally well (common in text classification).
- Clear margin of separation: When there's a clear margin of separation between classes, SVM can find the optimal boundary.
- Non-linear classification: With the kernel trick, SVM can handle non-linear decision boundaries effectively.
- Small to medium-sized datasets: SVMs work well with datasets that aren't extremely large (typically < 100,000 samples).
- Text classification and image recognition: SVMs are excellent for tasks like spam detection, sentiment analysis, and face recognition.
However, for very large datasets, other algorithms like neural networks or gradient boosting might be more efficient. SVM's training time typically scales between quadratically and cubically with the number of samples.
| Scenario | Recommended Algorithm |
|---|---|
| Small/medium dataset, clear margin | SVM |
| Large dataset | Random Forest or Neural Network |
| Text classification | SVM or Naive Bayes |
SVMs can use different kernel functions to transform the input data into higher-dimensional spaces. The main kernel types are:
| Kernel | Formula | When to Use |
|---|---|---|
| Linear | K(x, y) = xᵀy + c | Linearly separable data |
| Polynomial | K(x, y) = (γxᵀy + r)ᵈ | Data with polynomial patterns |
| RBF (Gaussian) | K(x, y) = exp(-γ||x-y||²) | General purpose, default choice |
| Sigmoid | K(x, y) = tanh(γxᵀy + r) | Neural network-like behavior |
In practice, RBF is often the first choice as it can handle both linear and non-linear cases. The linear kernel is faster but only works for linearly separable data. Polynomial kernels can capture more complex relationships but require tuning of the degree parameter.
The C parameter in SVM is the regularization parameter that controls the trade-off between achieving a low training error and a low testing error (generalization). It's a crucial hyperparameter that affects the model's performance:
Small C (e.g., 0.1)
- Creates a wider margin
- More tolerant of misclassifications
- May lead to more training errors
- Better generalization (less overfitting)
Large C (e.g., 10)
- Creates a narrower margin
- Less tolerant of misclassifications
- May lead to fewer training errors
- Potentially worse generalization (more overfitting)
In this tool, you can adjust the C parameter using the slider and observe how it affects the decision boundary and the margin width. A good practice is to try values on a logarithmic scale (e.g., 0.001, 0.01, 0.1, 1, 10, 100).
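The logarithmic-scale search suggested above is typically automated with a grid search. A hedged sketch (scikit-learn assumed, synthetic data):

```python
# Search C over a logarithmic grid with 5-fold cross-validation.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)
grid = GridSearchCV(SVC(kernel="rbf"),
                    {"C": [0.001, 0.01, 0.1, 1, 10, 100]}, cv=5)
grid.fit(X, y)
print(grid.best_params_["C"])  # C value with the best cross-validated score
```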
The SVM visualization in this tool shows several important elements:
- Data points: The circles represent your data points, colored by their class (e.g., red and blue).
- Decision boundary: The line (or curve) that separates the two classes.
- Margin: The area between the dashed lines represents the margin that SVM tries to maximize.
- Support vectors: The data points that touch the margin lines are the support vectors (often marked with a border or different shape).
- Background color: The shaded regions show the classification areas for each class.
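The support vectors the visualization highlights are exposed programmatically as well; in scikit-learn (assumed here, toy data) a fitted model stores them in `support_vectors_`:

```python
# Inspect the support vectors of a fitted linear SVM (toy blobs).
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=60, centers=2, random_state=6)
clf = SVC(kernel="linear", C=1.0).fit(X, y)
# Only these points define the decision boundary
print(clf.support_vectors_.shape)
```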
When you change parameters like the kernel type or C value, observe how these elements change:
- Kernel changes: See how different kernels create different decision boundaries (linear vs. curved)
- C parameter: Observe how increasing C makes the margin narrower and fits the training data more closely
- Gamma (for RBF): Higher gamma values create more complex, wiggly boundaries
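The gamma effect described above shows up directly in training accuracy: a very high gamma fits the training data almost perfectly (the wiggly, overfit boundary). A sketch under the same assumptions as the earlier examples (scikit-learn, toy data):

```python
# Training accuracy rises with gamma as the RBF boundary overfits.
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.3, random_state=2)
for gamma in [0.1, 1, 100]:
    clf = SVC(kernel="rbf", C=1.0, gamma=gamma).fit(X, y)
    print(gamma, round(clf.score(X, y), 2))  # training accuracy per gamma
```

High training accuracy at large gamma does not imply good generalization; held-out accuracy usually peaks at a moderate value.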