22/04/2020

Support Vector Machine (SVM) is a widely used supervised learning algorithm for classification and regression tasks, although it is mostly applied to classification problems. The points of different classes are separated by a hyperplane, chosen so that its distance to the nearest data points on each side (the margin) is maximal. SVM has several advantages. First, it works well in high-dimensional spaces, for example in text classification. Second, it can be applied to nonlinear problems by means of kernels. Finally, it often achieves high accuracy and is less prone to overfitting thanks to its regularisation parameter.
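For reference, the maximal-margin idea is commonly written as the soft-margin optimisation problem below. This is the standard textbook formulation for two classes with labels $y_i \in \{-1, +1\}$; the symbols $w$, $b$, $\xi_i$ and $C$ are notation introduced here for illustration and do not come from the rest of this post.

\begin{equation} \min_{w, b, \xi} \; \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} \xi_i \quad \text{subject to} \quad y_i (w^\top x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0 \end{equation}

The margin equals $2 / \|w\|$, so minimising $\|w\|^2$ maximises the margin, while the regularisation parameter $C$ controls how strongly margin violations $\xi_i$ are penalised.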

In this post we will apply SVM to an example. The task is to predict a woman's current contraceptive method (No-use, Long-term, Short-term) from her demographic and socio-economic characteristics. We will tune the model and explore the margin vs. misclassification trade-off, and we will visualize the results using the Matplotlib library.

For this task we will use the Contraceptive Method Choice Data Set (https://archive.ics.uci.edu/ml/datasets/Contraceptive+Method+Choice). The following nine features are used to predict the target:

- Wife's age
- Wife's education (1=low, 2, 3, 4=high)
- Husband's education (1=low, 2, 3, 4=high)
- Number of children ever born
- Wife's religion (0=Non-Islam, 1=Islam)
- Is Wife now working? (0=Yes, 1=No)
- Husband's occupation (1, 2, 3, 4)
- Standard-of-living index (1=low, 2, 3, 4=high)
- Media exposure (0=Good, 1=Not good)

The target is the contraceptive method used (1=No-use, 2=Long-term, 3=Short-term).

In [39]:

```
# Import libraries
import numpy as np
import pandas as pd
# Load the dataset (the CSV file has no header row, so column names are passed explicitly)
contraceptive = pd.read_csv('cmc.csv', names=['w_age', 'w_education', 'h_education', 'children', 'w_religion',
                                               'w_working', 'h_occupation', 'standart_of_living', 'media_exposure',
                                               'contraceptive_method'])
contraceptive.head()
```

Out[39]:

| | w_age | w_education | h_education | children | w_religion | w_working | h_occupation | standart_of_living | media_exposure | contraceptive_method |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 24 | 2 | 3 | 3 | 1 | 1 | 2 | 3 | 0 | 1 |
| 1 | 45 | 1 | 3 | 10 | 1 | 1 | 3 | 4 | 0 | 1 |
| 2 | 43 | 2 | 3 | 7 | 1 | 1 | 3 | 4 | 0 | 1 |
| 3 | 42 | 3 | 2 | 9 | 1 | 1 | 3 | 3 | 0 | 1 |
| 4 | 36 | 3 | 3 | 8 | 1 | 1 | 3 | 2 | 0 | 1 |

In [40]:

```
# Create features and target
X = contraceptive[['w_age', 'w_education', 'h_education', 'children', 'w_religion', 'w_working',
'h_occupation', 'standart_of_living', 'media_exposure']]
y = contraceptive[['contraceptive_method']]
```

Now we can split our data into train and test subsets and train the model.

In [52]:

```
# Import libraries for Support Vector Machine
from sklearn.model_selection import train_test_split
from sklearn import svm
import warnings
warnings.filterwarnings('ignore')
# Split data into train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=101)
# Train the model (ravel() turns the one-column target into the 1-D array that SVC expects)
model = svm.SVC(C=1, kernel='rbf', gamma=1)
model.fit(X_train, y_train.values.ravel())
```

Out[52]:

SVC(C=1, cache_size=200, class_weight=None, coef0=0.0, decision_function_shape='ovr', degree=3, gamma=1, kernel='rbf', max_iter=-1, probability=False, random_state=None, shrinking=True, tol=0.001, verbose=False)

In [53]:

```
# Make prediction
prediction = model.predict(X_test)
# Collect the results in a copy so that X_test itself is not modified
result = X_test.copy()
result['contraceptive'] = y_test['contraceptive_method']
result['prediction'] = prediction.tolist()
result.head()
```

Out[53]:

| | w_age | w_education | h_education | children | w_religion | w_working | h_occupation | standart_of_living | media_exposure | contraceptive | prediction |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 212 | 26 | 3 | 4 | 3 | 1 | 1 | 3 | 2 | 0 | 1 | 3 |
| 545 | 44 | 4 | 4 | 7 | 1 | 1 | 2 | 4 | 0 | 2 | 3 |
| 236 | 37 | 2 | 3 | 5 | 1 | 0 | 3 | 2 | 0 | 1 | 1 |
| 115 | 37 | 4 | 4 | 1 | 1 | 1 | 1 | 4 | 0 | 1 | 1 |
| 1163 | 41 | 3 | 4 | 3 | 0 | 1 | 2 | 4 | 0 | 1 | 2 |

The quality of the predictions can be summarised with a classification report, which lists precision, recall, f1-score and support for each class.

Precision is given by the formula
\begin{equation} p = \frac{t_p}{t_p + f_p} \end{equation}
and recall by
\begin{equation} r = \frac{t_p}{t_p + f_n} \end{equation}

with $t_p$, $f_p$, $f_n$ being the number of true positives, false positives, and false negatives, respectively.

The f1-score is the harmonic mean of precision and recall; 1 is the best score and 0 is the worst.

Support is the number of occurrences of each class in y_true.

For example, the report for the model trained above is obtained as follows:

In [54]:

```
# Import necessary library and obtain classification report
from sklearn.metrics import classification_report
print(classification_report(result['contraceptive'], result['prediction']))
```
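As a rough illustration of the definitions of precision, recall, f1-score and support given before the report, the sketch below computes them by hand for one class. The label lists and the helper name `class_metrics` are made up for this example and are not taken from the dataset.

```python
def class_metrics(y_true, y_pred, positive):
    """Precision, recall, f1-score and support for one class, computed from the definitions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    support = sum(1 for t in y_true if t == positive)
    return precision, recall, f1, support

# Hypothetical labels for illustration only
y_true_toy = [1, 1, 1, 2, 2, 3, 3, 3]
y_pred_toy = [1, 1, 3, 2, 1, 3, 3, 1]

precision, recall, f1, support = class_metrics(y_true_toy, y_pred_toy, positive=1)
print(precision, recall, f1, support)
```

For class 1 there are 2 true positives, 2 false positives and 1 false negative, so precision is 2/4, recall is 2/3 and the f1-score is 4/7.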

In [47]:

```
# Import necessary library
from sklearn.model_selection import GridSearchCV
# Get the best parameters for the model
parameters = {
'kernel': ['linear', 'rbf'],
'C': [0.1, 1, 10, 100],
'gamma': [0.001, 0.01, 0.1, 1]
}
gridforest = GridSearchCV(model, parameters, cv=3, n_jobs=-1, verbose=1)
gridforest.fit(X_train, y_train)
gridforest.best_params_
```

Fitting 3 folds for each of 32 candidates, totalling 96 fits

[Parallel(n_jobs=-1)]: Done 96 out of 96 | elapsed: 14.4s finished

Out[47]:

{'C': 10, 'gamma': 0.01, 'kernel': 'rbf'}
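The log line "Fitting 3 folds for each of 32 candidates, totalling 96 fits" follows directly from the size of the parameter grid. The sketch below reproduces that count with plain Python; it only enumerates the combinations and does not train any model, and the variable names `candidates` and `n_fits` are chosen here for illustration.

```python
from itertools import product

parameters = {
    'kernel': ['linear', 'rbf'],
    'C': [0.1, 1, 10, 100],
    'gamma': [0.001, 0.01, 0.1, 1],
}
cv = 3  # number of cross-validation folds used in the grid search

# Every combination of one value per parameter is one candidate model
names = list(parameters)
candidates = [dict(zip(names, values)) for values in product(*parameters.values())]
n_fits = len(candidates) * cv

print(len(candidates), n_fits)  # 32 96
```

Note that `gamma` has no effect on the linear kernel, so some of the linear candidates are effectively repeated; the grid is still searched exhaustively.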

In the table below, columns with the suffix _b show the metrics before tuning ({'C': 1, 'gamma': 1, 'kernel': 'rbf'}), and columns with the suffix _a show the metrics after tuning ({'C': 10, 'gamma': 0.01, 'kernel': 'rbf'}).

```
             precision_b  precision_a  recall_b  recall_a  f1-score_b  f1-score_a  support_b  support_a
1                   0.56         0.70      0.78      0.58        0.65        0.63        169        169
2                   0.50         0.43      0.20      0.30        0.28        0.36         76         76
3                   0.49         0.48      0.40      0.68        0.44        0.56        124        124
avg / total         0.52         0.57      0.53      0.56        0.50        0.55        369        369
```
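The "avg / total" row can be checked by hand. The sketch below assumes it is the mean of the per-class values weighted by support, which is how older scikit-learn reports present it; the numbers are copied from the table above and the variable names are chosen here for illustration.

```python
import numpy as np

# Per-class metrics copied from the table; columns are
# precision_b, precision_a, recall_b, recall_a, f1-score_b, f1-score_a
per_class = np.array([
    [0.56, 0.70, 0.78, 0.58, 0.65, 0.63],  # class 1
    [0.50, 0.43, 0.20, 0.30, 0.28, 0.36],  # class 2
    [0.49, 0.48, 0.40, 0.68, 0.44, 0.56],  # class 3
])
support = np.array([169, 76, 124])

# Support-weighted mean of every column
weighted = support @ per_class / support.sum()
print(np.round(weighted, 2))
```

Rounded to two decimals, the result matches the reported row (0.52, 0.57, 0.53, 0.56, 0.50, 0.55), so the averages are dominated by class 1, which has the largest support.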

Now, let's visualize some results.

In [55]:

```
# Create a mesh of points to plot in
def make_meshgrid(x, y, h=.02):
    x_min, x_max = x.min() - 1, x.max() + 1
    y_min, y_max = y.min() - 1, y.max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))
    return xx, yy

# Plot the decision boundaries for a classifier
def plot_contours(ax, model, xx, yy, **params):
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    out = ax.contourf(xx, yy, Z, **params)
    return out
```
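To see what the two helpers do with the grid, here is a small sketch on a 3 x 2 grid. The function `toy_predict` is a hypothetical stand-in for `model.predict`, used only so that the example runs without a trained classifier.

```python
import numpy as np

# A tiny grid instead of the fine grid (step 0.02) used for plotting
grid_x, grid_y = np.meshgrid(np.arange(0, 3), np.arange(0, 2))

# Flatten both coordinate arrays and pair them up: one row per grid point
grid_points = np.c_[grid_x.ravel(), grid_y.ravel()]

def toy_predict(points):
    # Hypothetical rule standing in for model.predict: class 1 if x + y >= 2, else class 0
    return (points.sum(axis=1) >= 2).astype(int)

# Predict a class for every grid point, then restore the grid shape for contourf
Z_toy = toy_predict(grid_points).reshape(grid_x.shape)

print(grid_x.shape, grid_points.shape)  # (2, 3) (6, 2)
print(Z_toy)
```

The reshaped array has one predicted class per grid cell, which is what `contourf` needs to colour the decision regions.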

In [56]:

```
# Import library for visualization
import matplotlib.pyplot as plt
# Take two defined features
X = contraceptive[['w_age', 'children']]
# Split data into train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=101)
# Train the model
model = svm.SVC(C=10, kernel='rbf', gamma=0.01)
model.fit(X_train, y_train)
```

Out[56]:

SVC(C=10, cache_size=200, class_weight=None, coef0=0.0, decision_function_shape='ovr', degree=3, gamma=0.01, kernel='rbf', max_iter=-1, probability=False, random_state=None, shrinking=True, tol=0.001, verbose=False)

In [57]:

```
# Visualize
X0, X1 = X_test['w_age'], X_test['children']
xx, yy = make_meshgrid(X0, X1)
plot_contours(plt, model, xx, yy, cmap=plt.cm.RdYlGn, alpha=0.8)
plt.scatter(X0, X1, c = y_test['contraceptive_method'], cmap=plt.cm.RdYlGn, s=20, edgecolors='k')
# Highlight support vectors (they are training points, so their coordinates come from the model)
sv = model.support_vectors_
plt.scatter(sv[:, 0], sv[:, 1], color='white', alpha=0.15, s=100)
plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.xlabel('Age')
plt.ylabel('Children')
plt.title('SVM with RBF kernel')
plt.show()
```

In [58]:

```
# Take two defined features
X = contraceptive[['w_age', 'children']]
# Split data into train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=101)
X0, X1 = X_test['w_age'], X_test['children']
xx, yy = make_meshgrid(X0, X1)
# Regularisation parameter
Cs = [0.1, 1, 10, 100]
# Visualize results for the different values of regularisation parameter
plt.subplots_adjust(left=0.1, right=2, bottom=0.1, top=1, wspace=0.4, hspace=0.6)
for i, c in enumerate(Cs, start=1):
    model = svm.SVC(C=c, kernel='rbf', gamma=0.01)
    model.fit(X_train, y_train)
    title = 'SVM with RBF kernel C=' + str(c)
    ax = plt.subplot(2, 2, i)
    plot_contours(ax, model, xx, yy, cmap=plt.cm.RdYlGn, alpha=0.8)
    ax.scatter(X0, X1, c=y_test['contraceptive_method'], cmap=plt.cm.RdYlGn,
               s=20, edgecolors='k')
    # Highlight support vectors (they are training points, so their coordinates come from the model)
    sv = model.support_vectors_
    ax.scatter(sv[:, 0], sv[:, 1], color='white', alpha=0.15, s=100)
    ax.set_xlim(xx.min(), xx.max())
    ax.set_ylim(yy.min(), yy.max())
    ax.set_xlabel('Age')
    ax.set_ylabel('Children')
    ax.set_title(title)
plt.show()
```
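One numeric way to look at the margin vs. misclassification trade-off is to count the support vectors for each value of C. The sketch below does this on synthetic two-class data, which is an assumption made for illustration and not the contraceptive dataset; the names `X_toy`, `y_toy` and `n_support` are hypothetical.

```python
import numpy as np
from sklearn import svm

# Synthetic, overlapping two-class data (illustration only, not cmc.csv)
rng = np.random.default_rng(0)
X_toy = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)),
                   rng.normal(2.0, 1.0, size=(100, 2))])
y_toy = np.array([0] * 100 + [1] * 100)

n_support = {}
for c in [0.1, 1, 10, 100]:
    clf = svm.SVC(C=c, kernel='rbf', gamma=0.01).fit(X_toy, y_toy)
    # support_vectors_ holds the training points that define the decision boundary
    n_support[c] = len(clf.support_vectors_)

print(n_support)
```

A small C tolerates more margin violations, so a wide margin typically comes with many support vectors; a large C penalises violations more heavily and usually leaves fewer of them, at the risk of fitting the training data too closely.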
