
The gamma and C parameters of the RBF SVM

2015-11-09
This example illustrates the effect of the parameters gamma and C of
the Radial Basis Function (RBF) kernel SVM.

Intuitively, the gamma parameter defines how far the influence of a single training example reaches, with
low values meaning ‘far’ and high values meaning ‘close’. The gamma parameter can be seen as the inverse
of the radius of influence of samples selected by the model as support vectors.

The C parameter trades off misclassification of training examples against simplicity of the decision surface.
A low C makes the decision surface smooth, while a high C aims
at classifying all training examples correctly by giving the model freedom to select more samples as support vectors.
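
As a minimal sketch of these two effects (the moons dataset and the parameter values below are illustrative choices, not part of the original example), one can fit an RBF SVM over a small grid of gamma and C values and inspect how many support vectors each setting keeps:

from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Illustrative 2-feature binary dataset (not the one used in the example).
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

for gamma in (0.01, 1.0, 100.0):
    for C in (0.1, 1.0, 100.0):
        clf = SVC(kernel="rbf", gamma=gamma, C=C).fit(X, y)
        # n_support_ holds the per-class support-vector counts.
        print("gamma=%-7g C=%-7g support vectors=%d"
              % (gamma, C, clf.n_support_.sum()))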

The first plot is a visualization of the decision function for a variety of parameter values on a simplified classification problem involving only 2 input features and 2 possible target classes (binary classification). Note that this kind of plot is not possible for problems with more features or target classes.
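
A hedged sketch of how such a plot can be drawn for a 2-feature binary problem follows; the dataset and the single (C, gamma) setting are illustrative stand-ins for the grid of settings the example shows:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
clf = SVC(kernel="rbf", gamma=1.0, C=1.0).fit(X, y)

# Evaluate the decision function on a dense grid covering the data.
xx, yy = np.meshgrid(np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, 200),
                     np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, 200))
Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

plt.contourf(xx, yy, Z, cmap=plt.cm.RdBu, alpha=0.8)  # decision surface
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.RdBu_r, edgecolors="k")
plt.title("gamma=1.0, C=1.0")
plt.show()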

The second plot is a heatmap of the classifier’s cross-validation accuracy as a function of C and gamma.
For this example we explore a relatively large grid for illustration purposes. In practice, a logarithmic grid from 10^-3 to 10^3 is usually sufficient. If the best parameters lie on the boundaries of the grid, it can be extended in that direction in a subsequent search.
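
In code, such a search can be sketched as follows. Note this is written against the modern sklearn.model_selection API rather than the sklearn.grid_search module that was current when this post was published, and the dataset and grid bounds are illustrative:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedShuffleSplit
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Logarithmic grid from 10^-3 to 10^3 in both C and gamma.
param_grid = {"C": np.logspace(-3, 3, 7), "gamma": np.logspace(-3, 3, 7)}
cv = StratifiedShuffleSplit(n_splits=5, test_size=0.2, random_state=42)

grid = GridSearchCV(SVC(kernel="rbf"), param_grid=param_grid, cv=cv)
grid.fit(X, y)
print("The best parameters are %s with a score of %0.2f"
      % (grid.best_params_, grid.best_score_))

If the best C or gamma lands on a grid edge, rerun with the corresponding logspace bound extended in that direction.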

Note that the heat map plot has a special colorbar with a midpoint value close to the score values of the best performing models, so as to make it easy to tell them apart in the blink of an eye.
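
The original example implements this with a small matplotlib Normalize subclass along these lines (reconstructed from the linked example; the midpoint is chosen near the best scores):

import numpy as np
from matplotlib.colors import Normalize

class MidpointNormalize(Normalize):
    """Map data so that `midpoint` falls at the centre of the colormap."""
    def __init__(self, vmin=None, vmax=None, midpoint=None, clip=False):
        self.midpoint = midpoint
        Normalize.__init__(self, vmin, vmax, clip)

    def __call__(self, value, clip=None):
        # Piecewise-linear mapping: vmin -> 0, midpoint -> 0.5, vmax -> 1.
        x, y = [self.vmin, self.midpoint, self.vmax], [0, 0.5, 1]
        return np.ma.masked_array(np.interp(value, x, y))

# e.g. plt.imshow(scores, norm=MidpointNormalize(vmin=0.2, midpoint=0.92))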

The behavior of the model is very sensitive to the gamma parameter. If gamma is
too large, the radius of the area of influence of the support vectors only includes the support vector itself and no amount of regularization with C will
be able to prevent overfitting.

When gamma is very small, the model is too constrained and cannot capture the complexity or “shape” of the
data. The region of influence of any selected support vector would include the whole training set. The resulting model will behave similarly to a linear model with a set of hyperplanes that separate the high-density centers of any pair of classes.

For intermediate values, we can see on the second plot that good models can be found along a diagonal of C and gamma.
Smooth models (lower gamma values) can be made more complex by selecting a larger number of support vectors
(larger C values), hence the diagonal of well-performing models.

Finally, one can also observe that for some intermediate values of gamma we get equally performing models
when C becomes very large: it is not necessary to regularize by limiting the number of support vectors. The
radius of the RBF kernel alone acts as a good structural regularizer. In practice, though, it might still be interesting to limit the number of support vectors with a lower value of C so
as to favor models that use less memory and are faster to predict.

We should also note that small differences in scores result from the random splits of the cross-validation procedure. Those spurious variations can be smoothed out by increasing the number of CV iterations n_iter, at
the expense of compute time. Increasing the number of C_range and gamma_range steps
will increase the resolution of the hyper-parameter heat map.
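
As a sketch (assuming the modern scikit-learn API, where the post's n_iter corresponds to the n_splits argument of StratifiedShuffleSplit; the step counts below are illustrative):

import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

# Finer logarithmic ranges raise the resolution of the heatmap...
C_range = np.logspace(-2, 10, 25)
gamma_range = np.logspace(-9, 3, 25)

# ...and more shuffle-split iterations average out split-to-split noise.
cv = StratifiedShuffleSplit(n_splits=20, test_size=0.2, random_state=42)

These would replace the coarser grid and cv in the grid search sketched above.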

Script output:

The best parameters are {'C': 1.0, 'gamma': 0.10000000000000001} with a score of 0.97

Original article: http://scikit-learn.org/stable/auto_examples/svm/plot_rbf_parameters.html