
the problem of overfitting



underfitting or high bias—hypothesis function h maps poorly to the trend of the data.

usually caused by a function that is too simple or uses too few features.

overfitting or high variance—fits the available data but does not generalize well to predict new data.

usually caused by a complicated function that creates a lot of unnecessary curves and angles unrelated to the data.
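For intuition, here is a toy numpy sketch (mine, not from the original notes; the data and degrees are made up) showing both failure modes: a degree-1 fit underfits a quadratic trend, while a degree-9 fit drives the training error toward zero yet does worse on held-out points.

```python
import numpy as np

# Toy illustration: fit polynomials of increasing degree to noisy samples
# of a quadratic trend and compare training error with held-out error.
rng = np.random.default_rng(0)

def true_trend(x):
    return 1.0 + 2.0 * x - 3.0 * x ** 2

x_train = np.linspace(0.0, 1.0, 15)
y_train = true_trend(x_train) + 0.1 * rng.standard_normal(x_train.shape)
x_test = np.linspace(0.0, 1.0, 100)
y_test = true_trend(x_test)

for degree in (1, 2, 9):
    coeffs = np.polyfit(x_train, y_train, degree)   # least-squares polynomial fit
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE = {train_mse:.4f}, test MSE = {test_mse:.4f}")
```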

to address overfitting:

1) Reduce the number of features:

1. Manually select which features to keep.

2. Use a model selection algorithm.

2) Regularization:

1. Keep all the features, but reduce the magnitude of the parameters θj.

2. Regularization works well when we have a lot of slightly useful features.

Regularization:

1.regularized linear regression

Without actually getting rid of any features or changing the form of our hypothesis, we can instead modify our cost function:

$$\min_\theta \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2+\lambda\sum_{j=1}^{n}\theta_j^2\right]$$

The λ is the regularization parameter; it determines how strongly the θ parameters are penalized.

If λ is chosen to be too large, it may smooth out the function too much and cause underfitting.



As a result, the new hypothesis (the pink curve in the original figure, omitted here) looks like a quadratic function but fits the data better, because the penalized higher-order terms now have very small θ values.
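As a minimal sketch of the cost above (the function name regularized_cost and the convention that X carries a leading column of ones are my assumptions, not the course's):

```python
import numpy as np

# Regularized linear regression cost. theta[0] (the intercept) is left
# out of the penalty, matching the sum from j = 1 to n in the formula.
def regularized_cost(theta, X, y, lam):
    m = len(y)
    residuals = X @ theta - y                  # h_theta(x^(i)) - y^(i) for every example
    penalty = lam * np.sum(theta[1:] ** 2)     # lambda * sum_{j>=1} theta_j^2
    return (np.sum(residuals ** 2) + penalty) / (2.0 * m)
```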



The gradient descent update for θj (j ≥ 1) under this cost can be written as

$$\theta_j := \theta_j\left(1-\alpha\frac{\lambda}{m}\right)-\alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)}$$

Since $(1-\alpha\frac{\lambda}{m})<1$, each update shrinks the parameter a little bit before doing the same gradient step as in unregularized linear regression.
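One such update step might look like the following sketch (same conventions as above; gradient_step is my name, not the course's):

```python
import numpy as np

# One gradient descent step for regularized linear regression.
# theta_0 is updated without the shrinkage factor, since it is not regularized.
def gradient_step(theta, X, y, alpha, lam):
    m = len(y)
    grad = X.T @ (X @ theta - y) / m                   # (1/m) * sum (h - y) * x_j
    new_theta = theta * (1.0 - alpha * lam / m) - alpha * grad
    new_theta[0] = theta[0] - alpha * grad[0]          # no shrinkage on theta_0
    return new_theta
```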



Using regularization also takes care of non-invertibility issues with the $X^TX$ matrix in the normal equation:

$$\theta=\left(X^TX+\lambda\cdot L\right)^{-1}X^Ty$$

where L is the (n+1)×(n+1) matrix with a 0 in the top-left corner and 1's down the rest of the diagonal.

If m ≤ n, then $X^TX$ may be non-invertible. However, when we add the term λ⋅L (with λ > 0), $X^TX+\lambda\cdot L$ becomes invertible.
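A sketch of this regularized normal equation, assuming the same X layout as before (normal_equation is my name):

```python
import numpy as np

# L is the identity with its top-left entry zeroed, so theta_0 is not penalized.
def normal_equation(X, y, lam):
    n_plus_1 = X.shape[1]          # number of parameters, including the intercept
    L = np.eye(n_plus_1)
    L[0, 0] = 0.0
    # Solve (X^T X + lambda * L) theta = X^T y rather than forming an explicit inverse.
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)
```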

2.regularized logistic regression

We can regularize logistic regression in a similar way, by adding a penalty term to the cost function:

$$J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(h_\theta(x^{(i)})\right)+\left(1-y^{(i)}\right)\log\left(1-h_\theta(x^{(i)})\right)\right]+\frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$

The θ vector is indexed from 0 to n (holding n+1 values, θ0 through θn), and the regularization sum explicitly skips θ0.
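A minimal sketch of this cost (sigmoid and regularized_logistic_cost are my names; same X and θ conventions as before):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Regularized logistic regression cost: cross-entropy plus an L2 penalty.
def regularized_logistic_cost(theta, X, y, lam):
    m = len(y)
    h = sigmoid(X @ theta)                                 # h_theta(x^(i)) for every example
    cross_entropy = -(y @ np.log(h) + (1.0 - y) @ np.log(1.0 - h)) / m
    penalty = (lam / (2.0 * m)) * np.sum(theta[1:] ** 2)   # the sum skips theta_0
    return cross_entropy + penalty
```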

By the way, adding the regularization term does not make J(θ) non-convex: for regularized linear and logistic regression, J(θ) remains convex when λ > 0, so gradient descent with an appropriate learning rate α still converges to the global minimum.