
the problem of overfitting



underfitting or high bias—hypothesis function h maps poorly to the trend of the data.

usually caused by a function that is too simple or uses too few features.

overfitting or high variance—fits the available data but does not generalize well to predict new data.

usually caused by a complicated function that creates a lot of unnecessary curves and angles unrelated to the data.
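For intuition, here is a toy numpy sketch (mine, not from the original notes; the data and degrees are made up) showing both failure modes: a degree-1 fit underfits a quadratic trend, while a degree-9 fit drives the training error toward zero yet does worse on held-out points.

```python
import numpy as np

# Toy illustration: fit polynomials of increasing degree to noisy samples
# of a quadratic trend and compare training error with held-out error.
rng = np.random.default_rng(0)

def true_trend(x):
    return 1.0 + 2.0 * x - 3.0 * x ** 2

x_train = np.linspace(0.0, 1.0, 15)
y_train = true_trend(x_train) + 0.1 * rng.standard_normal(x_train.shape)
x_test = np.linspace(0.0, 1.0, 100)
y_test = true_trend(x_test)

for degree in (1, 2, 9):
    coeffs = np.polyfit(x_train, y_train, degree)   # least-squares polynomial fit
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE = {train_mse:.4f}, test MSE = {test_mse:.4f}")
```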

to address overfitting:

1) Reduce the number of features:

1. Manually select which features to keep.

2. Use a model selection algorithm.

2) Regularization:

1. Keep all the features, but reduce the magnitude of the parameters θj.

2. Regularization works well when we have a lot of slightly useful features.

Regularization:

1.regularized linear regression

Without actually getting rid of any features or changing the form of our hypothesis, we can instead modify our cost function:

$$\min_\theta \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2+\lambda\sum_{j=1}^{n}\theta_j^2\right]$$

The λ is the regularization parameter; it determines how strongly the θ parameters are penalized.

If λ is chosen to be too large, it may smooth out the function too much and cause underfitting.



As a result, the new hypothesis (the pink curve in the original figure, omitted here) looks like a quadratic function but fits the data better, because the penalized higher-order terms now have very small θ values.
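As a minimal sketch of the cost above (the function name regularized_cost and the convention that X carries a leading column of ones are my assumptions, not the course's):

```python
import numpy as np

# Regularized linear regression cost. theta[0] (the intercept) is left
# out of the penalty, matching the sum from j = 1 to n in the formula.
def regularized_cost(theta, X, y, lam):
    m = len(y)
    residuals = X @ theta - y                  # h_theta(x^(i)) - y^(i) for every example
    penalty = lam * np.sum(theta[1:] ** 2)     # lambda * sum_{j>=1} theta_j^2
    return (np.sum(residuals ** 2) + penalty) / (2.0 * m)
```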



The gradient descent update for θj (j ≥ 1) under this cost can be written as

$$\theta_j := \theta_j\left(1-\alpha\frac{\lambda}{m}\right)-\alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)}$$

Since $(1-\alpha\frac{\lambda}{m})<1$, each update shrinks the parameter a little bit before doing the same gradient step as in unregularized linear regression.
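One such update step might look like the following sketch (same conventions as above; gradient_step is my name, not the course's):

```python
import numpy as np

# One gradient descent step for regularized linear regression.
# theta_0 is updated without the shrinkage factor, since it is not regularized.
def gradient_step(theta, X, y, alpha, lam):
    m = len(y)
    grad = X.T @ (X @ theta - y) / m                   # (1/m) * sum (h - y) * x_j
    new_theta = theta * (1.0 - alpha * lam / m) - alpha * grad
    new_theta[0] = theta[0] - alpha * grad[0]          # no shrinkage on theta_0
    return new_theta
```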



Using regularization also takes care of non-invertibility issues with the $X^TX$ matrix in the normal equation:

$$\theta=\left(X^TX+\lambda\cdot L\right)^{-1}X^Ty$$

where L is the (n+1)×(n+1) matrix with a 0 in the top-left corner and 1's down the rest of the diagonal.

If m ≤ n, then $X^TX$ may be non-invertible. However, when we add the term λ⋅L (with λ > 0), $X^TX+\lambda\cdot L$ becomes invertible.
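A sketch of this regularized normal equation, assuming the same X layout as before (normal_equation is my name):

```python
import numpy as np

# L is the identity with its top-left entry zeroed, so theta_0 is not penalized.
def normal_equation(X, y, lam):
    n_plus_1 = X.shape[1]          # number of parameters, including the intercept
    L = np.eye(n_plus_1)
    L[0, 0] = 0.0
    # Solve (X^T X + lambda * L) theta = X^T y rather than forming an explicit inverse.
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)
```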

2.regularized logistic regression

We can regularize logistic regression in a similar way, by adding a penalty term to the cost function:

$$J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(h_\theta(x^{(i)})\right)+\left(1-y^{(i)}\right)\log\left(1-h_\theta(x^{(i)})\right)\right]+\frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$

The θ vector is indexed from 0 to n (holding n+1 values, θ0 through θn), and the regularization sum explicitly skips θ0.
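A minimal sketch of this cost (sigmoid and regularized_logistic_cost are my names; same X and θ conventions as before):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Regularized logistic regression cost: cross-entropy plus an L2 penalty.
def regularized_logistic_cost(theta, X, y, lam):
    m = len(y)
    h = sigmoid(X @ theta)                                 # h_theta(x^(i)) for every example
    cross_entropy = -(y @ np.log(h) + (1.0 - y) @ np.log(1.0 - h)) / m
    penalty = (lam / (2.0 * m)) * np.sum(theta[1:] ** 2)   # the sum skips theta_0
    return cross_entropy + penalty
```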

By the way, adding the regularization term does not make J(θ) non-convex: for regularized linear and logistic regression, J(θ) remains convex when λ > 0, so gradient descent with an appropriate learning rate α still converges to the global minimum.