
Machine Learning Foundations - Learning to Answer Yes/No

2018-02-28 22:37
Machine Learning Foundations (Part 1): Mathematical Foundations

Hsuan-Tien Lin (林轩田), Associate Professor, Department of Computer Science and Information Engineering

Learning to Answer Yes/No

$\mathcal{A}$ takes $\mathcal{D}$ and $\mathcal{H}$ to get $g$

Perceptron Hypothesis Set

Combine the various features to produce a single score.

$x = (x_1, x_2, \cdots, x_d)$: the features of the customer

Multiply each feature (dimension) by its corresponding weight, then sum.

approve credit if $\sum_{i=1}^{d} w_i x_i > \text{threshold}$

deny credit if $\sum_{i=1}^{d} w_i x_i < \text{threshold}$

$\mathcal{Y}: \{+1(\text{good}), -1(\text{bad})\}$ (0 ignored)

linear formula $h \in \mathcal{H}$: $h(x) = \text{sign}\left(\left(\sum_{i=1}^{d} w_i x_i\right) - \text{threshold}\right)$

To simplify, let $w_0 = -\text{threshold}$ and $x_0 = 1$:

$h(x) = \text{sign}\left(\sum_{i=0}^{d} w_i x_i\right) = \text{sign}(w^T x)$ (a vector inner product)

each $w$ represents a hypothesis $h$; different parameters give different functions
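
As a concrete illustration, here is a minimal Python sketch of this hypothesis (the weight and feature values are made up for the example; $x$ is augmented with $x_0 = 1$ as above):

```python
import numpy as np

def perceptron_hypothesis(w, x):
    """h(x) = sign(w^T x); x is augmented so that x[0] = 1,
    and w[0] = -threshold (the simplification above)."""
    return 1 if np.dot(w, x) > 0 else -1   # the boundary case 0 is ignored

# hypothetical customer with two features, threshold 0.5
w = np.array([-0.5, 1.0, 1.0])   # w0 = -threshold
x = np.array([1.0, 0.2, 0.7])    # x0 = 1, then the feature values
print(perceptron_hypothesis(w, x))  # +1: the score 0.9 exceeds 0.5
```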

Perceptron in $\mathbb{R}^2$

The perceptron in two dimensions



Different classifiers (different parameters) give different results.

Setting $h(x) = 0$, i.e. $w^T x = 0$, gives a line as the geometric picture: the perceptron is a linear classifier.

Perceptron Learning Algorithm (PLA)

$\mathcal{H}$ includes all possible perceptrons (infinitely many); how do we select $g$?

want, necessary, difficult, idea



what we want: $g \approx f$ (hard when $f$ is unknown)

What is feasible: on the known data, ideally make $g(x_n) = f(x_n) = y_n$.

Start with an initial line $g_0$, represented by a weight vector $w_0$, then gradually improve it by correcting the weights.

Steps: for $t = 0, 1, \ldots$, find a mistake of $w_t$, i.e. an example $(x_{n(t)}, y_{n(t)})$ with $\text{sign}(w_t^T x_{n(t)}) \neq y_{n(t)}$, and correct it by updating $w_{t+1} \leftarrow w_t + y_{n(t)}\, x_{n(t)}$; repeat until no more mistakes remain.



Whether an inner product is positive or negative can be judged from the angle between the two vectors.

Correcting the weight vector changes that angle.

A fault confessed is half redressed.
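
One line of algebra shows why the correction helps: after the update $w_{t+1} = w_t + y_n x_n$ on a mistaken example $(x_n, y_n)$, and using $y_n^2 = 1$,

$$y_n\, w_{t+1}^T x_n = y_n (w_t + y_n x_n)^T x_n = y_n\, w_t^T x_n + \|x_n\|^2 > y_n\, w_t^T x_n,$$

so the signed score on that example strictly increases, i.e. $w$ rotates toward the correct side of $x_n$.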

Cyclic PLA

halt after a full cycle through the data that encounters no mistakes

‘correct’ mistakes on $\mathcal{D}$ until there are no mistakes

find the next mistake by following a naive cycle over the data or a precomputed random cycle
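
A minimal sketch of Cyclic PLA under these rules, assuming `X` is already augmented with $x_0 = 1$ (the function and variable names here are mine, not from the course):

```python
import numpy as np

def cyclic_pla(X, y, max_cycles=1000):
    """Cyclic PLA: sweep D in a fixed order, correct every mistake found,
    and halt after a full cycle with no mistakes.
    X: (N, d+1) array with X[:, 0] == 1;  y: (N,) array of +/-1 labels."""
    w = np.zeros(X.shape[1])
    for _ in range(max_cycles):
        mistakes = 0
        for n in range(len(X)):              # the naive cycle over D
            if np.sign(X[n] @ w) != y[n]:    # found a mistake
                w = w + y[n] * X[n]          # correct it: w <- w + y_n * x_n
                mistakes += 1
        if mistakes == 0:                    # a full cycle of no mistakes
            return w
    return w  # may not have converged (e.g. D not linearly separable)
```

For a ‘precomputed random cycle’, one would visit the indices in a shuffled order, e.g. `np.random.permutation(len(X))`, fixed once before the sweeps.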

Open questions

Does the loop always halt?

Is the resulting $g$ really close to the intended $f$?

How does it perform on data outside $\mathcal{D}$?

Quiz


Note the second option.

Guarantee of PLA

if PLA halts (no more mistakes)

(necessary condition) $\mathcal{D}$ allows some $w$ to make no mistake

call such $\mathcal{D}$ linearly separable

linearly separable $\mathcal{D}$ ⇔ there exists a perfect $w_f$ such that $y_n = \text{sign}(w_f^T x_n)$

Proof 1



The vector inner product is written as a matrix multiplication (e.g. $w_f^T w_t$).

$w_t$ gets more aligned with $w_f$ (because their inner product grows)
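
Reconstructing the standard derivation the missing figure showed: let $(x_{n(t)}, y_{n(t)})$ be the mistake corrected at step $t$; by separability $\min_n y_n w_f^T x_n > 0$, so

$$w_f^T w_{t+1} = w_f^T \left(w_t + y_{n(t)}\, x_{n(t)}\right) = w_f^T w_t + y_{n(t)}\, w_f^T x_{n(t)} \geq w_f^T w_t + \min_n y_n w_f^T x_n > w_f^T w_t.$$

On its own a growing inner product does not prove alignment; the length of $w_t$ must also be controlled, which is Proof 2.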

Known facts: the update happens only on a mistake, so $y_{n(t)}\, w_t^T x_{n(t)} \leq 0$, and the update rule is $w_{t+1} = w_t + y_{n(t)}\, x_{n(t)}$.



Proof 2



$w_t$ does not grow too fast (the increase in its length per update is bounded)
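
Again reconstructing the slide's derivation, using the known facts above ($y_{n(t)}\, w_t^T x_{n(t)} \leq 0$ and $y_{n(t)}^2 = 1$):

$$\|w_{t+1}\|^2 = \|w_t + y_{n(t)}\, x_{n(t)}\|^2 = \|w_t\|^2 + 2\, y_{n(t)}\, w_t^T x_{n(t)} + \|x_{n(t)}\|^2 \leq \|w_t\|^2 + \max_n \|x_n\|^2,$$

so the squared length grows by at most $R^2 = \max_n \|x_n\|^2$ per update.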

The angle between $w_t$ and $w_f$ gets smaller and smaller, and it is bounded below by 0 degrees.
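
Combining the two proofs, starting from $w_0 = 0$: after $T$ updates, $w_f^T w_T \geq T\rho$ with $\rho = \min_n y_n w_f^T x_n$, while $\|w_T\| \leq \sqrt{T}\, R$. Therefore

$$\frac{w_f^T w_T}{\|w_f\|\, \|w_T\|} \geq \frac{T\rho}{\|w_f\|\, \sqrt{T}\, R} = \sqrt{T} \cdot \frac{\rho}{\|w_f\|\, R}.$$

A cosine is at most 1, so $T \leq \dfrac{\|w_f\|^2 R^2}{\rho^2}$: the number of corrections is finite, which answers the halting question above.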

Quiz



Non-Separable Data

linearly separable: the inner product of $w_f$ and $w_t$ grows fast (the two get closer and closer in direction)

correct by mistake: the length of $w_t$ grows slowly

PLA ‘lines’ are more and more aligned with $w_f$ ⇒ it halts

Pros: simple to implement, fast, works in any dimension

Cons

‘assumes’ a linearly separable $\mathcal{D}$ to halt (linear separability is only an assumption)

not fully sure how long halting takes (we cannot know in advance when it will stop, since the bound involves the unknown $w_f$)

Learning with Noisy Data

Find a line that makes the fewest mistakes.

Formula:

$w_g \leftarrow \underset{w}{\operatorname{argmin}} \sum_{n=1}^{N} \big[\!\big[\, y_n \neq \text{sign}(w^T x_n) \,\big]\!\big]$



The double brackets denote a Boolean test (1 if the condition inside is true, 0 otherwise).

$\operatorname{argmin} f(x)$ denotes the set of arguments $x$ at which the function $f(x)$ attains its minimum.

NP-hard to solve

Pocket Algorithm

modify the PLA algorithm (the black lines in the slide) by keeping the best weights in a pocket (always keep the best weights seen so far)

Algorithm: run PLA-style corrections on random mistakes; after each correction, compare the new weights against the pocket weights on all of $\mathcal{D}$, keep whichever makes fewer mistakes, and return the pocket weights at the end.



a simple modification of PLA to find (somewhat) ‘best’ weights

On a linearly separable dataset, Pocket can also find a perfect solution, but it runs slower than PLA.
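
A minimal sketch of the pocket idea, reusing the conventions of the PLA sketch above (the names are mine, not from the course):

```python
import numpy as np

def count_mistakes(w, X, y):
    """How many examples in D the weights w misclassify."""
    return int(np.sum(np.sign(X @ w) != y))

def pocket(X, y, max_updates=1000, seed=None):
    """Pocket algorithm: PLA-style corrections on random mistakes, while
    keeping the best weights seen so far (on all of D) in the 'pocket'."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    w_pocket, best = w.copy(), count_mistakes(w, X, y)
    for _ in range(max_updates):
        wrong = np.flatnonzero(np.sign(X @ w) != y)
        if wrong.size == 0:                  # perfect on D: done early
            return w
        n = rng.choice(wrong)                # pick a random mistake
        w = w + y[n] * X[n]                  # the usual PLA correction
        m = count_mistakes(w, X, y)          # full pass over D after each update
        if m < best:                         # better than the pocket? swap it in
            w_pocket, best = w.copy(), m
    return w_pocket
```

The full pass over $\mathcal{D}$ after every update is exactly why Pocket runs slower than plain PLA on separable data.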
