Machine Learning Foundations - Learning to Answer Yes/No
2018-02-28 22:37
Machine Learning Foundations: Mathematical Foundations
Hsuan-Tien Lin (林轩田), Associate Professor, Computer Science and Information Engineering
Learning to Answer Yes/No
Recap: the algorithm A takes D and H to produce g.
Perceptron Hypothesis Set
Combine the features, each suitably weighted, into a single score.
x = (x_1, x_2, …, x_d): the features of the customer
Multiply each feature (dimension) by its weight and add them up:
approve credit if Σ_{i=1}^{d} w_i x_i > threshold
deny credit if Σ_{i=1}^{d} w_i x_i < threshold
Y: {+1 (good), −1 (bad)}; a score of exactly 0 is ignored
The linear hypotheses h ∈ H have the form h(x) = sign((Σ_{i=1}^{d} w_i x_i) − threshold)
Simplify by letting w_0 = −threshold and x_0 = 1:
h(x) = sign(Σ_{i=0}^{d} w_i x_i) = sign(wᵀx) (a vector inner product)
Each w represents a hypothesis h; different weights correspond to different functions.
Perceptron in R²
A two-dimensional perceptron.
Different weights (classifiers) give different results.
Setting wᵀx = 0 gives a line as the decision boundary, so the perceptron is a linear classifier.
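As an illustration, the hypothesis h(x) = sign(wᵀx) takes only a few lines. This is a minimal sketch; the function name and the credit-style numbers are made up for the example:

```python
import numpy as np

def perceptron_h(w, x):
    """Perceptron hypothesis h(x) = sign(w^T x).

    Assumes x already carries the bias coordinate x_0 = 1, so that
    w_0 plays the role of -threshold. A score of exactly 0 is mapped
    to -1 here; the lecture simply ignores that boundary case.
    """
    return 1 if np.dot(w, x) > 0 else -1

# Hypothetical example: threshold 2 (so w_0 = -2), unit feature weights.
w = np.array([-2.0, 1.0, 1.0])
x_good = np.array([1.0, 3.0, 1.0])  # score 3 + 1 - 2 =  2 > 0
x_bad = np.array([1.0, 0.0, 1.0])   # score 0 + 1 - 2 = -1 < 0
print(perceptron_h(w, x_good), perceptron_h(w, x_bad))  # 1 -1
```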
Perceptron Learning Algorithm (PLA)
H includes all possible perceptrons (infinitely many): how do we select g?
Keywords: want, necessary, difficult, idea
What we want: g ≈ f (hard, because f is unknown)
What is feasible: on the known data, ideally making g(x_n) = f(x_n) = y_n for every n
Start from some line g_0 (with weights w_0) and gradually correct the weights
Steps
The sign of an inner product can be judged from the angle between the two vectors.
Each correction of the weight vector changes that angle.
A fault confessed is half redressed.
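The correction step described above is w_{t+1} = w_t + y_n x_n. A sketch with a hypothetical two-dimensional example:

```python
import numpy as np

def pla_update(w, x, y):
    """One PLA correction: w_{t+1} = w_t + y_n * x_n.

    For y = +1 the update pulls w toward x (shrinking the angle);
    for y = -1 it pushes w away from x. A single update does not
    always fix the point immediately, though it does here.
    """
    return w + y * x

w = np.array([1.0, -1.0])
x = np.array([0.5, 1.0])   # a positive point, currently misclassified
y = 1
assert np.sign(np.dot(w, x)) != y   # w.x = -0.5: a mistake
w = pla_update(w, x, y)             # w becomes (1.5, 0.0)
assert np.sign(np.dot(w, x)) == y   # w.x = 0.75: corrected
```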
Cyclic PLA
halt after a full cycle of not encountering any mistake
‘correct’ mistakes on D D until no mistakes
find the next mistake: follow naive cycle or precomputed random cycle
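Putting the pieces together, a cyclic PLA sketch (function and variable names are my own; it halts only when the data are linearly separable):

```python
import numpy as np

def cyclic_pla(X, y, max_iters=10000):
    """Cyclic PLA: scan the points in a fixed order, correct each
    mistake as it is met, and halt after one full mistake-free cycle.

    X: (N, d+1) array whose first column is the bias coordinate 1.
    y: (N,) array of labels in {+1, -1}.
    """
    w = np.zeros(X.shape[1])
    N = len(y)
    n, clean_streak = 0, 0
    for _ in range(max_iters):
        if np.sign(np.dot(w, X[n])) != y[n]:  # sign(0)=0 also counts as a mistake
            w = w + y[n] * X[n]               # correct it
            clean_streak = 0
        else:
            clean_streak += 1
            if clean_streak == N:             # a full cycle with no mistakes
                return w
        n = (n + 1) % N
    return w  # gave up: the data may not be separable

# A tiny made-up separable set: positives have x_1 = 2, negatives x_1 = 0.
X = np.array([[1., 2., 2.], [1., 0., 0.], [1., 2., 0.], [1., 0., -1.]])
y = np.array([1, -1, 1, -1])
w = cyclic_pla(X, y)
assert all(np.sign(X @ w) == y)  # no mistakes remain
```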
Remaining questions
Will the loop always terminate?
Is the resulting g actually close to the target f?
How well does it perform on data outside D?
Quiz
Pay attention to the second option.
Guarantee of PLA
If PLA halts (no more mistakes), then necessarily D allows some w to make no mistakes.
Call such a D linearly separable (线性可分).
D is linearly separable ⇔ there exists a perfect w_f such that y_n = sign(w_fᵀ x_n) for every n.
Proof, part 1
The inner-product operation is carried out via matrix multiplication (w_fᵀ w_t).
w_t gets more aligned with w_f, because each update increases the inner product w_fᵀ w_t.
(Uses the given equations from the update rule.)
Proof, part 2
w_t does not grow too fast: each update increases its squared length by a bounded amount.
Hence the angle between w_t and w_f keeps shrinking, bounded below by 0 degrees.
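The two parts combine into the standard PLA convergence bound. A sketch of the derivation, with ρ the margin of the perfect line w_f and R the radius of the data:

```latex
% Let \rho = \min_n \frac{y_n w_f^T x_n}{\|w_f\|} > 0 (the separability margin)
% and R^2 = \max_n \|x_n\|^2.
%
% Part 1: each update on a mistaken (x_n, y_n) grows the inner product:
w_f^T w_{t+1} = w_f^T (w_t + y_n x_n) \ge w_f^T w_t + \rho \|w_f\|
%
% Part 2: updates occur only on mistakes (y_n w_t^T x_n \le 0), so the
% squared length grows by at most R^2 per update:
\|w_{t+1}\|^2 = \|w_t\|^2 + 2 y_n w_t^T x_n + \|x_n\|^2 \le \|w_t\|^2 + R^2
%
% Starting from w_0 = 0, after T updates the cosine of the angle satisfies
\frac{w_f^T w_T}{\|w_f\|\,\|w_T\|}
  \ge \frac{T \rho \|w_f\|}{\|w_f\| \sqrt{T} R}
  = \sqrt{T}\,\frac{\rho}{R} \le 1
\quad\Longrightarrow\quad T \le \frac{R^2}{\rho^2}
```

Since a cosine is at most 1, the number of updates T is finite whenever ρ > 0, which is exactly the linear-separability assumption.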
Quiz
Non-Separable Data
Linearly separable: the inner product of w_f and w_t grows fast (the two vectors get ever closer in direction).
Correct by mistake: the length of w_t grows slowly.
PLA ‘lines’ are more and more aligned with w f wf ⇒ halts
Pros: simple to implement, fast, works in any dimension
Cons
‘assumes’ a linearly separable D in order to halt (linear separability is only an assumption)
not fully sure how long halting takes (the number of updates needed is unknown in advance)
Learning with Noisy Data
Find the line that makes the fewest mistakes:
w_g = argmin_w Σ_{n=1}^{N} [ y_n ≠ sign(wᵀ x_n) ]
The brackets [·] denote a Boolean test: 1 if the condition inside holds, 0 otherwise.
argmin_w f(w) denotes the argument(s) w at which f(w) attains its minimum.
NP-hard to solve
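Note that the 0/1 error inside the argmin is easy to evaluate for any single w; what is NP-hard is searching over all w. A sketch of the error count, with a made-up non-separable set:

```python
import numpy as np

def num_mistakes(w, X, y):
    """Count the 0/1 errors: how many n have y_n != sign(w^T x_n)."""
    return int(np.sum(np.sign(X @ w) != y))

# Hypothetical 1-D data (first column is the bias). The labels +, -, -
# at x_1 = 1, -1, 2 cannot all be satisfied by one line, so the
# minimum of the argmin objective is at least 1.
X = np.array([[1., 1.], [1., -1.], [1., 2.]])
y = np.array([1, -1, -1])
print(num_mistakes(np.array([0., 1.]), X, y))   # 1: only the last point is wrong
```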
Pocket Algorithm
Modify the PLA algorithm by keeping the best weights seen so far in the pocket (always keep the best line found up to now).
Algorithm
A simple modification of PLA to find (somewhat) ‘best’ weights.
On a linearly separable dataset, Pocket also finds a perfect separating line, but it is slower than PLA.
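A sketch of the Pocket algorithm under those rules (names are my own; the fixed seed just makes the random mistake choices reproducible):

```python
import random
import numpy as np

def pocket(X, y, max_updates=100):
    """Pocket algorithm: run PLA-style corrections on random mistakes,
    but keep the best weights seen so far (fewest mistakes on D) in
    the 'pocket'. Returns a usable line even on non-separable data.
    """
    def mistakes(w):
        return int(np.sum(np.sign(X @ w) != y))

    rng = random.Random(0)                   # reproducible mistake choices
    w = np.zeros(X.shape[1])
    pocket_w, pocket_err = w.copy(), mistakes(w)
    for _ in range(max_updates):
        wrong = [n for n in range(len(y)) if np.sign(np.dot(w, X[n])) != y[n]]
        if not wrong:                        # perfect line: nothing left to pocket
            return w
        n = rng.choice(wrong)                # correct one random mistake
        w = w + y[n] * X[n]
        err = mistakes(w)
        if err < pocket_err:                 # better than the pocketed line? swap
            pocket_w, pocket_err = w.copy(), err
    return pocket_w

# On linearly separable data Pocket also ends up with a perfect line.
X = np.array([[1., 2., 2.], [1., 0., 0.], [1., 2., 0.], [1., 0., -1.]])
y = np.array([1, -1, 1, -1])
w = pocket(X, y)
assert int(np.sum(np.sign(X @ w) != y)) == 0
```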