[Repost] All Machine Learning Models Have Flaws (by John Langford)
2010-08-10 21:30
Attempts to abstract and study machine learning are within some given framework or mathematical model. It turns out that all of these models are significantly flawed for the purpose of studying machine learning. I’ve created a table (below) outlining the major flaws in some common models of machine learning.
The point here is not simply “woe unto us”. There are several implications which seem important.
1. The multitude of models is a point of continuing confusion. It is common for people to learn about machine learning within one framework, which often becomes their “home framework” through which they attempt to filter all machine learning. (Have you met people who can only think in terms of kernels? Only via Bayes’ Law? Only via PAC Learning?) Explicitly understanding the existence of these other frameworks can help resolve the confusion. This is particularly important when reviewing and particularly important for students.
2. Algorithms which conform to multiple approaches can have substantial value: “I don’t really understand it yet, because I only understand it one way.” Reinterpretation alone is not the goal; we want algorithmic guidance.
3. We need to remain constantly open to new mathematical models of machine learning. It’s common to forget the flaws of the model that you are most familiar with when evaluating other models, while the flaws of new models get exaggerated. The best way to avoid this is simply education.
4. The value of theory alone is more limited than many theoreticians may be aware. Theories need to be tested to see if they correctly predict the underlying phenomena.
Here is a summary of what is wrong with various frameworks for learning. To avoid being entirely negative, I added a column about what’s right as well.