
The k-Nearest Neighbors Algorithm (KNN), with Python 3 Example Code

2016-12-04 17:59
I just finished reading about the KNN algorithm in "Machine Learning in Action".

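The k-nearest neighbors algorithm (kNN, k-Nearest Neighbors) computes the distance from the input point to every sample and selects the k closest ones. The input is then assigned to the class that most of those k samples belong to. It has the following characteristics: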

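1. Simple; no training step is required.

2. When the classes have unequal numbers of samples, the "nearest neighbors, majority wins" rule clearly favors the class with more samples.
3. The distance to every sample must be computed, so the computational cost is high.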
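
To make points 2 and 3 concrete, here is a minimal sketch in plain Python. It is not the book's code: the function name knn_predict and the sample data are made up for illustration. The query point sits right next to the two 'A' samples, yet once k exceeds the size of the small class, the larger class 'B' wins the vote.

```python
from collections import Counter
import math


def knn_predict(query, samples, labels, k):
    """Majority vote among the k samples closest to query (hypothetical helper)."""
    # One distance per stored sample: this is the per-query cost from point 3.
    dists = [math.sqrt(sum((q - s) ** 2 for q, s in zip(query, sample)))
             for sample in samples]
    # Indices of the k smallest distances.
    nearest = sorted(range(len(samples)), key=lambda i: dists[i])[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Made-up imbalanced data: 2 samples of class 'A', 6 samples of class 'B'.
samples = [(1.0, 1.0), (1.1, 1.0),
           (3.0, 3.0), (3.1, 3.0), (3.0, 3.1),
           (3.2, 3.1), (3.1, 3.2), (3.2, 3.2)]
labels = ['A', 'A', 'B', 'B', 'B', 'B', 'B', 'B']
query = (1.2, 1.2)

print(knn_predict(query, samples, labels, 3))  # both 'A' samples are among the 3 nearest
print(knn_predict(query, samples, labels, 5))  # 2 votes for 'A', 3 votes for 'B'
```

With k=3 the result is 'A'; with k=5 it flips to 'B', even though every 'B' sample is much farther away. Keeping k small relative to the smallest class is one simple way to limit this effect.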

The first example from the book is shown below. The original is written in Python 2; here it is ported to Python 3 (only one line of code was changed):

'''
First example of a KNN classifier
'''
from numpy import array, tile
import operator


def createDataSet():
    # Four 2-D training points and their class labels
    group = array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
    labels = ['A', 'A', 'B', 'B']
    return group, labels


def classify0(inX, dataSet, labels, k):
    dataSetSize = dataSet.shape[0]
    # Euclidean distance from inX to every row of dataSet
    diffMat = tile(inX, (dataSetSize, 1)) - dataSet
    sqDiffMat = diffMat ** 2
    sqDistances = sqDiffMat.sum(axis=1)
    distances = sqDistances ** 0.5
    # Sample indices ordered from nearest to farthest
    sortedDistIndicies = distances.argsort()
    # Count the labels of the k nearest samples
    classCount = {}
    for i in range(k):
        voteIlabel = labels[sortedDistIndicies[i]]
        classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1
    # Python 3 change: dict.iteritems() became dict.items()
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
    return sortedClassCount[0][0]


if __name__ == '__main__':
    print('dataset - labels')
    print(createDataSet())
    group, labels = createDataSet()
    label = classify0([1, 1.3], group, labels, 3)
    print(label)
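
As a side note, the same distance-and-vote logic can be written a little more compactly. The sketch below is my own variation, not the book's code: it assumes NumPy broadcasting in place of tile and collections.Counter in place of the hand-built dictionary, and the function name classify_broadcast is made up.

```python
from collections import Counter

import numpy as np


def classify_broadcast(in_x, data_set, labels, k):
    """Same idea as classify0, using broadcasting and Counter (hypothetical variant)."""
    # Subtracting a 1-D point from a 2-D array broadcasts over the rows,
    # so no explicit tile() is needed.
    diff = np.asarray(data_set, dtype=float) - np.asarray(in_x, dtype=float)
    distances = np.sqrt((diff ** 2).sum(axis=1))
    # Indices of the k nearest samples
    nearest = distances.argsort()[:k]
    # Majority vote over their labels
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


group = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
labels = ['A', 'A', 'B', 'B']
print(classify_broadcast([1, 1.3], group, labels, 3))  # prints A
```

For the query [1, 1.3] and k=3, the two nearest samples are both 'A' and the third is 'B', so the vote gives 'A'.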