Kaggle Getting Started Competition: Digit Recognizer
2017-03-22 12:29
Task summary: recognize handwritten digits. Each digit is represented as a 28*28 pixel matrix, i.e. 784 pixels, and each pixel value is between 0 and 255.
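To make the 28*28 = 784 layout concrete, here is a minimal sketch of turning one row of train.csv into a label plus a 28x28 grid. It is written for Python 3 and uses only the standard library. The helper name `parse_row` and the sample row are made up for illustration, and the sketch assumes the 784 values are stored row by row (pixel index = row * 28 + column).

```python
import csv
import io

SIDE = 28  # images are 28x28, so each row carries 28 * 28 = 784 pixel values


def parse_row(row):
    """Split one train.csv row into (label, 28x28 grid of ints)."""
    label = int(row[0])                      # first column is the digit label
    pixels = [int(v) for v in row[1:]]       # remaining columns are the pixels
    if len(pixels) != SIDE * SIDE:
        raise ValueError("expected %d pixels, got %d" % (SIDE * SIDE, len(pixels)))
    # Assumption: pixels are stored row by row, so slice 28 values per image row.
    grid = [pixels[r * SIDE:(r + 1) * SIDE] for r in range(SIDE)]
    return label, grid


# Hypothetical sample row: label 7 followed by 784 pixel values.
sample_pixels = [0] * (SIDE * SIDE)
sample_pixels[SIDE * 5 + 10] = 255           # image row 5, column 10
sample = "7," + ",".join(str(v) for v in sample_pixels)

label, grid = parse_row(next(csv.reader(io.StringIO(sample))))
print(label, len(grid), len(grid[0]), grid[5][10])  # prints: 7 28 28 255
```

For test.csv the same idea applies, except that there is no label column and all 784 values are pixels.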
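Approach: kNN performs fairly well on digit recognition. With this many feature dimensions kd_tree is slow, so I used kNN based on ball_tree. Every pixel value is normalized (in effect binarized): any non-zero value becomes 1, and 0 stays 0.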
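The "non-zero becomes 1" step can also be written as a single vectorized NumPy expression instead of a double loop over every pixel. The sketch below is one possible way to do it, not the code used in this post: the name `binarize` and the demo array are made up, it uses the conventional `np` alias, and it returns a new array rather than modifying its input.

```python
import numpy as np


def binarize(pixels):
    """Return a copy in which every non-zero pixel is 1 and zeros stay 0."""
    pixels = np.asarray(pixels)
    # (pixels != 0) is a boolean mask; casting it gives 0/1 integers.
    return (pixels != 0).astype(np.uint8)


# Made-up 2x3 example with values in the 0-255 range.
demo = np.array([[0, 12, 255],
                 [0, 0, 1]])
print(binarize(demo))
```

On an array the size of the training set this avoids tens of millions of Python-level loop iterations, which is likely to shorten the loading step noticeably.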
Tools: Python 2.7, sklearn, PyCharm
The run took about half an hour, after which I had my score.
# -*- coding: utf-8 -*-
# Written for Python 2.7, where the csv module expects files opened in binary mode.
import csv
import numpy as ny
from sklearn.neighbors import KNeighborsClassifier


def to_int(values):
    # Convert a list of numeric strings to ints, in place
    n = len(values)
    for i in range(n):
        values[i] = int(values[i])
    return values


# Normalize: set every non-zero pixel to 1
def normalize(array):
    n, m = array.shape
    for i in range(n):
        for j in range(m):
            if array[i, j] != 0:
                array[i, j] = 1
    return array


# Load the training set: the first column is the label, the rest are pixels
def load_train_data():
    train_data = []
    train_label = []
    with open('E:\\data\\kaggle\\digit recognizer\\train.csv', 'rb') as f:
        lines = csv.reader(f)
        header = True
        for line in lines:
            if header:
                # Skip the header row
                header = False
                continue
            train_label.append(int(line[0]))
            train_data.append(to_int(line[1:]))
    return normalize(ny.array(train_data)), ny.array(train_label)


# Load the test set: every column is a pixel
def load_test_data():
    test_data = []
    with open('E:\\data\\kaggle\\digit recognizer\\test.csv', 'rb') as f:
        lines = csv.reader(f)
        header = True
        for line in lines:
            if header:
                # Skip the header row
                header = False
                continue
            test_data.append(to_int(line))
    return normalize(ny.array(test_data))


def classify():
    train_data, train_label = load_train_data()
    test_data = load_test_data()
    neigh = KNeighborsClassifier(algorithm='ball_tree')
    neigh.fit(train_data, train_label)
    result = []
    result.append(('ImageId', 'Label'))
    i = 1
    # Predict one test image at a time; ImageId starts at 1
    for item in test_data:
        label = neigh.predict(ny.array(item).reshape((1, -1)))
        result.append((i, label[0]))
        i += 1
    with open('E:\\data\\kaggle\\digit recognizer\\result.csv', 'wb') as f:
        writer = csv.writer(f)
        writer.writerows(result)


classify()
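The script above calls `predict` once per test image. Since `predict` accepts a 2-D array of samples, the same ball_tree kNN can probably label the whole test set in one call. The sketch below shows the idea on a tiny made-up dataset (six training rows with four binary features each), not on the Kaggle data, and `n_neighbors=3` is picked only because the toy set is so small.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Tiny made-up binary "images" (4 features each), not the Kaggle data.
train_data = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
])
train_label = np.array([0, 0, 0, 1, 1, 1])
test_data = np.array([
    [1, 1, 0, 0],
    [0, 0, 1, 1],
])

# n_neighbors=3 suits this toy set; the script above keeps sklearn's default.
neigh = KNeighborsClassifier(n_neighbors=3, algorithm='ball_tree')
neigh.fit(train_data, train_label)

# One predict call for all rows; labels come back in the same order as the input.
labels = neigh.predict(test_data)

# Build the submission rows, numbering ImageId from 1 as the script above does.
result = [('ImageId', 'Label')]
result.extend((i, int(label)) for i, label in enumerate(labels, start=1))
print(result)
```

The resulting `result` list has the same shape as the one the script writes with `csv.writer`, so the output step would not need to change.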