Machine learning in coding (Python): merging multiple tables on a key (building combined features)
2015-08-02 17:14
There are three tables: train_set.csv, test_set.csv, and feature.csv. They are linked through the object_id column.
import pandas as pd
import numpy as np

# load training and test datasets
train = pd.read_csv('../input/train_set.csv')
test = pd.read_csv('../input/test_set.csv')
features = pd.read_csv('../input/feature.csv')

# attach the feature columns to each row via the shared object_id key
train = pd.merge(train, features, on='object_id', how='inner')
test = pd.merge(test, features, on='object_id', how='inner')

# drop useless columns and create labels
test = test.drop(['id', 'object_id'], axis=1)
labels = train.cost.values
train = train.drop(['object_id', 'cost'], axis=1)

# convert data to numpy array
train = np.array(train)
test = np.array(test)

Source: Kaggle
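To see what the inner merge on object_id does to the row count, here is a minimal sketch that runs without the CSV files. The tiny in-memory tables and the column names quantity and weight are made up for illustration; only object_id, id, and cost come from the snippet above. It also shows how="left" as one possible alternative, which is not what the original code uses.

```python
import io

import pandas as pd

# Hypothetical stand-ins for train_set.csv, test_set.csv and feature.csv.
# The 'quantity' and 'weight' columns are assumptions for this example.
train_csv = io.StringIO("object_id,quantity,cost\nA,1,10.0\nB,2,20.0\nC,3,30.0\n")
test_csv = io.StringIO("id,object_id,quantity\n1,A,5\n2,D,6\n")
feature_csv = io.StringIO("object_id,weight\nA,0.5\nB,0.7\nD,0.9\n")

train = pd.read_csv(train_csv)
test = pd.read_csv(test_csv)
features = pd.read_csv(feature_csv)

# Inner join: only rows whose object_id exists in BOTH tables survive.
# Object C has no entry in the feature table, so its training row is dropped.
train_merged = pd.merge(train, features, on="object_id", how="inner")
test_merged = pd.merge(test, features, on="object_id", how="inner")

# Left join: every training row is kept; missing features become NaN.
train_left = pd.merge(train, features, on="object_id", how="left")

# Separate the label from the inputs, then drop the key columns.
labels = train_merged["cost"].values
X_train = train_merged.drop(["object_id", "cost"], axis=1).to_numpy()
X_test = test_merged.drop(["id", "object_id"], axis=1).to_numpy()

print(len(train), len(train_merged), len(train_left))
```

With an inner join, rows silently disappear when a key is missing from the feature table, so comparing the row counts before and after the merge is a cheap sanity check. If every test row must receive a prediction, a left join followed by explicit handling of the missing values may be the safer choice.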