
Mahout k-means example

2015-12-01 12:41
I. Testing a simple Mahout example

For installing and configuring Mahout, see: Mahout installation and configuration

1. Source of the test data for the k-means clustering algorithm:

URL: http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data
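Before uploading, it may be worth checking that the download is complete. The sketch below assumes, based on the UCI description of this dataset, that the file holds 600 rows of 60 whitespace-separated values each; `check_shape` is a hypothetical helper name, not part of Mahout or Hadoop.

```shell
# Hypothetical helper: succeed only if FILE ($1) has exactly ROWS ($2)
# non-empty lines, each with COLS ($3) whitespace-separated fields.
# The 600 x 60 shape used in the usage note is an assumption taken from
# the UCI dataset description.
check_shape() {
  awk -v rows="$2" -v cols="$3" '
    NF == 0    { next }        # ignore blank lines
    NF != cols { bad = 1 }     # this row has the wrong number of values
               { n++ }         # count non-empty rows
    END { exit (bad || (n + 0) != (rows + 0)) ? 1 : 0 }
  ' "$1"
}
```

For the local path used in the upload step of this post, the call would be `check_shape /home/lin/hadoop/mahout-distribution-0.10.0/test.data 600 60 && echo "shape OK"`.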

2. Download the data and store it on HDFS (Hadoop 2.6.1 is already running)

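Create the test directory testdata and load the data into it. The directory must be named testdata: the example job in step 3 is run without arguments, and in that mode it reads its input from this default path.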

$ hdfs dfs -mkdir testdata

$ hdfs dfs -put /home/lin/hadoop/mahout-distribution-0.10.0/test.data  testdata
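Since the job depends on the directory name, a small pre-flight check can catch a misnamed directory before the job is submitted. This is a minimal sketch: `require_input_dir` is a hypothetical helper name, and it relies only on the standard `hdfs dfs -test -d` check.

```shell
# Hypothetical pre-flight check: verify that the HDFS input directory exists.
# Defaults to "testdata", the name this post says the example job expects.
require_input_dir() {
  dir="${1:-testdata}"
  if hdfs dfs -test -d "$dir"; then
    echo "input directory found: $dir"
  else
    echo "missing HDFS directory: $dir" >&2
    return 1
  fi
}
```

Calling `require_input_dir` with no argument checks testdata relative to the current user's HDFS home directory.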


3. Run the k-means algorithm and wait for it to finish

$ hadoop jar /home/lin/hadoop/mahout-distribution-0.10.0/mahout-examples-0.10.0-job.jar org.apache.mahout.clustering.syntheticcontrol.kmeans.Job


4. Once the job has succeeded, inspect the output

$ hdfs dfs -ls output


Output like the following shows that the run succeeded:

lin@lin162:~/hadoop/hadoop-2.6.1/etc/hadoop$ hdfs dfs -ls output

Found 15 items

-rw-r--r--   2 lin supergroup        194 2015-12-01 12:27 output/_policy

drwxr-xr-x   - lin supergroup          0 2015-12-01 12:27 output/clusteredPoints

drwxr-xr-x   - lin supergroup          0 2015-12-01 12:22 output/clusters-0

drwxr-xr-x   - lin supergroup          0 2015-12-01 12:23 output/clusters-1

drwxr-xr-x   - lin supergroup          0 2015-12-01 12:27 output/clusters-10-final

drwxr-xr-x   - lin supergroup          0 2015-12-01 12:23 output/clusters-2

drwxr-xr-x   - lin supergroup          0 2015-12-01 12:24 output/clusters-3

drwxr-xr-x   - lin supergroup          0 2015-12-01 12:24 output/clusters-4

drwxr-xr-x   - lin supergroup          0 2015-12-01 12:25 output/clusters-5

drwxr-xr-x   - lin supergroup          0 2015-12-01 12:25 output/clusters-6

drwxr-xr-x   - lin supergroup          0 2015-12-01 12:25 output/clusters-7

drwxr-xr-x   - lin supergroup          0 2015-12-01 12:26 output/clusters-8

drwxr-xr-x   - lin supergroup          0 2015-12-01 12:26 output/clusters-9

drwxr-xr-x   - lin supergroup          0 2015-12-01 12:22 output/data

drwxr-xr-x   - lin supergroup          0 2015-12-01 12:22 output/random-seeds
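In the listing above, each clusters-N directory holds the result of one iteration, and the directory with the -final suffix is the last one. As a rough sketch, the final directory can be picked out of the listing with a small filter; `final_clusters_dir` is a hypothetical helper name, and it assumes the path is the last field of each `hdfs dfs -ls` line, as it is in the output shown here.

```shell
# Hypothetical helper: read `hdfs dfs -ls` output on stdin and print the
# path of any entry named clusters-<N>-final.
# Assumes the path is the last whitespace-separated field of each line.
final_clusters_dir() {
  awk '$NF ~ /\/clusters-[0-9]+-final$/ { print $NF }'
}
```

It would be used as `hdfs dfs -ls output | final_clusters_dir`, which for the listing above selects output/clusters-10-final.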
