Methods & Scripts - Simple Pig Grunt Commands with Examples - by Xiaoyu
2013-04-17 13:21
First, a disclaimer: I'm a beginner, and I take no responsibility for any technical inaccuracies in the article below.
Pig's three execution modes:
1. Script
2. Grunt
3. Embedded
Grunt
1. Auto-completion (command completion only; file name completion is not supported)
2. autocomplete file
3. Eclipse plugin PigPen
Entering the Grunt shell
[hadoop@master pig]$ ./bin/pig
2013-04-13 23:00:19,909 [main] INFO org.apache.pig.Main - Apache Pig version 0.10.0 (r1328203) compiled Apr 19 2012, 22:54:12
2013-04-13 23:00:19,909 [main] INFO org.apache.pig.Main - Logging error messages to: /opt/pig/pig_1365865219902.log
2013-04-13 23:00:20,237 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://192.168.154.100:9000
2013-04-13 23:00:20,536 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: 192.168.154.100:9001
Help (help)
grunt> help
Commands:
<pig latin statement>; - See the PigLatin manual for details: http://hadoop.apache.org/pig File system commands:
fs <fs arguments> - Equivalent to Hadoop dfs command: http://hadoop.apache.org/common/docs/current/hdfs_shell.html Diagnostic commands:
describe <alias>[::<alias] - Show the schema for the alias. Inner aliases can be described as A::B.
explain [-script <pigscript>] [-out <path>] [-brief] [-dot] [-param <param_name>=<param_value>]
[-param_file <file_name>] [<alias>] - Show the execution plan to compute the alias or for entire script.
-script - Explain the entire script.
-out - Store the output into directory rather than print to stdout.
-brief - Don't expand nested plans (presenting a smaller graph for overview).
-dot - Generate the output in .dot format. Default is text format.
-param <param_name - See parameter substitution for details.
-param_file <file_name> - See parameter substitution for details.
alias - Alias to explain.
dump <alias> - Compute the alias and writes the results to stdout.
Utility Commands:
exec [-param <param_name>=param_value] [-param_file <file_name>] <script> -
Execute the script with access to grunt environment including aliases.
-param <param_name - See parameter substitution for details.
-param_file <file_name> - See parameter substitution for details.
script - Script to be executed.
run [-param <param_name>=param_value] [-param_file <file_name>] <script> -
Execute the script with access to grunt environment.
-param <param_name - See parameter substitution for details.
-param_file <file_name> - See parameter substitution for details.
script - Script to be executed.
sh <shell command> - Invoke a shell command.
kill <job_id> - Kill the hadoop job specified by the hadoop job id.
set <key> <value> - Provide execution parameters to Pig. Keys and values are case sensitive.
The following keys are supported:
default_parallel - Script-level reduce parallelism. Basic input size heuristics used by default.
debug - Set debug on or off. Default is off.
job.name - Single-quoted name for jobs. Default is PigLatin:<script name>
job.priority - Priority for jobs. Values: very_low, low, normal, high, very_high. Default is normal
stream.skippath - String that contains the path. This is used by streaming.
any hadoop property.
help - Display this message.
quit - Quit the grunt shell.
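As a quick illustration of the utility commands listed above, a typical Grunt session might set a few execution parameters before running a script. The values and the script/parameter names here are purely illustrative, not recommendations:

```pig
-- illustrative session; myscript.pig and the input parameter are hypothetical
grunt> set default_parallel 10
grunt> set job.name 'max-temp-demo'
grunt> set job.priority high
grunt> run -param input=/user/hadoop/in myscript.pig
```

Note that, per the help text, keys and values for `set` are case sensitive, and any Hadoop property can be passed the same way.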
Browsing (ls, cd, cat)
grunt> ls
hdfs://192.168.154.100:9000/user/hadoop/in <dir>
hdfs://192.168.154.100:9000/user/hadoop/out <dir>
grunt> cd in
grunt> ls
hdfs://192.168.154.100:9000/user/hadoop/in/test1.txt<r 1> 12
hdfs://192.168.154.100:9000/user/hadoop/in/test2.txt<r 1> 13
hdfs://192.168.154.100:9000/user/hadoop/in/test_1.txt<r 1> 328
hdfs://192.168.154.100:9000/user/hadoop/in/test_2.txt<r 1> 139
grunt> cat test1.txt
hello world
Copying to the local filesystem (copyToLocal)
grunt> ls
hdfs://192.168.154.100:9000/user/hadoop/in/test1.txt<r 1> 12
hdfs://192.168.154.100:9000/user/hadoop/in/test2.txt<r 1> 13
hdfs://192.168.154.100:9000/user/hadoop/in/test_1.txt<r 1> 328
hdfs://192.168.154.100:9000/user/hadoop/in/test_2.txt<r 1> 139
grunt> copyToLocal test1.txt ttt
[root@master pig]# ls -l ttt
-rwxrwxrwx. 1 hadoop hadoop 12 4月 13 23:06 ttt
[root@master pig]#
Executing operating system commands: sh
grunt> sh jps
2098 DataNode
1986 NameNode
2700 Jps
2539 RunJar
2297 JobTracker
2211 SecondaryNameNode
2411 TaskTracker
grunt>
Pig data model
Bag: table
Tuple: row (record)
Field: attribute (column value)
Pig does not require the tuples in a bag to have the same number or the same types of fields.
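To see this flexibility in action, consider loading a file without declaring a schema; each line becomes a tuple, and lines with different numbers of fields still land in the same bag. The file `data.txt` below is hypothetical:

```pig
-- data.txt (hypothetical, tab-delimited):
-- 1949	111
-- 1950	22	9
A = LOAD 'data.txt';   -- no schema given; default PigStorage splits on tabs
DUMP A;
-- (1949,111)
-- (1950,22,9)         -- tuples in the same bag with different arity
```
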
Commonly used Pig Latin statements:
LOAD: specify how to load data
FOREACH: scan each row and apply some processing
FILTER: filter out rows
DUMP: display the result on screen
STORE: save the result to a file
Pig script example:
![](http://img.my.csdn.net/uploads/201304/13/1365837715_5891.png)
grunt> records = LOAD 'input/ncdc/micro-tab/sample.txt'
>> AS (year:chararray, temperature:int, quality:int);
![](http://img.my.csdn.net/uploads/201304/13/1365838181_3084.png)
![](http://img.my.csdn.net/uploads/201304/13/1365838258_5852.png)
![](http://img.my.csdn.net/uploads/201304/13/1365838424_8854.png)
![](http://img.my.csdn.net/uploads/201304/13/1365838576_6233.png)
![](http://img.my.csdn.net/uploads/201304/13/1365838973_1658.png)
(1949,111)
(1950,22)
This successfully computes the maximum temperature for each year.
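The screenshots above show the intermediate steps; assuming they follow the standard NCDC max-temperature example (fields `year`, `temperature`, `quality` as in the LOAD statement above), the full pipeline is roughly:

```pig
records = LOAD 'input/ncdc/micro-tab/sample.txt'
    AS (year:chararray, temperature:int, quality:int);
-- drop missing readings (9999) and bad quality codes
filtered_records = FILTER records BY temperature != 9999
    AND (quality == 0 OR quality == 1 OR quality == 4
         OR quality == 5 OR quality == 9);
-- group readings by year, then take the maximum per group
grouped_records = GROUP filtered_records BY year;
max_temp = FOREACH grouped_records
    GENERATE group, MAX(filtered_records.temperature);
DUMP max_temp;
```

which yields the per-year (year, max) tuples shown above.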
Example 2:
![](http://img.my.csdn.net/uploads/201304/13/1365839521_7328.png)
![](http://img.my.csdn.net/uploads/201304/13/1365839530_2206.png)
To close, a programmer joke:
Someone replied to a thread with "if you don't bump this, you're not Chinese" — when his actual intention was to let the thread sink.