
Hive Practice Exercise 1: Creating a Table and Querying

2017-08-24 18:10

[root@master exercise]# head -n 5 visits_data.txt
BUCKLEY SUMMER 10/12/2010 14:48 10/12/2010 14:45 WH
CLOONEY GEORGE 10/12/2010 14:47 10/12/2010 14:45 WH
PRENDERGAST JOHN 10/12/2010 14:48 10/12/2010 14:45 WH
LANIER JAZMIN 10/13/2010 13:00 WH BILL SIGNING/
MAYNARD ELIZABETH 10/13/2010 12:34 10/13/2010 13:00 WH BILL SIGNING/
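
Since the DDL below declares FIELDS TERMINATED BY '\t', it is worth confirming first that the file really is tab-delimited. A quick sanity check (assuming GNU coreutils, where cat -A prints each tab as ^I and each line ending as $):

head -n 1 visits_data.txt | cat -A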

[root@master exercise]# cat visits.hive
--cat visits.hive
create table people_visits (
last_name string,
first_name string,
arrival_time string,
scheduled_time string,
meeting_location string,
info_comment string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t' ;

At this point I didn't understand what each clause of the CREATE TABLE statement means.
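
For reference, here is the same DDL with each clause annotated (standard HiveQL semantics):

create table people_visits (     -- a managed table in the current database
  last_name string,
  first_name string,
  arrival_time string,
  scheduled_time string,
  meeting_location string,
  info_comment string)
ROW FORMAT DELIMITED             -- use Hive's built-in delimited-text SerDe
FIELDS TERMINATED BY '\t';       -- columns in the file are separated by tab characters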

[root@master ~]# ./hive -f /opt/visits.hive
bash: ./hive: No such file or directory

hive> ./hive -f /opt/exercise/visits.hive
> ;
NoViableAltException(17@[])

hive> -f /opt/exercise/visits.hive
> ;
NoViableAltException(299@[])
at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1074)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:397)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:309)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1145)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1193)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1082)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
FAILED: ParseException line 1:1 cannot recognize input near '-' 'f' '/'
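
The -f flag is an option of the hive launcher, not a HiveQL statement, so it can only be used from the shell. To run a script from inside the hive> prompt, the CLI provides the source command instead:

hive> source /opt/exercise/visits.hive;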

[root@master exercise]# hive -f /opt/exercise/visits.hive

Logging initialized using configuration in jar:file:/opt/apache-hive-1.2.2-bin/lib/hive-common-1.2.2.jar!/hive-log4j.properties
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:For direct MetaStore DB connections, we don't support retries at the client level.)
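
Oddly, the table still shows up in the next session below. For the record, one commonly reported cause of this MetaException is a mismatch between the Hive version and the metastore schema; assuming a MySQL-backed metastore (the post does not say which database is used), a typical remediation is to initialize the schema with Hive's bundled schematool:

/opt/apache-hive-1.2.2-bin/bin/schematool -dbType mysql -initSchema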

[root@master ~]# hive

Logging initialized using configuration in jar:file:/opt/apache-hive-1.2.2-bin/lib/hive-common-1.2.2.jar!/hive-log4j.properties
hive (default)> show tables;
OK
tab_name
people_visits
Time taken: 1.281 seconds, Fetched: 1 row(s)
hive (default)> describe people_visits;
OK
col_name data_type comment
last_name string
first_name string
arrival_time string
scheduled_time string
meeting_location string
info_comment string
Time taken: 0.544 seconds, Fetched: 6 row(s)

[root@master exercise]# hadoop fs -put visits_data.txt /data/hive/warehouse/people_visits
put: `/data/hive/warehouse/people_visits/': No such file or directory

(You are getting this error because no such directory exists at that path. See my answer to a similar question, which explains how Hadoop interprets relative paths. Make sure you create the directory first using:
bin/hadoop fs -mkdir input
and then re-run the -put command.)

[root@master data]# hadoop fs -mkdir /hive
--OK

[root@master data]# hadoop fs -mkdir /data/hive/warehouse/people_visits
mkdir: `/data/hive/warehouse/people_visits': No such file or directory

How do I inspect the HDFS directory tree? And as the failed mkdir above shows, I didn't know how to create nested HDFS directories either.
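
Both questions have standard answers in the Hadoop fs shell: -mkdir -p creates all missing parent directories in one step, and -ls -R lists a directory tree recursively:

hadoop fs -mkdir -p /data/hive/warehouse/people_visits
hadoop fs -ls -R /data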
hadoop fs -put visits_data.txt /hive

17/08/23 10:56:03 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /hive/visits_data.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.

put: File /hive/visits_data.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.

First I confirmed the firewall was already off. Then I noticed that on my two-machine setup (master and slave) the dfs.replication property was set to 2; changing it to 1 in hdfs-site.xml fixed the upload.
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
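
Note that the exception actually reports 0 live datanodes, so before changing replication it is worth confirming the datanodes are running at all; the standard check is:

hdfs dfsadmin -report

which prints the number of live datanodes and their capacity.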

[root@master exercise]# hadoop fs -ls /hive
Found 1 items
-rw-r--r-- 2 root supergroup 989239 2017-08-23 11:04 /hive/visits_data.txt

hive (default)> select * from people_visits limit 5;
OK
people_visits.last_name people_visits.first_name people_visits.arrival_time people_visits.scheduled_time people_visits.meeting_location people_visits.info_comment
Time taken: 2.319 seconds

The data I just uploaded cannot be found! Is this a configuration problem?

The problem is that the file was uploaded to /hive, while Hive only reads a managed table's data from the table's own directory under the warehouse path, which is set by hive.metastore.warehouse.dir in hive-site.xml:

<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
</property>
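
Rather than reconstructing the path by hand, the table's actual HDFS location can be read straight from the metastore, and Hive's LOAD DATA statement moves a local file into place without knowing the warehouse layout at all (both standard HiveQL):

hive> describe formatted people_visits;   -- the Location: field shows the table's directory
hive> load data local inpath '/opt/exercise/visits_data.txt' into table people_visits;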

Then run:
[root@master exercise]# hadoop fs -put visits_data.txt /user/hive/warehouse/people_visits

[root@master exercise]# hadoop fs -ls /user/hive/warehouse/people_visits
Found 1 items
-rw-r--r-- 2 root supergroup 989239 2017-08-24 15:08 /user/hive/warehouse/people_visits/visits_data.txt

hive (default)> select * from people_visits limit 5;
OK
people_visits.last_name people_visits.first_name people_visits.arrival_time people_visits.scheduled_time people_visits.meeting_location people_visits.info_comment
BUCKLEY SUMMER 10/12/2010 14:48 10/12/2010 14:45 WH
CLOONEY GEORGE 10/12/2010 14:47 10/12/2010 14:45 WH
PRENDERGAST JOHN 10/12/2010 14:48 10/12/2010 14:45 WH
LANIER JAZMIN 10/13/2010 13:00 WH BILL SIGNING/
MAYNARD ELIZABETH 10/13/2010 12:34 10/13/2010 13:00 WH BILL SIGNING/
Time taken: 0.631 seconds, Fetched: 5 row(s)
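
With SELECT * working, ordinary filters and aggregates run the same way; for example, counting visits per meeting location (plain HiveQL over the columns defined above):

hive (default)> select meeting_location, count(*) as visits from people_visits group by meeting_location;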

hive (default)> select count(*) from people_visits;
Query ID = root_20170824153813_390ff406-b9bb-4b83-99bf-a2f5bf9092dc
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1503560054683_0001, Tracking URL = http://master:8088/proxy/application_1503560054683_0001/
Kill Command = /opt/hadoop-2.6.5/bin/hadoop job -kill job_1503560054683_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2017-08-24 15:38:39,084 Stage-1 map = 0%, reduce = 0%
2017-08-24 15:38:47,164 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.88 sec
2017-08-24 15:38:53,717 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 5.32 sec
MapReduce Total cumulative CPU time: 5 seconds 320 msec
Ended Job = job_1503560054683_0001
MapReduce Jobs Launched:

Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 5.32 sec HDFS Read: 996386 HDFS Write: 6 SUCCESS
Total MapReduce CPU Time Spent: 5 seconds 320 msec
OK
_c0
17977
Time taken: 43.436 seconds, Fetched: 1 row(s)