
Installing Sqoop 1.4.4 on a Hadoop 2.2.0 Cluster

2015-01-18 12:36
Guiding questions:

1. Which configuration file must be edited so that Sqoop can transfer data between Hadoop and a relational database?

2. Into which directory must the JDBC driver jar for the relational database be copied?

I. Introduction to Sqoop 1.4.4

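Sqoop is a tool for transferring data between Hadoop and relational databases. It can import data from a relational database (such as MySQL or Oracle) into HDFS (the Hadoop Distributed File System), using Hadoop's MapReduce parallel computing framework to perform the transfer, and it can also export data from HDFS back into a relational database.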

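Sqoop 2.x has already been released and improves on Sqoop 1.x in areas such as security and concurrency, but the features it supports are still limited. We therefore stay with Sqoop 1.x here to learn how to import and export data between relational databases and Hadoop. The steps below install Sqoop 1.4.4 on a Hadoop 2.2.0 cluster.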

II. Downloading and Extracting the Sqoop 1.4.4 Package

Download the Sqoop 1.4.4 package, sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz, from the Sqoop website and extract it in a directory of your choice:

[hadoopUser@secondmgt sqoop1.0]$ ls
sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
[hadoopUser@secondmgt sqoop1.0]$ tar -zxvf sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
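
A quick sanity check after extraction could look like the sketch below. It assumes the unpacked directory contains bin/sqoop, conf/ and lib/, which are the locations the later steps rely on; the function name check_sqoop_layout is made up for illustration.

```shell
# Hypothetical helper: verify that an unpacked Sqoop directory contains the
# pieces used in the later steps (bin/sqoop, conf/, lib/).
check_sqoop_layout() {
    dir="$1"
    [ -f "$dir/bin/sqoop" ] || { echo "missing $dir/bin/sqoop" >&2; return 1; }
    [ -d "$dir/conf" ]      || { echo "missing $dir/conf" >&2; return 1; }
    [ -d "$dir/lib" ]       || { echo "missing $dir/lib" >&2; return 1; }
    echo "layout ok: $dir"
}
```

For the package above it would be called as check_sqoop_layout sqoop-1.4.4.bin__hadoop-2.0.4-alpha from the directory where the archive was extracted.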


III. Configuring Environment Variables

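Add the Sqoop 1.4.4 environment variables to .bashrc in the user's home directory so that the commands can be run from anywhere. Afterwards, run source ~/.bashrc or open a new shell for the change to take effect.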

#Sqoop1.4.4 Configure
export SQOOP_HOME=/home/hadoopUser/cloud/sqoop1.0/sqoop-1.4.4.bin__hadoop-2.0.4-alpha
export PATH=$PATH:$SQOOP_HOME/bin
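
If the same lines need to be added on several machines, a small helper can avoid duplicate entries. This is only a sketch: the function name add_sqoop_env and its two arguments (the rc file and the Sqoop directory) are assumptions, not part of Sqoop.

```shell
# Hypothetical helper: append the Sqoop exports to a shell rc file,
# skipping the write if SQOOP_HOME is already exported there.
add_sqoop_env() {
    rc_file="$1"
    sqoop_home="$2"
    if grep -q '^export SQOOP_HOME=' "$rc_file" 2>/dev/null; then
        echo "SQOOP_HOME already set in $rc_file"
        return 0
    fi
    {
        echo '#Sqoop1.4.4 Configure'
        echo "export SQOOP_HOME=$sqoop_home"
        # Single quotes keep $PATH and $SQOOP_HOME literal in the rc file.
        echo 'export PATH=$PATH:$SQOOP_HOME/bin'
    } >> "$rc_file"
}
```

With the paths used in this article, the call would be add_sqoop_env ~/.bashrc /home/hadoopUser/cloud/sqoop1.0/sqoop-1.4.4.bin__hadoop-2.0.4-alpha.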

IV. Checking That the Environment Variables Work

[hadoopUser@secondmgt ~]$ sqoop help
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
usage: sqoop COMMAND [ARGS]

Available commands:
codegen            Generate code to interact with database records
create-hive-table  Import a table definition into Hive
eval               Evaluate a SQL statement and display the results
export             Export an HDFS directory to a database table
help               List available commands
import             Import a table from a database to HDFS
import-all-tables  Import tables from a database to HDFS
job                Work with saved jobs
list-databases     List available databases on a server
list-tables        List available tables in a database
merge              Merge results of incremental imports
metastore          Run a standalone Sqoop metastore
version            Display version information

See 'sqoop help COMMAND' for information on a specific command.

V. Configuring sqoop-env.sh

Copy sqoop-env-template.sh (in Sqoop's conf directory) to a new file named sqoop-env.sh and edit its contents:

# Set Hadoop-specific environment variables here.

#Set path to where bin/hadoop is available
#export HADOOP_COMMON_HOME=
export HADOOP_COMMON_HOME=/home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0

#Set path to where hadoop-*-core.jar is available
#export HADOOP_MAPRED_HOME=
export HADOOP_MAPRED_HOME=/home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0/share/hadoop/mapreduce

#set the path to where bin/hbase is available
#export HBASE_HOME=

#Set the path to where bin/hive is available
#export HIVE_HOME=

#Set the path for where zookeper config dir is
#export ZOOCFGDIR=

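HADOOP_COMMON_HOME: the root directory of the Hadoop installation.

HADOOP_MAPRED_HOME: the MapReduce directory.

HBASE_HOME: the root directory of the HBase installation. (Not needed for now, so it can be left unset.)

HIVE_HOME: the root directory of the Hive installation. (Not needed for now, so it can be left unset.)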
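
A mistyped path in sqoop-env.sh usually only shows up when a Sqoop command fails later. The sketch below checks the two Hadoop variables described above right away; check_sqoop_env is a made-up helper name, and the file is sourced in a subshell so that the current environment is left untouched.

```shell
# Hypothetical helper: source a sqoop-env.sh in a subshell and confirm that
# HADOOP_COMMON_HOME and HADOOP_MAPRED_HOME are set to existing directories.
check_sqoop_env() {
    env_file="$1"
    (
        unset HADOOP_COMMON_HOME HADOOP_MAPRED_HOME
        . "$env_file" || exit 1
        for name in HADOOP_COMMON_HOME HADOOP_MAPRED_HOME; do
            # Look up the variable whose name is stored in $name.
            eval "dir=\${$name:-}"
            if [ -z "$dir" ] || [ ! -d "$dir" ]; then
                echo "$name is unset or not a directory: '$dir'" >&2
                exit 1
            fi
        done
        echo "sqoop-env.sh looks consistent"
    )
}
```

A typical call would be check_sqoop_env "$SQOOP_HOME/conf/sqoop-env.sh". A variable that is still commented out is reported as unset.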

VI. Copying the MySQL JDBC Jar into lib

[hadoopUser@secondmgt lib]$ ls
ant-contrib-1.0b3.jar       avro-ipc-1.5.3.jar     hsqldb-1.8.0.10.jar           jopt-simple-3.2.jar                  snappy-java-1.0.3.2.jar
ant-eclipse-1.0-jvm1.2.jar  avro-mapred-1.5.3.jar  jackson-core-asl-1.7.3.jar    mysql-connector-java-5.1.18-bin.jar
avro-1.5.3.jar              commons-io-1.4.jar     jackson-mapper-asl-1.7.3.jar  paranamer-2.3.jar

We are using MySQL here. For an Oracle database, copy the corresponding Oracle JDBC jar into lib instead.
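
The copy step itself is not shown in the listing above. A minimal sketch of it follows, assuming the driver jar has already been downloaded somewhere on the local disk; install_jdbc_driver is a hypothetical helper, not a Sqoop command.

```shell
# Hypothetical helper: copy a JDBC driver jar into a lib directory unless a
# file with the same name is already present there.
install_jdbc_driver() {
    jar="$1"
    lib_dir="$2"
    [ -f "$jar" ]     || { echo "driver jar not found: $jar" >&2; return 1; }
    [ -d "$lib_dir" ] || { echo "lib directory not found: $lib_dir" >&2; return 1; }
    target="$lib_dir/$(basename "$jar")"
    if [ -f "$target" ]; then
        echo "already present: $target"
    else
        cp "$jar" "$target" && echo "copied: $target"
    fi
}
```

For the jar shown in the listing, the call would be install_jdbc_driver mysql-connector-java-5.1.18-bin.jar "$SQOOP_HOME/lib".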

VII. Testing the Installation

Sqoop 1.x needs no service to be started before use. A single command is enough to check whether the configuration is correct:

[hadoopUser@secondmgt ~]$ sqoop list-databases --connect jdbc:mysql://secondmgt:3306/ --password hive --username hive
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
15/01/17 20:07:39 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/01/17 20:07:39 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
goodseval
hive
mysql
spice
sqoopdb
test

Here --connect jdbc:mysql://secondmgt:3306/ --password hive --username hive are the connection parameters for the MySQL database.
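
The log above warns that passing the password on the command line is insecure and suggests -P. A small wrapper along those lines might look like this; the function name list_dbs_prompt and its host and user arguments are assumptions for illustration, while port 3306 is taken from the command above.

```shell
# Hypothetical wrapper: list databases with -P, so that Sqoop prompts for
# the password on the console instead of reading it from the command line.
list_dbs_prompt() {
    host="$1"
    user="$2"
    sqoop list-databases --connect "jdbc:mysql://$host:3306/" --username "$user" -P
}
```

Calling list_dbs_prompt secondmgt hive runs the same check as above but keeps the password out of the shell history and the process list.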

The databases in my MySQL instance are:

[hadoopUser@secondmgt ~]$ mysql -uhive -phive
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 410
Server version: 5.1.73 Source distribution

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| goodseval          |
| hive               |
| mysql              |
| spice              |
| sqoopdb            |
| test               |
+--------------------+
7 rows in set (0.00 sec)

The result matches what Sqoop returned, so Sqoop is installed correctly.
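
The comparison above was done by eye. With longer lists it can be sketched as a script, assuming the two sets of database names have each been saved to a text file with one name per line; same_db_list and the file names are hypothetical.

```shell
# Hypothetical helper: succeed only if two files list the same database
# names, ignoring order and blank lines.
same_db_list() {
    a=$(grep -v '^[[:space:]]*$' "$1" | sort)
    b=$(grep -v '^[[:space:]]*$' "$2" | sort)
    [ "$a" = "$b" ]
}
```

For example, same_db_list sqoop_dbs.txt mysql_dbs.txt && echo "lists match" prints the message only when both files contain the same names.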
