
Problems Encountered While Using Sqoop

2017-06-09 12:33

Notes from setting up Hadoop and Hive myself and using Sqoop to transfer data.

Once Sqoop is installed, run the following tests.

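1. Accessing a remote or local database (MySQL)

List all database names:

sqoop list-databases --connect jdbc:mysql://192.168.153.1:3306/ --username root --password mysql

If this command fails, the cause is most likely the username or the remote connection.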
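The command above passes the password on the command line, where it is visible in the shell history and the process list. Sqoop also accepts -P (prompt for the password) and --password-file. Below is a minimal sketch of preparing such a file; the function name write_password_file and the file path are assumptions made for illustration, and the sqoop call is left as a comment so the sketch runs without a cluster.

```shell
#!/bin/bash
# Write a password to a file that only the current user can read.
# printf is used instead of echo so that no trailing newline is stored,
# because the whole file content is taken as the password.
write_password_file() {
    local password="$1"
    local path="$2"
    # umask 077 makes the file private from the moment it is created.
    ( umask 077 && printf '%s' "$password" > "$path" ) || return 1
    chmod 400 "$path"
}

# Usage (hypothetical path):
# write_password_file 'mysql' "$HOME/.sqoop_mysql.pwd"
# sqoop list-databases --connect jdbc:mysql://192.168.153.1:3306/ \
#     --username root --password-file "file://$HOME/.sqoop_mysql.pwd"
```

The file:// prefix is there on the assumption that the file lives on the local file system rather than on HDFS.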

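Remote access to MySQL

On the MySQL side, grant access to the client: allow a specific IP and port, or open access to all hosts.

Example: grant all privileges on the hive database to the root user, allow root to log in remotely from the IP 192.168.153.128, and set root's password to mysql:

grant all PRIVILEGES on hive.* to root@'192.168.153.128' identified by 'mysql';

After running the statement above, run the following statement so that the change takes effect immediately.

FLUSH PRIVILEGES;

There are other ways to write this as well; see the MySQL documentation.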

2. MySQL to HDFS (Hadoop must be running, because Sqoop operations are executed as MapReduce jobs)

sqoop import --connect jdbc:mysql://192.168.153.1:3306/hive --username root --password mysql --table student -m 1

3. HDFS to MySQL (check whether the fields in the file are separated by a tab character, "\t", or by "\001"; a tab is what the Tab key produces)

sqoop export --connect jdbc:mysql://192.168.153.1:3306/hive --username root --password mysql --table test --export-dir hdfs://192.168.153.128:9000/user/root/t1.txt/ --input-fields-terminated-by '\t'
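Before running the export, it can help to confirm which delimiter the file really uses. The following is a rough sketch that inspects the first line of a local sample file; the function name detect_delimiter and the labels it prints are assumptions made for illustration.

```shell
#!/bin/bash
# Report which field delimiter the first line of a file appears to use.
# tr -cd deletes every character except the one named, so a non-zero
# byte count means that character occurs in the line.
detect_delimiter() {
    local file="$1"
    if [ "$(head -n 1 "$file" | tr -cd '\t' | wc -c)" -gt 0 ]; then
        echo 'tab'
    elif [ "$(head -n 1 "$file" | tr -cd '\001' | wc -c)" -gt 0 ]; then
        echo 'ctrl-a'
    else
        echo 'unknown'
    fi
}

# Usage (fetch a sample line from HDFS first):
# hadoop fs -cat /user/root/t1.txt | head -n 1 > sample.txt
# detect_delimiter sample.txt
```

A result of tab corresponds to --input-fields-terminated-by '\t', and ctrl-a corresponds to '\001'.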

4. MySQL to Hive (this is the part where I ran into problems)

Create the Hive table:

sqoop create-hive-table --connect jdbc:mysql://192.168.153.1:3306/hive --username root --password mysql --table student

Import the data:

sqoop import --connect jdbc:mysql://192.168.153.1:3306/hive --username root --password mysql --table student --hive-table student --hive-import

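1. Issue with a non-local MySQL: https://my.oschina.net/u/204498/blog/522772

2. The sqoop command works when run from the conf directory under the Hive installation path, but fails when run from other directories.

3. There is also the missing primary key problem:

Error during import: No primary key could be found for table score. Please specify one with --split-by or perform a sequential import with '-m 1'.

This is caused by the table having no primary key.

4. For a table without a primary key, you can set --split-by s_id. Sqoop partitions the data by the values of the --split-by column and assigns each resulting range to a different map task. Each map task then processes the rows it fetches from the database one by one and writes them to HDFS. The splitting method depends on the type of the --split-by column.

Here s_id is set as the split column.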
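To make the splitting idea concrete, here is a simplified illustration for an integer split column. It divides the range between the smallest and largest s_id into roughly equal parts, one per map task. This is only a sketch of the idea and not Sqoop's exact algorithm; the function name split_ranges is an assumption.

```shell
#!/bin/bash
# Print the range of s_id values each map task would read when the
# integer range [min, max] is divided into roughly equal parts.
split_ranges() {
    local min="$1" max="$2" mappers="$3"
    local total=$(( max - min + 1 ))
    local size=$(( (total + mappers - 1) / mappers ))   # ceiling division
    local lo="$min" hi i=0
    while [ "$lo" -le "$max" ]; do
        hi=$(( lo + size - 1 ))
        if [ "$hi" -gt "$max" ]; then
            hi="$max"
        fi
        echo "mapper $i: s_id >= $lo AND s_id <= $hi"
        lo=$(( hi + 1 ))
        i=$(( i + 1 ))
    done
}

# Usage: split_ranges 1 100 4
```

With s_id values from 1 to 100 and 4 map tasks, the first range covers 1 to 25 and the last covers 76 to 100. If the values are unevenly distributed, some map tasks receive far more rows than others, which is why the choice of split column matters.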

5. Hive to MySQL

sqoop export --connect "jdbc:mysql://192.168.153.1:3306/hive?useUnicode=true&characterEncoding=utf-8" --username root --password mysql --table test1_all --export-dir /user/hive/warehouse/test1_all --input-fields-terminated-by '\001'

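Hive is SQL-like, but its data is actually stored on HDFS as text files. Hive keeps the format used to read those files and the meaning of each field; in other words, Hive only stores how the corresponding HDFS files should be read. So if a test fails partway through, check not only whether Hive has the table's metadata but also whether data files for that table exist on HDFS. For a test, both can simply be deleted.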
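As a sketch of that cleanup, the helper below only prints the two commands, one for the Hive metadata and one for the HDFS files, so they can be reviewed before being run. The function name cleanup_commands is an assumption, and the default directory assumes the data sits under /user/root/.

```shell
#!/bin/bash
# Print the commands that remove a table's Hive definition and its
# leftover HDFS directory. Nothing is executed here.
cleanup_commands() {
    local table="$1"
    # Default location is an assumption; pass a second argument to override.
    local hdfs_dir="${2:-/user/root/$table}"
    printf 'hive -e "DROP TABLE IF EXISTS %s;"\n' "$table"
    printf 'hadoop fs -rm -r -f %s\n' "$hdfs_dir"
}

# Usage: review the output, then pipe it to bash to execute it.
# cleanup_commands student
# cleanup_commands student | bash
```

Printing first and executing second is a deliberate choice here, since a mistyped table name would otherwise delete the wrong directory.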

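Ways to query Hive

Type hive on the command line to start the interactive shell,

or run hive -e "SQL statement"

Common Hive commands:

show tables;

quit;

-- student is the table name
select * from student;

-- drop the table
drop table student;

Default HDFS path: /user/root/

Common HDFS commands

hadoop fs -ls

hadoop fs -rm -r /user/root/student

hadoop fs -text /user/root/student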

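6. Shell scripting

With the tests above done, the task is to run a scheduled job that periodically extracts data from MySQL into Hive, does some simple processing, and saves the result back to MySQL.

mkdir test: create a folder named test

touch test1: creates the Hive table and imports the data (the import should really use partitions, but this is a simple example first)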

#!/bin/bash
# Create the Hive table from the MySQL table definition.
sqoop create-hive-table --connect jdbc:mysql://192.168.153.1:3306/hive --table student --username root --password mysql --hive-table student

sq_1=$?
if [ "$sq_1" -ne 0 ]
then
    echo "[error] create student table failed! "
    # Pass the failure on so the calling script can detect it.
    exit "$sq_1"
else
    # Import the rows only if the table was created successfully.
    sqoop import --connect jdbc:mysql://192.168.153.1:3306/hive --username root --password mysql --table student --hive-table student --hive-import
    sq_2=$?
    if [ "$sq_2" -ne 0 ]
    then
        echo "[error] import student data failed!"
        exit "$sq_2"
    fi
fi

touch test2: creates a new table and loads the processed data into it

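#!/bin/bash
# Build the HiveQL in a quoted here-document so the shell expands nothing in it.
# The closing delimiter must stand alone on its line, so ")" goes on the next line.
sql=$(cat <<'EOF'

drop table if exists test1_all;
create table test1_all
( s_id int,
s_name STRING,
score_n DOUBLE
);

INSERT INTO table test1_all
select stu.id,stu.name,sc.num from
(SELECT
s_id,
sum(score_n) as num
FROM
score
GROUP BY
s_id
)sc
LEFT OUTER
JOIN
(
SELECT
*
FROM
student
) stu
ON sc.s_id = stu.id;

EOF
)
############  execute begin   ###########
# Quote the variable so the "*" in the query is not expanded to file names.
echo "$sql"
"$HIVE_HOME/bin/hive" -e "$sql"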

touch test3: exports the data from Hive to MySQL

#!/bin/bash

sqoop export --connect "jdbc:mysql://192.168.153.1:3306/hive?useUnicode=true&characterEncoding=utf-8" --username root --password mysql --table test1_all --export-dir /user/hive/warehouse/test1_all --input-fields-terminated-by '\001'

exitCode=$?
if [ $exitCode -ne 0 ];then
echo "[ERROR] hive to mysql execute failed!"
exit $exitCode
fi

touch testall: runs these files in order; each stage runs only if the previous one succeeded

#!/bin/bash

./test1

sig_1=$?
if [ "$sig_1" -ne 0 ]
then
    echo "[error] mysql into hive failed!"
    echo "$sig_1"
else
    ./test2
    sig_2=$?
    if [ "$sig_2" -ne 0 ]
    then
        echo "[ERROR] hive execute failed!"
        echo "$sig_2"
    else
        ./test3
        sig_3=$?
        if [ "$sig_3" -ne 0 ]
        then
            echo "[ERROR] hive into mysql  failed!"
            echo "$sig_3"
        fi
    fi
fi
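The nested if/else above grows one level deeper with every stage. As an alternative sketch, a small loop can run the stages in order and stop at the first failure. The function name run_steps is an assumption made for illustration.

```shell
#!/bin/bash
# Run each command in order; stop at the first failure and return its exit code.
run_steps() {
    local step status
    for step in "$@"; do
        "$step"
        status=$?
        if [ "$status" -ne 0 ]; then
            echo "[ERROR] step '$step' failed with exit code $status" >&2
            return "$status"
        fi
    done
    return 0
}

# Usage: run_steps ./test1 ./test2 ./test3
```

This only works if each stage script exits with a non-zero code when it fails, so a stage that merely prints an error message would not stop the chain.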

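The last step is to set up a scheduled job with crontab

crontab -e

Add an entry with the execution time; typically the job is set to run once a day. A crontab entry has five time fields (minute, hour, day of month, month, day of week), so a daily run at 01:00 is:

0 1 * * * /data/test/testall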

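One detail worth noting: cron starts jobs in the user's home directory, while testall calls ./test1, ./test2 and ./test3 by relative path. The sketch below generates a crontab entry that first changes into the script directory and also appends output to a log file. The function name cron_line and the log path are assumptions made for illustration.

```shell
#!/bin/bash
# Print a daily crontab entry that changes into the script directory
# before running the script, and appends stdout and stderr to a log file.
cron_line() {
    local minute="$1" hour="$2" dir="$3" script="$4" log="$5"
    printf '%s %s * * * cd %s && ./%s >> %s 2>&1\n' \
        "$minute" "$hour" "$dir" "$script" "$log"
}

# Usage (hypothetical log path); paste the printed line into crontab -e:
# cron_line 0 1 /data/test testall /tmp/testall.log
```

Cron also runs with a minimal environment, so variables such as HIVE_HOME may need to be set inside the scripts themselves.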